<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="brief-report" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">865371</article-id>
<article-id pub-id-type="doi">10.3389/fgene.2022.865371</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Shared and Cell-Type-Specific Gene Expression Patterns Associated With Autism Revealed by Integrative Regularized Non-Negative Matrix Factorization</article-title>
<alt-title alt-title-type="left-running-head">Guan et al.</alt-title>
<alt-title alt-title-type="right-running-head">iRNMF</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Guan</surname>
<given-names>Jinting</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/722948/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhuang</surname>
<given-names>Yan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kang</surname>
<given-names>Yue</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ji</surname>
<given-names>Guoli</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/920542/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Department of Automation</institution>, <institution>Xiamen University</institution>, <addr-line>Xiamen</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>National Institute for Data Science in Health and Medicine</institution>, <institution>Xiamen University</institution>, <addr-line>Xiamen</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1360163/overview">Kaifang Pang</ext-link>, Baylor College of Medicine, United States</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1661279/overview">Flavia Esposito</ext-link>, University of Bari Aldo Moro, Italy</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1214834/overview">Jinjin Tian</ext-link>, Carnegie Mellon University, United States</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/796677/overview">Xiuwei Zhang</ext-link>, Georgia Institute of Technology, United States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Jinting Guan, <email>jtguan@xmu.edu.cn</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>11</day>
<month>05</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>865371</elocation-id>
<history>
<date date-type="received">
<day>29</day>
<month>01</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>04</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Guan, Zhuang, Kang and Ji.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Guan, Zhuang, Kang and Ji</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Human brain-related disorders, such as autism spectrum disorder (ASD), are often characterized by cell heterogeneity, as the cell atlas of brains consists of diverse cell types. There are commonality and specificity in gene expression among different cell types of brains; hence, there may also be commonality and specificity in dysregulated gene expression affected by ASD among brain cells. Moreover, as genes interact together, it is important to identify shared and cell-type-specific ASD-related gene modules for studying the cell heterogeneity of ASD. To this end, we propose integrative regularized non-negative matrix factorization (iRNMF) by imposing a new regularization based on integrative non-negative matrix factorization. Using iRNMF, we analyze gene expression data of multiple cell types of the human brain to obtain shared and cell-type-specific gene modules. Based on ASD risk genes, we identify shared and cell-type-specific ASD-associated gene modules. By analyzing these gene modules, we study the commonality and specificity among different cell types in dysregulated gene expression affected by ASD. The shared ASD-associated gene modules are mostly relevant to the functioning of synapses, while in different cell types, different kinds of gene functions may be specifically dysregulated in ASD, such as inhibitory extracellular ligand-gated ion channel activity in GABAergic interneurons and excitatory postsynaptic potential and ionotropic glutamate receptor signaling pathway in glutamatergic neurons. Our results provide new insights into the molecular mechanism and pathogenesis of ASD. The identification of shared and cell-type-specific ASD-related gene modules can facilitate the development of more targeted biomarkers and treatments for ASD.</p>
</abstract>
<kwd-group>
<kwd>ASD</kwd>
<kwd>cell-type-specific gene module</kwd>
<kwd>shared gene module</kwd>
<kwd>gene function</kwd>
<kwd>integrative regularized non-negative matrix factorization</kwd>
</kwd-group>
<contract-num rid="cn001">61803320 61573296</contract-num>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>The human brain is a highly heterogeneous organ comprising many cell types. Brain-related disorders, such as autism spectrum disorder (ASD), are often characterized by cell heterogeneity and mainly affect some specific cell types. ASD, a set of neuropsychiatric disorders, is characterized by high genetic and phenotypic heterogeneity. To date, its causes and underlying mechanisms remain unclear. Although hundreds of genes have been associated with ASD, they account for only 10&#x2013;20% of ASD cases (<xref ref-type="bibr" rid="B17">Rylaarsdam and Guemez-Gamboa, 2019</xref>). Genes do not act alone, and what determines the manifestation of a disease in different cell types is the presence of disease-associated gene modules rather than individual genes (<xref ref-type="bibr" rid="B11">Kitsak et al., 2016</xref>; <xref ref-type="bibr" rid="B6">Guan et al., 2021</xref>). Moreover, as gene expression shows both commonality and specificity among brain cell types, dysregulated gene expression in ASD may likewise show both commonality and specificity among brain cells. Therefore, detecting shared and cell-type-specific ASD-associated gene modules from gene expression datasets of multiple human brain cell types is important for studying the molecular mechanism and pathogenesis of ASD.</p>
<p>Non-negative matrix factorization (NMF)-based methods have been developed and applied to the analyses of biological sequencing data, such as sparse NMF (sNMF) (<xref ref-type="bibr" rid="B13">Mairal et al., 2010</xref>) and sparse modular activity factorization (SMAF) (<xref ref-type="bibr" rid="B3">Cleary et al., 2017</xref>). In the context of integrating heterogeneous datasets, several methods have been proposed recently. Many of them were developed to integrate multi-modal or multi-omics data and focus on the analysis of samples, such as the joint definition of cell types of samples by taking advantage of multiple heterogeneous datasets. For example, LIGER (<xref ref-type="bibr" rid="B18">Welch et al., 2019</xref>) was developed based on integrative non-negative matrix factorization (iNMF) (<xref ref-type="bibr" rid="B19">Yang and Michailidis, 2016</xref>) to factorize multiple datasets into a common gene-factor matrix, multiple dataset-specific gene-factor matrices, and multiple dataset-specific sample-factor matrices. Compared with the original algorithm of iNMF, LIGER adopted a novel block coordinate descent algorithm for performing iNMF, which can converge quickly. iNMF can extract consistent patterns embedded in various data sources by separating the homogeneous and heterogeneous effects among the sources, and it has mainly been used to analyze the low-dimensional sample-factor matrices based on different kinds of data. The low-dimensional gene-factor matrices deserve more attention. The sparsity of sample representation (<xref ref-type="bibr" rid="B19">Yang and Michailidis, 2016</xref>) is beneficial to sample analyses, such as cell-type definition, whereas for gene module analyses, sparsity or regularization could instead be imposed on the gene representation. Beyond integrating multi-modal data, performing integrative and comparative analyses on the same type of data from multiple biological conditions, such as various cancer types or subtypes, various cell lines, and various cell types, is also valuable (<xref ref-type="bibr" rid="B21">Zhang and Zhang, 2019</xref>).</p>
<p>To depict the common and dataset-specific gene expression patterns, we proposed integrative regularized non-negative matrix factorization (iRNMF), by adopting iNMF and imposing a new regularization, to obtain a common gene-factor matrix and multiple dataset-specific gene-factor matrices. With iRNMF, we analyzed the gene expression data of multiple human brain cell types and obtained shared and cell-type-specific gene modules. Then, ASD-related risk genes were used to identify shared and cell-type-specific ASD-associated gene modules. By analyzing these gene modules, we studied the shared and cell-type-specific dysregulated gene expression patterns in ASD.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2-1">
<title>Integrative Regularized Non-Negative Matrix Factorization</title>
<p>Non-negative matrix factorization can factorize a high-dimensional gene expression matrix into two low-dimensional matrices, i.e., a gene-factor matrix and a sample-factor matrix, achieving the purpose of dimension reduction. To integrate and factorize multiple gene expression datasets into a common gene-factor matrix, multiple dataset-specific gene-factor matrices, and sample-factor matrices, iNMF (<xref ref-type="bibr" rid="B19">Yang and Michailidis, 2016</xref>) was proposed. The optimization problem is:<disp-formula id="equ1">
<mml:math id="m1">
<mml:mrow>
<mml:munder>
<mml:mrow>
<mml:mtext>min</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x2016;</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msubsup>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x2016;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ2">
<mml:math id="m2">
<mml:mrow>
<mml:mtext>s</mml:mtext>
<mml:mtext>.t</mml:mtext>
<mml:mtext>.</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>W</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf1">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> denotes each gene expression dataset, <italic>g</italic> denotes the number of genes, and <inline-formula id="inf2">
<mml:math id="m4">
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the number of samples in the <italic>i</italic>th dataset. <inline-formula id="inf3">
<mml:math id="m5">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is factorized into three low-dimensional matrices, <inline-formula id="inf4">
<mml:math id="m6">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>W</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf5">
<mml:math id="m7">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>m</italic> denotes the number of factors/gene modules. <inline-formula id="inf6">
<mml:math id="m8">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the representation of samples in the low-dimensional space. <inline-formula id="inf7">
<mml:math id="m9">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf8">
<mml:math id="m10">
<mml:mi>W</mml:mi>
</mml:math>
</inline-formula> are the dataset-specific and shared gene modules, respectively. <inline-formula id="inf9">
<mml:math id="m11">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> is a regularization parameter.</p>
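<p>As a reading aid, the iNMF objective above can be evaluated directly for small matrices. The following is a minimal sketch rather than the authors' implementation: it assumes NumPy, the function name <italic>inmf_objective</italic> and the toy sizes are hypothetical, and the data are random matrices built so that the factors reproduce them exactly, which leaves only the penalty term.</p>
<preformat preformat-type="code">
```python
import numpy as np

def inmf_objective(X_list, H_list, W, V_list, lam):
    """Evaluate the iNMF objective for datasets X_i of shape (n_i, g)."""
    total = 0.0
    for X, H, V in zip(X_list, H_list, V_list):
        residual = X - H @ (W + V)            # reconstruction error for dataset i
        total += np.sum(residual ** 2)        # squared Frobenius norm of the residual
        total += lam * np.sum((H @ V) ** 2)   # penalty on the dataset-specific part
    return float(total)

# Hypothetical toy sizes: k = 2 datasets, g = 6 genes, m = 3 factors.
rng = np.random.default_rng(0)
g, m = 6, 3
n_samples = [4, 5]
W = rng.random((m, g))                             # shared gene-factor matrix
V_list = [rng.random((m, g)) for _ in n_samples]   # dataset-specific gene-factor matrices
H_list = [rng.random((n, m)) for n in n_samples]   # sample-factor matrices

# Build data that the factors reproduce exactly, so only the penalty remains.
X_list = [H @ (W + V) for H, V in zip(H_list, V_list)]

exact_fit = inmf_objective(X_list, H_list, W, V_list, lam=0.0)
penalized = inmf_objective(X_list, H_list, W, V_list, lam=5.0)
penalty_only = 5.0 * sum(float(np.sum((H @ V) ** 2)) for H, V in zip(H_list, V_list))
```
</preformat>
<p>With a perfect reconstruction, the first term vanishes and the objective reduces to the penalty on the dataset-specific components, which illustrates how the regularization parameter trades reconstruction accuracy against the size of the dataset-specific part.</p>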
<p>The regularization of iNMF can make <inline-formula id="inf10">
<mml:math id="m12">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> sparser to some degree. To facilitate the analysis of both shared and dataset-specific gene modules, we propose integrative regularized non-negative matrix factorization (iRNMF), which imposes an additional regularization on the shared component. The optimization problem is:<disp-formula id="equ3">
<mml:math id="m13">
<mml:mrow>
<mml:munder>
<mml:mrow>
<mml:mtext>min</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x2016;</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msubsup>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x2016;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x2016;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>W</mml:mi>
<mml:msubsup>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ4">
<mml:math id="m14">
<mml:mrow>
<mml:mtext>s</mml:mtext>
<mml:mtext>.t</mml:mtext>
<mml:mtext>.</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>W</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf11">
<mml:math id="m15">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf12">
<mml:math id="m16">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are regularization parameters. The multiplicative updates often used for NMF-like optimization problems do not have a convergence guarantee and may need more iterations; therefore, we applied the block coordinate descent algorithm used in LIGER (<xref ref-type="bibr" rid="B18">Welch et al., 2019</xref>). We divided the variables into 2<italic>k</italic> &#x2b; 1 blocks (corresponding to <inline-formula id="inf13">
<mml:math id="m17">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf14">
<mml:math id="m18">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> for each dataset, and <italic>W</italic>) and performed block coordinate descent, iteratively minimizing the objective with respect to each block, holding the others fixed. We iterated:<disp-formula id="equ5">
<mml:math id="m19">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mtext>argmin</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>&#x22ee;</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>&#x22ee;</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>&#x22ee;</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>&#x22ee;</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>
<disp-formula id="equ6">
<mml:math id="m20">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mtext>argmin</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msqrt>
<mml:msubsup>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msqrt>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msubsup>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mi>g</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mi>g</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>
<disp-formula id="equ7">
<mml:math id="m21">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mtext>argmin</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>until convergence. Each of the optimization subproblems mentioned previously requires solving a non-negative least-squares problem, and we used the fast block principal pivoting algorithm (<xref ref-type="bibr" rid="B10">Kim et al., 2014</xref>) to solve each of these subproblems.</p>
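<p>The stacked form of the dataset-specific update can be checked numerically on a toy problem. The sketch below is illustrative only and makes several assumptions: it uses NumPy, all function names and sizes are hypothetical, and the non-negative least-squares step is solved approximately with a simple projected-gradient loop as a stand-in, not with the block principal pivoting algorithm cited above. It shows that the stacked least-squares value equals the terms of the iRNMF objective that depend on the dataset-specific matrix, and that a warm-started update does not increase those terms.</p>
<preformat preformat-type="code">
```python
import numpy as np

def nnls_projected_gradient(A, B, Z0, n_iter=500):
    """Approximately minimize ||A Z - B||_F^2 over nonnegative Z.

    Illustrative stand-in solver (projected gradient), not the block
    principal pivoting method cited in the text.
    """
    AtA = A.T @ A
    AtB = A.T @ B
    step = 1.0 / (np.linalg.norm(AtA, 2) + 1e-12)  # 1 / largest eigenvalue of A^T A
    Z = Z0.copy()
    for _ in range(n_iter):
        grad = AtA @ Z - AtB                  # half the gradient of the objective
        Z = np.maximum(Z - step * grad, 0.0)  # gradient step, then clip negatives to 0
    return Z

def stacked_V_problem(X, H, W, lam1):
    """Build the stacked matrices of the V_i subproblem."""
    A = np.vstack([H, np.sqrt(lam1) * H])         # shape (2 n_i, m)
    B = np.vstack([X - H @ W, np.zeros_like(X)])  # shape (2 n_i, g)
    return A, B

def v_terms(X, H, W, V, lam1):
    """Terms of the iRNMF objective that depend on V_i."""
    fit = np.sum((X - H @ (W + V)) ** 2)
    penalty = lam1 * np.sum((H @ V) ** 2)
    return float(fit + penalty)

# Toy problem with hypothetical sizes and random nonnegative matrices.
rng = np.random.default_rng(1)
n_i, g, m, lam1 = 8, 6, 3, 0.5
X = rng.random((n_i, g))
H = rng.random((n_i, m))
W = rng.random((m, g))
V_old = rng.random((m, g))

A, B = stacked_V_problem(X, H, W, lam1)
stacked_value = float(np.sum((A @ V_old - B) ** 2))  # same quantity as v_terms at V_old
V_new = nnls_projected_gradient(A, B, V_old)         # warm start from the current V_i
before = v_terms(X, H, W, V_old, lam1)
after = v_terms(X, H, W, V_new, lam1)
```
</preformat>
<p>The updates for the shared matrix and the sample-factor matrices follow the same pattern with their own stacked matrices, so a full iteration would cycle through all 2<italic>k</italic> &#x2b; 1 blocks in turn.</p>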
</sec>
<sec id="s2-2">
<title>Gene Expression Data</title>
<p>We downloaded single-nucleus gene expression data derived from the middle temporal gyrus (MTG) of the human cortex (<xref ref-type="bibr" rid="B8">Hodge et al., 2019</xref>) from the Allen Institute for Brain Science. The dataset includes 15,928 nuclei sampled from eight human donor brains, of which 15,206 were from postmortem donors with no known neuropsychiatric or neurological conditions and 722 were from distal, normal tissue of neurosurgical donors. We preprocessed the data with the R packages scater (<xref ref-type="bibr" rid="B14">McCarthy et al., 2017</xref>) and scran (<xref ref-type="bibr" rid="B12">Lun et al., 2016</xref>); preprocessing included quality control of nuclei and genes and the removal of a minority of nuclei assigned to different cell cycle phases by the cyclone function in scran. Nuclear and mitochondrial genes downloaded from Human MitoCarta2.0 (<xref ref-type="bibr" rid="B2">Calvo et al., 2016</xref>) were excluded, and protein-coding genes were retained. After removing the nuclei not assigned to any specific cell type, we obtained the expression levels of 17,120 protein-coding genes in 12,246 nuclei. Then, we used scran to obtain 7,011 highly variable protein-coding genes across all nuclei, which were defined as genes with biological components that are significantly greater than zero at a false discovery rate (FDR) of 0.1. After removing the cell types containing fewer than 20 nuclei, we obtained the gene expression data of nuclei from glutamatergic neuron (Gluta), GABAergic interneuron (GABA), astrocyte (Ast), oligodendrocyte (Oli), and oligodendrocyte precursor cell (OPC), including 8,994, 2,762, 227, 112, and 133 nuclei, respectively. The gene expression of 7,011 highly variable protein-coding genes in these five cell types was used for analyses.</p>
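<p>The final filtering step, which removes cell types with fewer than 20 nuclei, can be illustrated with a short sketch. This is not the preprocessing code used in the study, which relied on R packages; the function name is hypothetical, and the label list is reconstructed from the per-type nuclei counts reported above, with one hypothetical rare type added purely to show the filter taking effect.</p>
<preformat preformat-type="code">
```python
from collections import Counter

MIN_NUCLEI = 20  # threshold stated in the text

def filter_cell_types(labels, min_nuclei=MIN_NUCLEI):
    """Drop nuclei whose cell type has fewer than min_nuclei members."""
    counts = Counter(labels)
    return [label for label in labels if counts[label] >= min_nuclei]

# Nuclei counts per retained cell type as reported in the text, plus one
# hypothetical rare type ("RareType", 5 nuclei) added only for illustration.
reported = {"Gluta": 8994, "GABA": 2762, "Ast": 227, "Oli": 112, "OPC": 133}
labels = [name for name, n in reported.items() for _ in range(n)]
labels += ["RareType"] * 5

kept = filter_cell_types(labels)
kept_counts = Counter(kept)
```
</preformat>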
</sec>
<sec id="s2-3">
<title>Determination of Parameters</title>
<p>To determine the number of factors/gene modules <italic>m</italic>, we followed the approach used in LIGER, applying the Kullback&#x2013;Leibler (KL) divergence as a criterion. When the number of factors is too low, each factor will include many genes and samples will load on many factors, so that the distribution of factor loadings for a given sample approaches a uniform distribution (<xref ref-type="bibr" rid="B18">Welch et al., 2019</xref>). As the number of factors approaches the true number of gene modules, each sample will generally load on only a few factors. Therefore, we calculated the KL divergence of the factor loadings of each sample from a uniform distribution, plotted the median across samples as a function of <italic>m</italic>, and selected the saturation point of the curve as the optimal <italic>m</italic>. We also considered the mean squared error (MSE) between <inline-formula id="inf15">
<mml:math id="m22">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and the reconstructed data <inline-formula id="inf16">
<mml:math id="m23">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>, i.e., <inline-formula id="inf17">
<mml:math id="m24">
<mml:mrow>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x00D7;</mml:mo>
<mml:mi>g</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, to help determine the optimal <italic>m</italic>. To select the regularization parameters <inline-formula id="inf18">
<mml:math id="m25">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf19">
<mml:math id="m26">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, we applied the alignment metric (<xref ref-type="bibr" rid="B1">Butler et al., 2018</xref>) as a criterion, which LIGER also used, and plotted the alignment metric as a function of a combination of <inline-formula id="inf20">
<mml:math id="m27">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf21">
<mml:math id="m28">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> to choose the point at which the alignment metric reaches the minimum value. <inline-formula id="inf22">
<mml:math id="m29">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf23">
<mml:math id="m30">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> were each selected from the candidate values 0.01, 0.1, 1, 10, and 100.</p>
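<p>The KL-divergence criterion described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the handling of all-zero rows, and the toy loading matrix are assumptions made for illustration only. Each sample's factor loadings are normalized to a probability vector, compared with the uniform distribution over the <italic>m</italic> factors, and the median across samples is returned.</p>
<preformat preformat-type="code">
```python
import math
from statistics import median


def kl_from_uniform(loadings):
    """KL divergence (natural log) of one sample's normalized loadings from uniform."""
    m = len(loadings)
    total = sum(loadings)
    if total == 0:
        return 0.0  # assumption: an all-zero row is treated as uninformative
    kl = 0.0
    for value in loadings:
        p = value / total
        if p > 0:
            kl += p * math.log(p * m)  # p * log(p / (1 / m))
    return kl


def median_kl(H):
    """Median KL divergence across the rows (samples) of a loading matrix H."""
    return median(kl_from_uniform(row) for row in H)


# Hypothetical toy loadings: two samples, four factors.
H_toy = [
    [1.0, 1.0, 1.0, 1.0],  # uniform loadings, divergence 0
    [4.0, 0.0, 0.0, 0.0],  # a single dominant factor, divergence log(4)
]
print(median_kl(H_toy))
```
</preformat>
<p>In such a sketch, the median would be recomputed for each candidate <italic>m</italic> and plotted to locate the saturation point of the curve.</p>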
</sec>
<sec id="s2-4">
<title>Gene Module Analyses</title>
<p>We used iRNMF to analyze gene expression datasets of multiple cell types derived from human MTG. After obtaining the cell-type-specific and shared gene module matrices <inline-formula id="inf24">
<mml:math id="m31">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mtext>&#xa0;</mml:mtext>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf25">
<mml:math id="m32">
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, we calculated the z-scores of genes for each gene module, and the genes with z-scores larger than one were regarded as module genes. Modules with at least 20 module genes were reported. Gene modules significantly enriched with ASD genes were regarded as ASD-associated gene modules. ASD candidate genes were downloaded from the Simons Foundation Autism Research Initiative (SFARI; version of 2 September 2021). We identified ASD-associated gene modules by hypergeometric tests and corrected for multiple testing with the Bonferroni method (<xref ref-type="bibr" rid="B16">Rupert, 2012</xref>). Gene Ontology (GO) analysis was performed using the R package clusterProfiler (<xref ref-type="bibr" rid="B20">Yu et al., 2012</xref>), with the genes in the analyzed expression matrix used as the background. GO terms with an FDR-adjusted p-value &#x3c; 0.1 and for which the number of genes in the term is at least ten were reported.</p>
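<p>The module-gene selection and enrichment test described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the code used in this study: the function names are hypothetical, the z-score uses the population standard deviation, and the enrichment p-value is taken as the upper tail of the hypergeometric distribution.</p>
<preformat preformat-type="code">
```python
from math import comb, isclose
from statistics import mean, pstdev


def module_genes(weights, genes, z_cutoff=1.0):
    """Genes whose z-score within one module (a row of W or V_i) exceeds z_cutoff."""
    mu = mean(weights)
    sd = pstdev(weights)  # assumption: population standard deviation
    if sd == 0:
        return []
    return [g for g, w in zip(genes, weights) if (w - mu) / sd > z_cutoff]


def hypergeom_pvalue(overlap, module_size, n_asd, n_background):
    """Upper-tail probability P(X >= overlap) of the hypergeometric distribution."""
    upper = min(module_size, n_asd)
    total = comb(n_background, module_size)
    tail = sum(
        comb(n_asd, k) * comb(n_background - n_asd, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def bonferroni(pvalues):
    """Bonferroni adjustment: multiply by the number of tests and cap at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]


# Hypothetical toy module: only gene "e" has a z-score above 1.
print(module_genes([0, 0, 0, 0, 10], ["a", "b", "c", "d", "e"]))
# Hypothetical toy test: 3 of 3 module genes are among 4 ASD genes in a 10-gene background.
print(hypergeom_pvalue(3, 3, 4, 10))
```
</preformat>
<p>Exact integer arithmetic through <monospace>math.comb</monospace> avoids an external dependency; a statistics library would be the more usual choice for backgrounds of several thousand genes.</p>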
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec id="s3-1">
<title>Overall Analytical Procedure</title>
<p>We proposed integrative regularized non-negative matrix factorization (iRNMF) to learn homogeneous and heterogeneous gene expression patterns across multiple datasets. Single-nucleus gene expression datasets of multiple cell types of human MTG (<xref ref-type="bibr" rid="B8">Hodge et al., 2019</xref>) were analyzed using iRNMF, involving glutamatergic neuron (Gluta), GABAergic interneuron (GABA), astrocyte (Ast), oligodendrocyte (Oli), and oligodendrocyte precursor cell (OPC) denoted by <inline-formula id="inf26">
<mml:math id="m33">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mn>5</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf27">
<mml:math id="m34">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf28">
<mml:math id="m35">
<mml:mrow>
<mml:mo>&#xa0;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mn>5</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>g</italic> denotes the number of genes and <inline-formula id="inf29">
<mml:math id="m36">
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the number of samples in the <italic>i</italic>th cell type. iRNMF decomposed each gene expression dataset, <inline-formula id="inf30">
<mml:math id="m37">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, into three low-dimensional matrices, including the representation of samples in the low-dimensional space <inline-formula id="inf31">
<mml:math id="m38">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> and the cell type-specific and shared gene module matrices <inline-formula id="inf32">
<mml:math id="m39">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf33">
<mml:math id="m40">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, respectively, where <italic>m</italic> denotes the number of factors/gene modules. As we study the shared and cell type-specific gene expression patterns across cells, we mainly focus on <inline-formula id="inf34">
<mml:math id="m41">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf35">
<mml:math id="m42">
<mml:mi>W</mml:mi>
</mml:math>
</inline-formula>. Based on <inline-formula id="inf36">
<mml:math id="m43">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf37">
<mml:math id="m44">
<mml:mi>W</mml:mi>
</mml:math>
</inline-formula>, for each gene module, we first calculated the z-scores of genes and determined the module genes as those with z-score &#x3e; 1. Modules with at least 20 module genes were reported. The gene modules determined from <italic>W</italic> were regarded as shared gene modules, and those determined from <inline-formula id="inf38">
<mml:math id="m45">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> were regarded as cell-type-specific gene modules. Then, we identified the gene modules significantly enriched with SFARI ASD candidate genes using hypergeometric tests. The gene modules whose Bonferroni-adjusted hypergeometric test p-values &#x3c; 0.1 were reported as ASD-associated gene modules. By analyzing the shared and cell-type-specific ASD-associated gene modules, we study the shared and cell-type-specific dysregulated gene expression across different cells in ASD.</p>
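<p>To make the matrix dimensions of the decomposition concrete, the following minimal Python sketch reconstructs each dataset as the product of its sample representation with the sum of the shared and cell-type-specific module matrices, and evaluates the reconstruction error summed over datasets and scaled by the number of samples and genes. It is an illustrative sketch only: matrices are plain lists of rows, the function names are hypothetical, and the toy values are not taken from the data.</p>
<preformat preformat-type="code">
```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]


def reconstruct(H_i, W, V_i):
    """Reconstruct one dataset as H_i (W + V_i); H_i is n_i x m, W and V_i are m x g."""
    shared_plus_specific = [
        [w + v for w, v in zip(w_row, v_row)] for w_row, v_row in zip(W, V_i)
    ]
    return matmul(H_i, shared_plus_specific)


def total_mse(X_list, H_list, W, V_list):
    """Sum over datasets of the squared Frobenius error divided by n_i * g."""
    total = 0.0
    for X_i, H_i, V_i in zip(X_list, H_list, V_list):
        X_hat = reconstruct(H_i, W, V_i)
        n_i, g = len(X_i), len(X_i[0])
        sq_err = sum(
            (x - x_hat) ** 2
            for x_row, x_hat_row in zip(X_i, X_hat)
            for x, x_hat in zip(x_row, x_hat_row)
        )
        total += sq_err / (n_i * g)
    return total


# Hypothetical toy example: one dataset, one sample, m = 2 factors, g = 2 genes.
W_toy = [[1.0, 0.0], [0.0, 1.0]]
V_toy = [[0.0, 1.0], [0.0, 0.0]]
H_toy = [[1.0, 2.0]]
X_toy = [[1.0, 2.0]]
print(reconstruct(H_toy, W_toy, V_toy))
print(total_mse([X_toy], [H_toy], W_toy, [V_toy]))
```
</preformat>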
</sec>
<sec id="s3-2">
<title>The Evaluation of Integrative Regularized Non-Negative Matrix Factorization</title>
<p>To show the effectiveness of iRNMF, we compared iRNMF with LIGER (which only imposes regularization on <inline-formula id="inf39">
<mml:math id="m46">
<mml:mrow>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> but not on <inline-formula id="inf40">
<mml:math id="m47">
<mml:mrow>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>W</mml:mi>
<mml:mo>&#x7c;</mml:mo>
<mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>). First, we needed to determine the parameter values for LIGER and iRNMF. KL divergence was used to determine the optimal number of gene modules <italic>m</italic>, and the alignment metric (<xref ref-type="bibr" rid="B1">Butler et al., 2018</xref>) was used to determine the regularization parameters. For LIGER and iRNMF, we plotted the median of KL divergence across samples as a function of <italic>m</italic> to select the saturation point of the curve and also considered mean squared error (MSE), <inline-formula id="inf41">
<mml:math id="m48">
<mml:mrow>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x00D7;</mml:mo>
<mml:mi>g</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msubsup>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> (<xref ref-type="sec" rid="s10">Supplementary Figure S1A</xref>). Thus, <italic>m</italic> was set to 100 for both LIGER and iRNMF. For LIGER, the regularization parameter <inline-formula id="inf42">
<mml:math id="m49">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> was set to 1, which makes the alignment metric reach the minimum value (<xref ref-type="sec" rid="s10">Supplementary Figure S1B</xref>). For iRNMF, we plotted the alignment metric as a function of a combination of <inline-formula id="inf43">
<mml:math id="m50">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf44">
<mml:math id="m51">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. The parameter values <inline-formula id="inf45">
<mml:math id="m52">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 0.01 and <inline-formula id="inf46">
<mml:math id="m53">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 10 can make the alignment metric reach the minimum value, while we noticed that <inline-formula id="inf47">
<mml:math id="m54">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 1 and <inline-formula id="inf48">
<mml:math id="m55">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 10 can give the second smallest alignment metric (<xref ref-type="sec" rid="s10">Supplementary Figure S1C</xref>). The regularization parameter of LIGER was determined as 1, which is actually our <inline-formula id="inf49">
<mml:math id="m56">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>; to better compare iRNMF with LIGER and analyze the effectiveness of the newly added constraint, we chose <inline-formula id="inf50">
<mml:math id="m57">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 1 and <inline-formula id="inf51">
<mml:math id="m58">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> &#x3d; 10 instead. We used the minimum rather than the maximum alignment as the criterion because the input datasets are from different cell types and are not expected to align with one another. The alignment metric measures the uniformity of mixing for multiple samples in the aligned latent space; it should be high when datasets share underlying cell types and low when datasets do not share cognate populations (<xref ref-type="bibr" rid="B1">Butler et al., 2018</xref>; <xref ref-type="bibr" rid="B18">Welch et al., 2019</xref>).</p>
<p>Next, we compared iRNMF with LIGER based on cell representation. For each cell type, we calculated the Pearson correlation between the flattened <inline-formula id="inf52">
<mml:math id="m59">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf53">
<mml:math id="m60">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> (<xref ref-type="sec" rid="s10">Supplementary Figure S2A</xref>) to evaluate the reconstruction. Also, we calculated sample&#x2013;sample distance matrices using <inline-formula id="inf54">
<mml:math id="m61">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf55">
<mml:math id="m62">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> and then flattened the distance matrices to calculate their Pearson correlation (<xref ref-type="sec" rid="s10">Supplementary Figure S2B</xref>). Both correlations are slightly higher for iRNMF than for LIGER. Then, for each cell, we calculated the Pearson correlation between the gene expression levels of this cell in <inline-formula id="inf56">
<mml:math id="m63">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf57">
<mml:math id="m64">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> (<xref ref-type="sec" rid="s10">Supplementary Figure S2C</xref>). We found that iRNMF outperforms LIGER for GABA, whereas the two methods are evenly matched for the other four cell types. To compare the low-dimensional <inline-formula id="inf58">
<mml:math id="m65">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> obtained from LIGER and iRNMF, we combined all <inline-formula id="inf59">
<mml:math id="m66">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> as <inline-formula id="inf60">
<mml:math id="m67">
<mml:mi>H</mml:mi>
</mml:math>
</inline-formula> and performed cell clustering based on <inline-formula id="inf61">
<mml:math id="m68">
<mml:mi>H</mml:mi>
</mml:math>
</inline-formula> to check if different cell types are distinguishable. We performed K-means based on <inline-formula id="inf62">
<mml:math id="m69">
<mml:mi>H</mml:mi>
</mml:math>
</inline-formula> and calculated the clustering indexes, including ARI (adjusted Rand index), FMI (Fowlkes and Mallows index), JC (Jaccard coefficient), NMI (normalized mutual information), PUR (purity), and SC (silhouette coefficient) (<xref ref-type="sec" rid="s10">Supplementary Figure S2D</xref>). The clustering performance of iRNMF is better than that of LIGER when the input datasets come from different cell types.</p>
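<p>The reconstruction-based comparisons described above rest on Pearson correlations between an original matrix and its reconstruction, computed either on the flattened matrices or row by row (one value per cell). The following minimal Python sketch illustrates both; it is not the code used in this study, the function names are hypothetical, and returning NaN for a constant vector is an assumption.</p>
<preformat preformat-type="code">
```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # assumption: undefined for a constant vector
    return cov / math.sqrt(vx * vy)


def flatten(matrix):
    """Concatenate the rows of a matrix into one list."""
    return [value for row in matrix for value in row]


def flattened_pearson(X, X_hat):
    """Correlation between the flattened original and reconstructed matrices."""
    return pearson(flatten(X), flatten(X_hat))


def per_cell_pearson(X, X_hat):
    """One correlation per row (cell) between original and reconstructed profiles."""
    return [pearson(x_row, x_hat_row) for x_row, x_hat_row in zip(X, X_hat)]


# Hypothetical toy matrices: two cells, two genes.
X_toy = [[1.0, 2.0], [3.0, 4.0]]
X_hat_toy = [[2.0, 1.0], [4.0, 3.0]]
print(flattened_pearson(X_toy, X_hat_toy))
print(per_cell_pearson(X_toy, X_hat_toy))
```
</preformat>
<p>The toy example shows why both views are informative: the flattened correlation is positive although each individual cell profile is anticorrelated with its reconstruction.</p>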
<p>Lastly, we compared the gene modules obtained using LIGER and iRNMF. We calculated gene&#x2013;gene correlation matrices using <inline-formula id="inf63">
<mml:math id="m70">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf64">
<mml:math id="m71">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> and then flattened the correlation matrices to calculate their Pearson correlation (<xref ref-type="sec" rid="s10">Supplementary Figure S2E</xref>). We also calculated gene&#x2013;gene correlation matrices using <inline-formula id="inf65">
<mml:math id="m72">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and low-dimensional gene representation <inline-formula id="inf66">
<mml:math id="m73">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and then calculated their Pearson correlation (<xref ref-type="sec" rid="s10">Supplementary Figure S2F</xref>). Both correlations are higher for iRNMF than for LIGER. Then, for each gene, we calculated the Pearson correlation between the expression levels of this gene in <inline-formula id="inf67">
<mml:math id="m74">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf68">
<mml:math id="m75">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> (<xref ref-type="sec" rid="s10">Supplementary Figure S2G</xref>). The correlations of iRNMF are significantly higher than those of LIGER in four cell types; the exception is Gluta, for which iRNMF and LIGER are evenly matched. Moreover, we expected that different modules should represent distinct biological functions and should not overlap too much. To evaluate the distinctness of the biological functions of gene modules, we followed the evaluation approach of <xref ref-type="bibr" rid="B3">Cleary et al. (2017</xref>), using the number of uniquely enriched gene sets. For each gene module of <italic>W</italic> and <inline-formula id="inf69">
<mml:math id="m76">
<mml:mrow>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, we tested it for enrichment in GO terms and considered its top five significant GO terms. Then, we identified the uniquely enriched GO terms of each module, that is, the terms enriched in at most one module of the cell type under consideration, and calculated the average number of unique gene sets per module (<xref ref-type="sec" rid="s10">Supplementary Figure S2H</xref>). In all cell types, the number of uniquely enriched GO terms is larger for iRNMF than for LIGER. These comparisons indicate that iRNMF is effective and that the obtained low-dimensional matrices are helpful for subsequent gene module analysis.</p>
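<p>The uniqueness measure described above can be sketched in Python as follows. The sketch assumes that the top enriched GO terms of each module are already available as lists; the function name, module labels, and GO term strings are hypothetical placeholders, not results from this study.</p>
<preformat preformat-type="code">
```python
from collections import Counter


def average_unique_terms(module_terms):
    """Average number of GO terms per module that are enriched in no other module.

    module_terms maps a module name to the list of its top enriched GO terms.
    """
    counts = Counter(
        term for terms in module_terms.values() for term in set(terms)
    )
    unique_per_module = [
        sum(1 for term in set(terms) if counts[term] == 1)
        for terms in module_terms.values()
    ]
    return sum(unique_per_module) / len(unique_per_module)


# Hypothetical toy input: "synapse" is shared, so it is unique to no module.
toy_terms = {
    "M1": ["synapse", "axon"],
    "M2": ["synapse", "actin"],
    "M3": ["synapse"],
}
print(average_unique_terms(toy_terms))
```
</preformat>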
</sec>
<sec id="s3-3">
<title>Shared Gene Expression Patterns Associated With Autism Spectrum Disorder</title>
<p>Among all shared gene modules determined from <italic>W</italic>, 46 are significantly enriched with ASD genes (<xref ref-type="sec" rid="s10">Supplementary Table S1</xref>). For the top ten shared ASD-associated gene modules, we list their Bonferroni-adjusted p-values, top three z-score genes (<xref ref-type="fig" rid="F1">Figure 1A1</xref>), and top one significant GO term (<xref ref-type="fig" rid="F1">Figure 1A2</xref>). Some top genes are ASD genes, including <italic>PDE1C</italic> and <italic>MKX</italic> in W_M42, <italic>GPC6</italic> in W_M10, and <italic>NTNG1</italic> in W_M26. The top significant GO terms of these modules are all related to synapses, whose dysregulation has been reported to be associated with ASD. Then, we checked which kinds of GO terms are the most common among all GO terms of all shared ASD-associated gene modules and found that the top ten common GO terms are also associated with the functioning of synapses, appearing in all shared ASD-associated gene modules (<xref ref-type="fig" rid="F1">Figure 1B1</xref>). Next, we focused on the modules that have module-specific gene functions, by removing the GO terms repeated between gene modules. There are 36 shared ASD-associated gene modules with module-specific GO terms (<xref ref-type="sec" rid="s10">Supplementary Table S2</xref>). The top ten modules with module-specific gene functions are also the ones shown in <xref ref-type="fig" rid="F1">Figure 1A1</xref>, and their top one module-specific GO term is shown in <xref ref-type="fig" rid="F1">Figure 1B2</xref>. The top three modules most significantly enriched with ASD genes, W_M78, W_M91, and W_M42, are related to cortical actin cytoskeleton, heparan sulfate proteoglycan metabolic process, and regulation of sodium ion transmembrane transport, respectively. 
The actin cytoskeleton has been associated with ASD, and targeting actin regulators has been proposed as a strategy for ASD treatment (<xref ref-type="bibr" rid="B5">Duffney et al., 2015</xref>; <xref ref-type="bibr" rid="B7">Hlushchenko et al., 2018</xref>). A lack of heparan sulfate, a proteoglycan involved in a variety of neurodevelopmental processes, has been correlated with ASD (<xref ref-type="bibr" rid="B9">Irie et al., 2012</xref>; <xref ref-type="bibr" rid="B15">P&#xe9;rez et al., 2016</xref>). Ion channels, including sodium, calcium, and potassium channels, are implicated in the etiology of ASD (<xref ref-type="bibr" rid="B4">Daghsni et al., 2018</xref>). These links to the literature support the biological relevance of the identified gene modules.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Top ten shared ASD-associated gene modules along with <bold>(A1)</bold> their Bonferroni-adjusted <italic>p</italic>-values, top three z-score genes, and <bold>(A2)</bold> top one significant GO term. <bold>(B1)</bold> Top ten common enriched GO terms among all shared ASD-associated gene modules along with the frequency of occurrence and the total number of gene modules. <bold>(B2)</bold> Module-specific top one GO term of the top ten shared ASD-associated modules. The SFARI ASD genes are bold. The Bonferroni-adjusted p-values were derived from the hypergeometric tests using module genes and ASD genes.</p>
</caption>
<graphic xlink:href="fgene-13-865371-g001.tif"/>
</fig>
</sec>
<sec id="s3-4">
<title>Cell-Type-Specific Gene Expression Patterns Associated With Autism Spectrum Disorder</title>
<p>Among all cell-type-specific gene modules, we identified 11, 25, 29, 45, and 14 cell-type-specific ASD-associated gene modules for Ast, GABA, Gluta, Oli, and OPC, respectively (<xref ref-type="sec" rid="s10">Supplementary Table S1</xref>). We list the top ten significant gene modules along with their Bonferroni-adjusted p-values, top three z-score genes, and top one significant GO term (<xref ref-type="fig" rid="F2">Figure 2</xref>). Note that for the two kinds of neurons, GABA and Gluta, the cell-type-specific ASD-associated gene modules are more significantly enriched with ASD genes, and more of their top three genes are ASD genes, compared with glial cells. Many of the top GO terms of cell-type-specific ASD-associated gene modules are related to synapses, while different gene functions may still be dysregulated in different cell types. For instance, gamma-tubulin complex, cadherin binding, and protein tyrosine kinase activity are associated with Ast-specific ASD-associated gene modules; regulation of microtubule cytoskeleton organization, phosphatase binding, and desmosome are significant in OPC-specific ASD-associated gene modules. These results suggest that different gene functions may be dysregulated in ASD in different cell types, consistent with the cell heterogeneity of ASD.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Top ten cell-type-specific ASD-associated gene modules along with their Bonferroni-adjusted hypergeometric test p-values, top three z-score genes, and top one significant GO term. The SFARI ASD genes are bold. The Bonferroni-adjusted p-values were derived from the hypergeometric tests using module genes and ASD genes.</p>
</caption>
<graphic xlink:href="fgene-13-865371-g002.tif"/>
</fig>
<p>Then, we checked which kinds of GO terms are the most common among all cell-type-specific ASD-associated gene modules in each cell type. Indeed, the functioning of synapses is important across all cell types (<xref ref-type="fig" rid="F3">Figure 3A</xref>). Next, we focused on the modules that have module-specific gene functions. There are 7, 23, 24, 24, and 8 cell-type-specific ASD-associated gene modules left in Ast, GABA, Gluta, Oli, and OPC, respectively (<xref ref-type="sec" rid="s10">Supplementary Table S2</xref>). We report the top ten, along with their top three z-score genes (<xref ref-type="fig" rid="F3">Figure 3B</xref>) and top one GO term (<xref ref-type="fig" rid="F3">Figure 3C</xref>). In Ast, locomotory behavior, integral component of the postsynaptic membrane, and cadherin binding are functions specific to the top three modules, Ast_M99, Ast_M63, and Ast_M39. For GABA, inhibitory extracellular ligand-gated ion channel activity is specific to GABA_M7. In contrast, the modulation of excitatory postsynaptic potential and ionotropic glutamate receptor signaling pathway are specific to Gluta_M10 and Gluta_M99, respectively. These gene functions are clearly associated with the particular cell types. Neurons communicate with one another at synapses using two types of signals, electrical and chemical. At an electrical synapse, ions flow directly between cells. At a chemical synapse, neurotransmitters pass messages from the presynaptic to the postsynaptic neuron. The major excitatory and inhibitory neurotransmitters in the brain are glutamate and GABA (gamma-aminobutyric acid), respectively. For Oli, regulation of dendrite morphogenesis and regulation of gliogenesis are specific to Oli_M82 and Oli_M97. For OPC, protein homooligomerization and endoplasmic reticulum unfolded protein response are specific to the top two modules, OPC_M34 and OPC_M54, respectively. 
The analysis of module-specific gene functions and top genes of cell-type-specific ASD-associated gene modules may facilitate the development of more targeted biomarkers and treatments for ASD.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>
<bold>(A)</bold> Top ten common enriched GO terms among all cell-type-specific ASD-associated gene modules along with the frequency of occurrence and the total number of cell-type-specific ASD-associated gene modules. For the top ten cell-type-specific ASD-associated modules, which have module-specific GO terms, <bold>(B)</bold> their Bonferroni-adjusted hypergeometric test p-values, top three z-score genes, and <bold>(C)</bold> top one GO term are shown. The SFARI ASD genes are bold. The Bonferroni-adjusted p-values were derived from the hypergeometric tests using module genes and ASD genes.</p>
</caption>
<graphic xlink:href="fgene-13-865371-g003.tif"/>
</fig>
<p>Next, we examined the modules whose gene functions are both cell-type-specific and module-specific, that is, modules enriched for GO terms that appear in only one module of one cell type. In GABA, Gluta, Oli, and OPC, there are 14, 18, 1, and 1 such cell-type-specific ASD-associated gene modules, respectively (<xref ref-type="sec" rid="s10">Supplementary Table S3</xref>). More modules with cell-type-specific and module-specific gene functions are found in neuronal cells, again suggesting that neurons are the cells mainly affected in ASD. For the cell types with more than one such module, we show the top ten modules along with their Bonferroni-adjusted <italic>p</italic>-values, the top enriched GO term, and the three genes with the highest z-scores (<xref ref-type="fig" rid="F4">Figure 4</xref>). Among the top three genes, <italic>CTNNA2</italic> in GABA_M42, <italic>ZBTB2</italic> in GABA_M40, and <italic>SLC9A9</italic> in GABA_M75 are ASD genes, as are <italic>MKX</italic> and <italic>PDE1C</italic> in Gluta_M42, <italic>GPC6</italic> and <italic>CUX2</italic> in Gluta_M10, and <italic>PXDN</italic> and <italic>NFIA</italic> in Gluta_M99. These gene modules may deserve particular attention. Different kinds of gene functions are specific to the ASD-associated modules of different cell types. GABA-specific ASD-associated gene modules are enriched for functions such as inhibitory extracellular ligand-gated ion channel activity and forebrain neuron differentiation. Gluta-specific ASD-associated gene modules are enriched for functions such as nerve growth factors, excitatory postsynaptic potential, and ionotropic glutamate receptor signaling pathway. Oli_M60 and OPC_M12 each have a cell-type-specific and module-specific function, namely regulation of bone mineralization and lipid transporter activity, respectively (<xref ref-type="sec" rid="s10">Supplementary Table S3</xref>). These results indicate that different kinds of gene functions may be specifically dysregulated in different cell types in ASD, highlighting the cell heterogeneity of ASD.</p>
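<p>To illustrate how a Bonferroni-adjusted hypergeometric <italic>p</italic>-value of this kind can be computed, a minimal Python sketch follows. It is not the code used in this study: the function names, the choice of background gene set, and the toy numbers are assumptions made for illustration only.</p>
<preformat>
```python
from math import comb


def hypergeom_upper_tail(overlap, module_size, asd_in_background, background_size):
    """Probability of seeing at least `overlap` ASD genes in a module.

    The module is treated as `module_size` genes drawn without replacement
    from `background_size` genes, of which `asd_in_background` are ASD genes.
    """
    total = comb(background_size, module_size)
    upper = min(module_size, asd_in_background)
    hits = sum(
        comb(asd_in_background, i)
        * comb(background_size - asd_in_background, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy numbers (hypothetical, not taken from the study): three modules of
# 50 genes each, 100 ASD genes among 1000 background genes.
raw = [hypergeom_upper_tail(k, 50, 100, 1000) for k in (5, 12, 20)]
adjusted = bonferroni(raw)
```
</preformat>
<p>In this sketch, a module would be called ASD-associated when its adjusted value falls below a chosen significance threshold; the threshold and the background gene set are choices that the sketch leaves open.</p>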
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Cell-type-specific ASD-associated modules that have both module-specific and cell-type-specific GO terms, along with <bold>(A)</bold> their Bonferroni-adjusted hypergeometric test <italic>p</italic>-values, <bold>(B)</bold> their top enriched GO term, and <bold>(C)</bold> their three genes with the highest z-scores. SFARI ASD genes are shown in bold. The Bonferroni-adjusted <italic>p</italic>-values were derived from hypergeometric tests on the overlap between module genes and ASD genes.</p>
</caption>
<graphic xlink:href="fgene-13-865371-g004.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>Because the brain is highly heterogeneous, brain-related diseases are often characterized by cell heterogeneity and mainly affect specific cell types. To study the common and cell-type-specific gene expression patterns across brain cell types, we proposed iRNMF, which builds on iNMF and imposes a further regularization. With iRNMF, we analyzed the gene expression data of multiple human brain cell types to obtain shared and cell-type-specific gene modules and cell-type-specific cell representations. Comparison with LIGER in terms of cell representations and gene modules showed that iRNMF is effective and that the resulting low-dimensional matrices benefit downstream analyses, especially gene module analyses.</p>
<p>Using curated ASD candidate genes, we identified shared and cell-type-specific ASD-associated gene modules. The significant gene functions of the shared ASD-associated gene modules are mostly related to synaptic function, which has already been associated with ASD. We then identified module-specific gene functions, including cortical actin cytoskeleton, heparan sulfate proteoglycan metabolic process, and regulation of sodium ion transmembrane transport. Among the cell-type-specific ASD-associated gene modules, the GABA-specific and Gluta-specific modules are more significantly enriched with ASD genes, and more of their top three genes are ASD genes, than is the case for glial cells, suggesting that neurons are the cells mainly affected in ASD. Many top GO terms of cell-type-specific ASD-associated gene modules are related to synapses, but different gene functions may still be specifically dysregulated in ASD in different cell types. We therefore focused on the functions that are specific to both modules and cell types. Inhibitory extracellular ligand-gated ion channel activity and forebrain neuron differentiation are specifically significant in GABA; nerve growth factor, excitatory postsynaptic potential, and ionotropic glutamate receptor signaling pathway are specifically related to Gluta; and lipid transporter activity is specifically significant in OPC.</p>
<p>By analyzing the gene functions and the top-ranked genes of shared and cell-type-specific ASD-associated gene modules, we studied the shared and cell-type-specific dysregulated gene expression patterns in ASD. In particular, we highlighted the shared ASD-associated gene modules that have module-specific gene functions and the cell-type-specific ASD-associated gene modules that have both module-specific and cell-type-specific gene functions. Analyzing these gene modules may facilitate the development of more targeted biomarkers and treatments for ASD. Our results provide new insights into the molecular mechanisms, pathogenesis, and cell heterogeneity of ASD. Our method can also be used to extract homogeneous and heterogeneous patterns from data collected under multiple biological conditions, such as different cancer types or subtypes and different cell lines.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s10">Supplementary Material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>JG conceived and designed the study. JG, YZ, and YK conducted the analyses. JG, YZ, and GJ wrote the manuscript. All authors approved the final manuscript.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This study was supported by the National Natural Science Foundation of China (Nos. 61803320 and 61573296).</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s10">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2022.865371/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fgene.2022.865371/full&#x23;supplementary-material</ext-link>
<supplementary-material>
<label>Supplementary Figure S1</label>
<caption>
<p>Selection of parameter values of iRNMF and LIGER. <bold>(A)</bold> Selection of <italic>m</italic> using mean squared error (MSE) and KL divergence as criteria for iRNMF and LIGER. <bold>(B)</bold> Selection of the regularization parameter <inline-formula id="inf70">
<mml:math id="m77">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> using alignment metric for LIGER. <bold>(C)</bold> Selection of the combination of regularization parameters (<inline-formula id="inf71">
<mml:math id="m78">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf72">
<mml:math id="m79">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>) using alignment metric for iRNMF.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure S2</label>
<caption>
<p>Comparisons between iRNMF and LIGER. <bold>(A)</bold> Pearson correlation coefficient between original data <inline-formula id="inf73">
<mml:math id="m80">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and reconstructed data <inline-formula id="inf74">
<mml:math id="m81">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(B)</bold> Pearson correlation coefficient between sample&#x2013;sample distance matrices calculated from <inline-formula id="inf75">
<mml:math id="m82">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf76">
<mml:math id="m83">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(C)</bold> Pearson correlation between the gene expression levels of each cell in <inline-formula id="inf77">
<mml:math id="m84">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf78">
<mml:math id="m85">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(D)</bold> Clustering performance based on <italic>H</italic>, measured by ARI, FMI, JC, NMI, PUR, and SC. <bold>(E)</bold> Pearson correlation coefficient between gene&#x2013;gene correlation matrices calculated from <inline-formula id="inf79">
<mml:math id="m86">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf80">
<mml:math id="m87">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(F)</bold> Pearson correlation coefficient between gene&#x2013;gene correlation matrices calculated from <inline-formula id="inf81">
<mml:math id="m88">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and low-dimensional gene representation <inline-formula id="inf82">
<mml:math id="m89">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>V</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(G)</bold> Pearson correlation between the expression levels of each gene in <inline-formula id="inf83">
<mml:math id="m90">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf84">
<mml:math id="m91">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(H)</bold> Number of uniquely enriched GO terms. ns denotes not significant; &#x2a; denotes <italic>p</italic> &#x3c; 0.05, &#x2a;&#x2a; denotes <italic>p</italic> &#x3c; 0.01, &#x2a;&#x2a;&#x2a; denotes <italic>p</italic> &#x3c; 0.001, and &#x2a;&#x2a;&#x2a;&#x2a; denotes <italic>p</italic> &#x3c; 0.0001.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Table S1</label>
<caption>
<p>Shared and cell-type-specific ASD-associated gene modules. For these modules, their Bonferroni-adjusted hypergeometric <italic>p</italic>-values, module genes sorted by z-scores, and enriched gene functions are listed.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Table S2</label>
<caption>
<p>Shared and cell-type-specific ASD-associated gene modules that have module-specific gene functions. For these modules, their Bonferroni-adjusted hypergeometric <italic>p</italic>-values, module genes sorted by z-scores, and enriched module-specific gene functions are listed.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Table S3</label>
<caption>
<p>Cell-type-specific ASD-associated gene modules that have both module-specific and cell-type-specific gene functions. For these modules, their Bonferroni-adjusted hypergeometric <italic>p</italic>-values, module genes sorted by z-scores, and enriched module-specific and cell-type-specific gene functions are listed.</p>
</caption>
</supplementary-material>
</p>
<supplementary-material xlink:href="Table2.XLSX" id="SM1" mimetype="application/XLSX" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table3.XLSX" id="SM2" mimetype="application/XLSX" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image1.TIF" id="SM3" mimetype="application/TIF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table1.XLSX" id="SM4" mimetype="application/XLSX" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image2.TIFF" id="SM5" mimetype="application/TIFF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Butler</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Hoffman</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Smibert</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Papalexi</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Satija</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Integrating Single-Cell Transcriptomic Data across Different Conditions, Technologies, and Species</article-title>. <source>Nat. Biotechnol.</source> <volume>36</volume>, <fpage>411</fpage>&#x2013;<lpage>420</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.4096</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calvo</surname>
<given-names>S. E.</given-names>
</name>
<name>
<surname>Clauser</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Mootha</surname>
<given-names>V. K.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>MitoCarta2.0: an Updated Inventory of Mammalian Mitochondrial Proteins</article-title>. <source>Nucleic Acids Res.</source> <volume>44</volume> (<issue>D1</issue>), <fpage>D1251</fpage>&#x2013;<lpage>D1257</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkv1003</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cleary</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Cong</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Cheung</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lander</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Regev</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Efficient Generation of Transcriptomic Profiles by Random Composite Measurements</article-title>. <source>Cell</source> <volume>171</volume> (<issue>6</issue>), <fpage>1424</fpage>&#x2013;<lpage>1436</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2017.10.023</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Daghsni</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rima</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Fajloun</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Ronjat</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Brus&#xe9;s</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>M&#x27;Rad</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Autism throughout Genetics: Perusal of the Implication of Ion Channels</article-title>. <source>Brain Behav.</source> <volume>8</volume> (<issue>8</issue>), <fpage>e00978</fpage>. <pub-id pub-id-type="doi">10.1002/brb3.978</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duffney</surname>
<given-names>L. J.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Matas</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Autism-like Deficits in Shank3-Deficient Mice Are Rescued by Targeting Actin Regulators</article-title>. <source>Cell Rep.</source> <volume>11</volume> (<issue>9</issue>), <fpage>1400</fpage>&#x2013;<lpage>1413</lpage>. <pub-id pub-id-type="doi">10.1016/j.celrep.2015.04.064</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guan</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ji</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>An Analytical Method for the Identification of Cell Type-specific Disease Gene Modules</article-title>. <source>J. Transl. Med.</source> <volume>19</volume> (<issue>1</issue>), <fpage>20</fpage>. <pub-id pub-id-type="doi">10.1186/s12967-020-02690-5</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hlushchenko</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Khanal</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Abouelezz</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Paavilainen</surname>
<given-names>V. O.</given-names>
</name>
<name>
<surname>Hotulainen</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>ASD-associated De Novo Mutations in Five Actin Regulators Show Both Shared and Distinct Defects in Dendritic Spines and Inhibitory Synapses in Cultured Hippocampal Neurons</article-title>. <source>Front. Cell. Neurosci.</source> <volume>12</volume>, <fpage>217</fpage>. <pub-id pub-id-type="doi">10.3389/fncel.2018.00217</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hodge</surname>
<given-names>R. D.</given-names>
</name>
<name>
<surname>Bakken</surname>
<given-names>T. E.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>J. A.</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>K. A.</given-names>
</name>
<name>
<surname>Barkan</surname>
<given-names>E. R.</given-names>
</name>
<name>
<surname>Graybuck</surname>
<given-names>L. T.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Conserved Cell Types with Divergent Features in Human versus Mouse Cortex</article-title>. <source>Nature</source> <volume>573</volume>, <fpage>61</fpage>&#x2013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-019-1506-7</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Irie</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Badie-Mahdavi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Yamaguchi</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Autism-like Socio-Communicative Deficits and Stereotypies in Mice Lacking Heparan Sulfate</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>109</volume> (<issue>13</issue>), <fpage>5052</fpage>&#x2013;<lpage>5056</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1117881109</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Algorithms for Nonnegative Matrix and Tensor Factorizations: a Unified View Based on Block Coordinate Descent Framework</article-title>. <source>J. Glob. Optim.</source> <volume>58</volume> (<issue>2</issue>), <fpage>285</fpage>&#x2013;<lpage>319</lpage>. <pub-id pub-id-type="doi">10.1007/s10898-013-0035-4</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kitsak</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Menche</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guney</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Ghiassian</surname>
<given-names>S. D.</given-names>
</name>
<name>
<surname>Loscalzo</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>Tissue Specificity of Human Disease Module</article-title>. <source>Sci. Rep.</source> <volume>6</volume> (<issue>1</issue>), <fpage>35241</fpage>. <pub-id pub-id-type="doi">10.1038/srep35241</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lun</surname>
<given-names>A. T. L.</given-names>
</name>
<name>
<surname>McCarthy</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Marioni</surname>
<given-names>J. C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A Step-by-step Workflow for Low-Level Analysis of Single-Cell RNA-Seq Data with Bioconductor</article-title>. <source>F1000Res</source> <volume>5</volume>, <fpage>2122</fpage>. <pub-id pub-id-type="doi">10.12688/f1000research.9501.2</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mairal</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bach</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Ponce</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sapiro</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Online Learning for Matrix Factorization and Sparse Coding</article-title>. <source>J. Machine Learn. Res.</source> <volume>11</volume> (<issue>1</issue>), <fpage>19</fpage>&#x2013;<lpage>60</lpage>. <pub-id pub-id-type="doi">10.48550/arXiv.0908.0050</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McCarthy</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Campbell</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Lun</surname>
<given-names>A. T. L.</given-names>
</name>
<name>
<surname>Wills</surname>
<given-names>Q. F.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Scater: Pre-processing, Quality Control, Normalization and Visualization of Single-Cell RNA-Seq Data in R</article-title>. <source>Bioinformatics</source> <volume>33</volume> (<issue>8</issue>), <fpage>1179</fpage>&#x2013;<lpage>1186</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btw777</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>P&#xe9;rez</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Sawmiller</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>The Role of Heparan Sulfate Deficiency in Autistic Phenotype: Potential Involvement of Slit/Robo/srGAPs-Mediated Dendritic Spine Formation</article-title>. <source>Neural Dev.</source> <volume>11</volume>, <fpage>11</fpage>. <pub-id pub-id-type="doi">10.1186/s13064-016-0066-x</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Miller</surname>
<given-names>R. G.</given-names>
<suffix>Jr</suffix>
</name>
</person-group> (<year>2012</year>). <source>Simultaneous Statistical Inference</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer Science &#x26; Business Media</publisher-name>. </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rylaarsdam</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Guemez-Gamboa</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Genetic Causes and Modifiers of Autism Spectrum Disorder</article-title>. <source>Front. Cel. Neurosci.</source> <volume>13</volume>, <fpage>385</fpage>. <pub-id pub-id-type="doi">10.3389/fncel.2019.00385</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Welch</surname>
<given-names>J. D.</given-names>
</name>
<name>
<surname>Kozareva</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Ferreira</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Vanderburg</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Macosko</surname>
<given-names>E. Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Single-Cell Multi-Omic Integration Compares and Contrasts Features of Brain Cell Identity</article-title>. <source>Cell</source> <volume>177</volume> (<issue>7</issue>), <fpage>1873</fpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2019.05.006</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Michailidis</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A Non-negative Matrix Factorization Method for Detecting Modules in Heterogeneous Omics Multi-Modal Data</article-title>. <source>Bioinformatics</source> <volume>32</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btv544</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.-G.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>Q.-Y.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters</article-title>. <source>OMICS: A J. Integr. Biol.</source> <volume>16</volume> (<issue>5</issue>), <fpage>284</fpage>&#x2013;<lpage>287</lpage>. <pub-id pub-id-type="doi">10.1089/omi.2011.0118</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Learning Common and Specific Patterns from Data of Multiple Interrelated Biological Scenarios with Matrix Factorization</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume> (<issue>13</issue>), <fpage>6606</fpage>&#x2013;<lpage>6617</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkz488</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>