<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neuroinform.</journal-id>
<journal-title>Frontiers in Neuroinformatics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neuroinform.</abbrev-journal-title>
<issn pub-type="epub">1662-5196</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fninf.2022.856175</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Adaptive Multimodal Neuroimage Integration for Major Depression Disorder Detection</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Qianqian</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1639225/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Li</surname> <given-names>Long</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1771568/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Qiao</surname> <given-names>Lishan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1356918/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Liu</surname> <given-names>Mingxia</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="corresp" rid="c002"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/696936/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>School of Mathematics Science, Liaocheng University</institution>, <addr-line>Liaocheng</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Taian Tumor Prevention and Treatment Hospital</institution>, <addr-line>Taian</addr-line>, <country>China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Radiology and BRIC, University of North Carolina at Chapel Hill</institution>, <addr-line>Chapel Hill, NC</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Antonio Fern&#x000E1;ndez-Caballero, University of Castilla-La Mancha, Spain</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Yu Zhang, Lehigh University, United States; Shu Zhang, Northwestern Polytechnical University, China</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Lishan Qiao <email>qiaolishan&#x00040;lcu.edu.cn</email></corresp>
<corresp id="c002">Mingxia Liu <email>mxliu1226&#x00040;gmail.com</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>29</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>16</volume>
<elocation-id>856175</elocation-id>
<history>
<date date-type="received">
<day>16</day>
<month>01</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>05</day>
<month>04</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Wang, Li, Qiao and Liu.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Wang, Li, Qiao and Liu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>Major depressive disorder (MDD) is one of the most common mental health disorders that can affect sleep, mood, appetite, and behavior of people. Multimodal neuroimaging data, such as functional and structural magnetic resonance imaging (MRI) scans, have been widely used in computer-aided detection of MDD. However, previous studies usually treat these two modalities separately, without considering their potentially complementary information. Even though a few studies propose integrating these two modalities, they usually suffer from significant inter-modality data heterogeneity. In this paper, we propose an adaptive multimodal neuroimage integration (AMNI) framework for automated MDD detection based on functional and structural MRIs. The AMNI framework consists of four major components: (1) a graph convolutional network to learn feature representations of functional connectivity networks derived from functional MRIs, (2) a convolutional neural network to learn features of T1-weighted structural MRIs, (3) a feature adaptation module to alleviate inter-modality difference, and (4) a feature fusion module to integrate feature representations extracted from two modalities for classification. To the best of our knowledge, this is among the first attempts to adaptively integrate functional and structural MRIs for neuroimaging-based MDD analysis by explicitly alleviating inter-modality heterogeneity. Extensive evaluations are performed on 533 subjects with resting-state functional MRI and T1-weighted MRI, with results suggesting the efficacy of the proposed method.</p></abstract>
<kwd-group>
<kwd>major depressive disorder</kwd>
<kwd>resting-state functional MRI</kwd>
<kwd>structural MRI</kwd>
<kwd>feature adaptation</kwd>
<kwd>multimodal data fusion</kwd>
</kwd-group>
<counts>
<fig-count count="8"/>
<table-count count="6"/>
<equation-count count="13"/>
<ref-count count="74"/>
<page-count count="15"/>
<word-count count="10500"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Major depressive disorder (MDD) is one of the most common mental health disorders, affecting as many as 300 million people annually (Organization et al., <xref ref-type="bibr" rid="B63">2017</xref>). This disease is generally characterized by depressed mood, diminished interests, and impaired cognitive function (Alexopoulos, <xref ref-type="bibr" rid="B1">2005</xref>; Pizzagalli et al., <xref ref-type="bibr" rid="B45">2008</xref>; Otte et al., <xref ref-type="bibr" rid="B41">2016</xref>). Despite decades of research in basic science, clinical neuroscience, and psychiatry, the pathological and biological mechanisms of major depression remain unclear (Holtzheimer III and Nemeroff, <xref ref-type="bibr" rid="B25">2006</xref>). The traditional diagnosis of MDD mainly depends on criteria from the Diagnostic and Statistical Manual of Mental Disorders (DSM) and on treatment response (Papakostas, <xref ref-type="bibr" rid="B42">2009</xref>), which can be subjective and error-prone. As a robust complement to clinical neurobehavior-based detection, data-driven computer-aided diagnosis holds the promise of objective diagnosis and prognosis of mental disorders (Foti et al., <xref ref-type="bibr" rid="B15">2014</xref>; Liu and Zhang, <xref ref-type="bibr" rid="B38">2014</xref>; Bron et al., <xref ref-type="bibr" rid="B6">2015</xref>; Shi et al., <xref ref-type="bibr" rid="B52">2018</xref>; Zhang L. et al., <xref ref-type="bibr" rid="B71">2020</xref>; Buch and Liston, <xref ref-type="bibr" rid="B8">2021</xref>).</p>
<p>Multiple neuroimaging modalities, such as resting-state functional magnetic resonance imaging (rs-fMRI) and structural MRI (sMRI), can provide complementary information in discovering objective disease biomarkers, and have been increasingly employed in automated diagnosis of various brain disorders (Hinrichs et al., <xref ref-type="bibr" rid="B24">2011</xref>). Resting-state fMRI helps capture large-scale abnormality or dysfunction in the functional connectivity network (FCN) by measuring blood-oxygen-level-dependent (BOLD) signals of subjects (Van Den Heuvel and Pol, <xref ref-type="bibr" rid="B58">2010</xref>; Wang et al., <xref ref-type="bibr" rid="B60">2019</xref>; Zhang Y. et al., <xref ref-type="bibr" rid="B73">2020</xref>; Sun et al., <xref ref-type="bibr" rid="B56">2021</xref>), and thus can dynamically measure the hemodynamic response related to neural activity in the brain. Structural MRI provides relatively high-resolution structural information of the brain, enabling us to study pathological changes in different brain tissues, such as gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) (Cuadra et al., <xref ref-type="bibr" rid="B13">2005</xref>). It is critical to integrate rs-fMRI and sMRI data to facilitate automated diagnosis of MDD and related disorders.</p>
<p>Existing neuroimaging-based MDD studies usually focus on discovering structural or functional imaging biomarkers, by employing various machine learning approaches such as support vector machines (SVM), Gaussian process classifier (GPC), linear discriminant analysis (LDA), and deep neural networks (Sato et al., <xref ref-type="bibr" rid="B49">2015</xref>; B&#x000FC;rger et al., <xref ref-type="bibr" rid="B9">2017</xref>; Rubin-Falcone et al., <xref ref-type="bibr" rid="B47">2018</xref>; Li et al., <xref ref-type="bibr" rid="B36">2021</xref>). However, these methods generally ignore the potentially complementary information conveyed by functional and structural MRIs. Several recent studies propose to employ functional and structural MRIs for MDD analysis, but they usually suffer from significant inter-modality data discrepancy (Fu et al., <xref ref-type="bibr" rid="B16">2015</xref>; Maglanoc et al., <xref ref-type="bibr" rid="B39">2020</xref>; Ge et al., <xref ref-type="bibr" rid="B19">2021</xref>).</p>
<p>In this article, we propose an adaptive multimodal neuroimage integration (<bold>AMNI</bold>) framework for automated MDD detection using functional and structural MRI data. As shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, the proposed AMNI consists of four major components: (1) a <italic>graph convolutional network</italic> (GCN) for extracting feature representations of functional connectivity networks derived from rs-fMRI scans; (2) a <italic>convolutional neural network</italic> (CNN) for extracting feature representations of T1-weighted sMRI scans; (3) a <italic>feature adaptation module</italic> for alleviating inter-modality difference by minimizing a cross-modal maximum mean discrepancy (MMD) loss; and (4) a <italic>feature fusion module</italic> for integrating features of two modalities for classification (<italic>via</italic> Softmax). Experimental results on 533 subjects from the REST-meta-MDD Consortium (Yan et al., <xref ref-type="bibr" rid="B66">2019</xref>) demonstrate the effectiveness of AMNI in MDD detection.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Illustration of the proposed adaptive multimodal neuroimage integration (AMNI) framework, including (1) a graph convolutional network (GCN) for extracting features of functional connectivity networks derived from resting-state functional MRI (rs-fMRI) data, (2) a convolutional neural network (CNN) for extracting features of T1-weighted structural MRI (sMRI) data, (3) a feature adaptation module for alleviating inter-modality difference by minimizing a cross-modal maximum mean discrepancy (MMD) loss, and (4) a feature fusion module for integrating sMRI and fMRI features for classification. MDD, major depressive disorder; HC, healthy control.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0001.tif"/>
</fig>
<p>The major contributions of this work are summarized below:</p>
<list list-type="bullet">
<list-item><p>An adaptive integration framework is developed to fuse functional and structural MRIs for automated MDD diagnosis by taking advantage of the complementary information of the two modalities. This is different from previous approaches that focus on only discovering structural or functional imaging biomarkers for MDD analysis.</p></list-item>
<list-item><p>A feature adaptation strategy is designed to explicitly reduce the inter-modality difference by minimizing a cross-modal maximum mean discrepancy loss to re-calibrate features extracted from two heterogeneous modalities.</p></list-item>
<list-item><p>Extensive experiments on 533 subjects with rs-fMRI and sMRI scans have been performed to validate the effectiveness of the proposed method in MDD detection.</p></list-item>
</list>
<p>The rest of this article is organized as follows. In Section 2, we briefly review the most relevant studies. In Section 3, we first introduce the materials and then present the proposed method as well as implementation details. In Section 4, we introduce the experimental settings and report the experimental results. In Section 5, we investigate the effect of several key components in the proposed method and discuss limitations as well as possible future research directions. We finally conclude this article in Section 6.</p>
</sec>
<sec id="s2">
<title>2. Related Work</title>
<p>In this section, we briefly introduce the most relevant studies on structural and functional brain MRI analysis, as well as multimodal neuroimaging-based diagnosis of brain disorders.</p>
<sec>
<title>2.1. Brain Structural MR Imaging Analysis</title>
<p>Currently, MRI is the most sensitive imaging test of the brain in routine clinical practice. Structural MRIs can non-invasively capture the internal brain structure and atrophy, helping us understand the anatomical brain changes caused by various mental disorders. Conventional sMRI-based MDD analysis is usually performed manually <italic>via</italic> visual assessment (Scheltens et al., <xref ref-type="bibr" rid="B50">1992</xref>), which can be subjective and error-prone. To this end, many machine learning methods (Gao et al., <xref ref-type="bibr" rid="B18">2018</xref>), such as support vector machines (SVM), Gaussian process classifier (GPC), and linear discriminant analysis (LDA), have been used for automated MRI-based MDD diagnosis. However, these methods generally rely on handcrafted MRI features, which may be suboptimal for subsequent analysis and thus significantly limit their practical utility.</p>
<p>In recent years, deep learning methods such as convolutional neural networks (CNNs) have been widely used in the fields of computer vision and medical image analysis (Yue-Hei Ng et al., <xref ref-type="bibr" rid="B70">2015</xref>; Chen et al., <xref ref-type="bibr" rid="B12">2016</xref>; Zhang L. et al., <xref ref-type="bibr" rid="B71">2020</xref>). As a special type of multi-layer neural network, CNN is capable of automatic feature learning, which eliminates the subjectivity in extracting and selecting informative features for specific tasks (Lee et al., <xref ref-type="bibr" rid="B35">2017</xref>). Based on the LeNet5 network, Sarraf and Tofighi (<xref ref-type="bibr" rid="B48">2016</xref>) presented a 2D convolutional neural network that could classify sMRI scan slices for Alzheimer&#x00027;s disease diagnosis. With the development of high-performance computing resources, Hosseini-Asl et al. (<xref ref-type="bibr" rid="B27">2016</xref>) developed a deep neural network that used 3D convolution layers to extract features of 3D medical images for Alzheimer&#x00027;s disease diagnosis. Chakraborty et al. (<xref ref-type="bibr" rid="B11">2020</xref>) developed a 3D CNN architecture for learning intricate patterns in MRI scans for Parkinson&#x00027;s disease diagnosis. Compared with 2D convolution, 3D convolution on the entire MR image is able to capture the rich spatial information, which is essential for disease classification.</p>
</sec>
<sec>
<title>2.2. Brain Functional MR Imaging Analysis</title>
<p>Existing studies have revealed that fMRI can capture large-scale abnormality or dysfunction on functional connectivity networks by measuring the blood-oxygen-level in the brain (Van Den Heuvel and Pol, <xref ref-type="bibr" rid="B58">2010</xref>; Zhang et al., <xref ref-type="bibr" rid="B74">2019</xref>). With fMRI data, we usually construct a functional connectivity network for representing each subject, where each node represents a specific brain region-of-interest (ROI) and each edge denotes the pairwise relationship between ROIs (Honey et al., <xref ref-type="bibr" rid="B26">2009</xref>; Dvornek et al., <xref ref-type="bibr" rid="B14">2017</xref>). By capturing the dependencies between BOLD signals of paired ROIs, functional connectivity networks (FCNs) have been widely used to identify potential neuroimaging biomarkers for mental disorder analysis. Previous studies often extract handcrafted FCN features (e.g., clustering coefficient and node degree) to build prediction/classification models (Guo et al., <xref ref-type="bibr" rid="B22">2021</xref>; Zhang et al., <xref ref-type="bibr" rid="B72">2021</xref>), but the definition of the optimal FCN features highly relies on expert knowledge, so it is often subjective. Extracting effective feature representations of functional connectivity networks is essential for subsequent analysis.</p>
<p>Motivated by breakthroughs of deep learning on grid data, efforts have been made to extend CNNs to graphs, giving rise to spectral graph convolutional networks (GCNs) (Bruna et al., <xref ref-type="bibr" rid="B7">2013</xref>). Recent studies have shown that GCNs, which treat each FCN as a graph, are effective in learning representations of brain functional connectivity networks compared with traditional machine learning algorithms (Parisot et al., <xref ref-type="bibr" rid="B43">2018</xref>; Bai et al., <xref ref-type="bibr" rid="B3">2020</xref>; Yao et al., <xref ref-type="bibr" rid="B68">2021</xref>). For example, Parisot et al. (<xref ref-type="bibr" rid="B43">2018</xref>) proposed a GCN-based method for group-level population diagnosis that exploited the concept of spectral graph convolutions. Yao et al. (<xref ref-type="bibr" rid="B68">2021</xref>) presented a mutual multi-scale triplet GCN model to extract multi-scale feature representations of brain functional connectivity networks. Bai et al. (<xref ref-type="bibr" rid="B3">2020</xref>) developed a backtrackless aligned-spatial GCN model to transitively align vertices between graphs and learn effective features for graph classification. Compared with traditional CNNs that operate on Euclidean data, GCNs generalize convolution operations to non-Euclidean data and help mine topological information of brain connectivity networks.</p>
</sec>
<sec>
<title>2.3. Multimodal Neuroimaging-Based Brain Disease Diagnosis</title>
<p>Previous studies have shown that multimodal neuroimaging data can provide complementary information of individual subjects to improve the performance of computer-aided disease diagnosis (Sui et al., <xref ref-type="bibr" rid="B55">2013</xref>; Calhoun and Sui, <xref ref-type="bibr" rid="B10">2016</xref>; Maglanoc et al., <xref ref-type="bibr" rid="B39">2020</xref>; Guan and Liu, <xref ref-type="bibr" rid="B21">2021</xref>). For example, Sui et al. (<xref ref-type="bibr" rid="B55">2013</xref>) developed a machine learning model to enable fusion of three or more multimodal datasets based on multi-set canonical correlation analysis and joint independent component analysis algorithms. Maglanoc et al. (<xref ref-type="bibr" rid="B39">2020</xref>) used linked independent component analysis to fuse structural and functional MRI features for depression diagnosis. Even though previous studies have yielded promising performance, they often extract sMRI and fMRI features manually, which requires domain-specific knowledge (Shen et al., <xref ref-type="bibr" rid="B51">2017</xref>). Several deep learning models for multimodal medical image fusion have been proposed to employ multimodal neuroimaging data for brain disease diagnosis (Rajalingam and Priya, <xref ref-type="bibr" rid="B46">2018</xref>). However, existing studies usually focus on combining feature representations of multiple modalities and ignore significant inter-modality heterogeneity (Huang et al., <xref ref-type="bibr" rid="B28">2019</xref>). To this end, we propose an adaptive multimodal neuroimage integration (AMNI) framework for automated MDD diagnosis based on resting-state functional MRI and T1-weighted structural MRI data. The proposed method can not only extract high-level feature representations of structural and functional data <italic>via</italic> CNN and GCN, respectively, but also alleviate the heterogeneity between modalities with the help of a unique feature adaptation module.</p>
</sec>
</sec>
<sec sec-type="materials and methods" id="s3">
<title>3. Materials and Methods</title>
<p>In this section, we first introduce the materials and image pre-processing method used in this work, and then present the proposed method and implementation details.</p>
<sec>
<title>3.1. Materials</title>
<sec>
<title>3.1.1. Data Acquisition</title>
<p>Resting-state fMRI and T1-weighted structural MRI data were acquired from 282 MDD subjects and 251 healthy controls (HCs) recruited from Southwest University, an imaging site of the REST-meta-MDD consortium (Yan et al., <xref ref-type="bibr" rid="B66">2019</xref>). Resting-state fMRI scans were acquired on a Siemens scanner with the following parameters: repetition time (TR) &#x0003D; 2,000 <italic>ms</italic>, echo time (TE) &#x0003D; 30 <italic>ms</italic>, flip angle &#x0003D; 90&#x000B0;, slice thickness &#x0003D; 3.0 <italic>mm</italic>, gap &#x0003D; 1.0 <italic>mm</italic>, time points &#x0003D; 242, voxel size &#x0003D; 3.44 &#x000D7; 3.44 &#x000D7; 4.00 <italic>mm</italic><sup>3</sup>. More detailed information can be found online<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. The demographic and clinical information of the studied subjects is summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Demographic and clinical information of subjects from Southwest University [a part of the REST-meta-MDD consortium (Yan et al., <xref ref-type="bibr" rid="B66">2019</xref>)].</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Category</bold></th>
<th valign="top" align="center"><bold>Gender</bold></th>
<th valign="top" align="center"><bold>Age</bold></th>
<th valign="top" align="center"><bold>Education</bold></th>
<th valign="top" align="center"><bold>First period</bold></th>
<th valign="top" align="center"><bold>On medication</bold></th>
<th valign="top" align="center"><bold>Duration of illness</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">MDD</td>
<td valign="top" align="center">99 M</td>
<td valign="top" align="center">38.7 &#x000B1; 13.6</td>
<td valign="top" align="center">10.8 &#x000B1; 3.6</td>
<td valign="top" align="center">209 (Y)/49 (N)</td>
<td valign="top" align="center">124 (Y)/125 (N)</td>
<td valign="top" align="center">50.0 &#x000B1; 65.9</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">183 F</td>
<td/>
<td/>
<td valign="top" align="center">24 (D)</td>
<td valign="top" align="center">33 (D)</td>
<td valign="top" align="center">35 (D)</td>
</tr>
<tr>
<td valign="top" align="left">HC</td>
<td valign="top" align="center">87 M</td>
<td valign="top" align="center">39.6 &#x000B1; 15.8</td>
<td valign="top" align="center">13.0 &#x000B1; 3.9</td>
<td valign="top" align="center">&#x02212;</td>
<td valign="top" align="center">&#x02212;</td>
<td valign="top" align="center">&#x02212;</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">164 F</td>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Values are reported as Mean &#x000B1; Standard deviation. M, Male; F, Female; Y, Yes; N, No; D, Lack of record</italic>.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>3.1.2. Image Pre-processing</title>
<p>The resting-state fMRI and structural T1-weighted MRI scans were pre-processed using the Data Processing Assistant for Resting-State fMRI (DPARSF) software (Yan and Zang, <xref ref-type="bibr" rid="B65">2010</xref>) with a standardized protocol (Yan et al., <xref ref-type="bibr" rid="B64">2016</xref>). For rs-fMRI data, the initial 10 volumes were discarded, and slice-timing correction was performed. Then, the time series of images for each subject were realigned using a six-parameter (rigid body) linear transformation. After realignment, individual T1-weighted images were co-registered to the mean functional image using a 6 degrees-of-freedom linear transformation without re-sampling and then segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). Next, transformations from individual native space to MNI space were computed with the Diffeomorphic Anatomical Registration Through Exponentiated Lie algebra (DARTEL) tool (Ashburner, <xref ref-type="bibr" rid="B2">2007</xref>). After that, the fMRI data were normalized with an EPI template in the MNI space, and resampled to the resolution of 3 &#x000D7; 3 &#x000D7; 3 <italic>mm</italic><sup>3</sup>, followed by spatial smoothing using a 6 <italic>mm</italic> full width at half maximum Gaussian kernel. Note that subjects with poor image quality or excessive head motion (mean framewise-displacement &#x0003E;0.2 <italic>mm</italic>) were excluded from analysis (Jenkinson et al., <xref ref-type="bibr" rid="B29">2002</xref>). Finally, we extracted the band-pass filtered (0.01&#x02212;0.1 <italic>Hz</italic>) mean rs-fMRI time series of 112 pre-defined regions-of-interest (ROIs), covering cortical and subcortical areas defined by the Harvard-Oxford atlas.
Each T1-weighted structural MR image was also segmented into three tissues (i.e., GM, WM, and CSF) and transformed into the MNI space with the DARTEL tool (Ashburner, <xref ref-type="bibr" rid="B2">2007</xref>), resulting in a 3D volume (size: 121 &#x000D7; 145 &#x000D7; 121). Here, we employ the gray matter volume in the MNI space to represent the original sMRI.</p>
</sec>
</sec>
<sec>
<title>3.2. Proposed Method</title>
<p>As illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>, the proposed AMNI consists of four major components: (1) a GCN module to extract features from rs-fMRI, (2) a CNN module to extract features from T1-weighted sMRI, (3) a feature adaptation module to reduce inter-modality discrepancy, and (4) a feature fusion module for classification, with details introduced below.</p>
<sec>
<title>3.2.1. GCN for Functional MRI Feature Learning</title>
<p>Based on resting-state fMRI data, one usually constructs a functional connectivity matrix/network (FCN) for representing each subject, with each node representing a specific brain ROI and each edge denoting the pairwise functional connection/relationship between ROIs (Honey et al., <xref ref-type="bibr" rid="B26">2009</xref>; Dvornek et al., <xref ref-type="bibr" rid="B14">2017</xref>). That is, FCNs help capture the dependencies between BOLD signals of paired ROIs. Considering the fact that FCNs are non-Euclidean data, we treat each functional connectivity network as a specific graph and resort to spectral graph convolutional network (GCN) for FCN feature learning by capturing graph topology information. Previous studies have shown that GCN is effective in learning graph-level representations by gradually aggregating feature vectors of all nodes (Yao et al., <xref ref-type="bibr" rid="B69">2019</xref>). In this work, we aim to learn graph-level representations based on node representations of input FCNs.</p>
<p>(<bold>i</bold>) <bold>Graph Construction</bold>. Denote <italic>N</italic> and <italic>M</italic> as the numbers of ROIs and time points, respectively, where <italic>N</italic> &#x0003D; 112 and <italic>M</italic> &#x0003D; 232 in this work. We assume that the rs-fMRI time-series data for a subject is <inline-formula><mml:math id="M1"><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mspace width="0.3em" class="thinspace"/><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where each element <inline-formula><mml:math id="M2"><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> (<italic>n</italic> &#x0003D; 1, &#x022EF;&#x02009;, <italic>N</italic>) denotes BOLD measurements of the <italic>n</italic>-th ROI at <italic>M</italic> successive time points.</p>
<p>As the simplest and most widely used method, Pearson correlation (PC) is usually used to construct functional connectivity networks from raw rs-fMRI time-series data. Denote <inline-formula><mml:math id="M3"><mml:mi>B</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> as the functional connectivity matrix based on the Pearson correlation algorithm. Each element <italic>b</italic><sub><italic>ij</italic></sub>&#x02208;[&#x02212;1, 1] in <italic>B</italic> represents the Pearson correlation coefficient between the <italic>i</italic>-th and <italic>j</italic>-th ROIs, defined as follows:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M4"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00233;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00233;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00233;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00233;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msqrt><mml:msqrt><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00233;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00233;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x00233;<sub><italic>i</italic></sub> and &#x00233;<sub><italic>j</italic></sub> are the mean vectors corresponding to <inline-formula><mml:math id="M5"><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="M6"><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, respectively, and <italic>M</italic> is the number of time points in the BOLD signal of each brain region.</p>
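<p>As a minimal illustration of Equation (1), the following Python sketch computes the PC matrix from a small set of ROI time series. The function name <monospace>pearson_fc</monospace>, the nested-list data layout, and the toy signals are hypothetical choices made for this example and are not taken from the AMNI implementation.</p>
<preformat>
```python
import math

def pearson_fc(signals):
    """Return the N x N Pearson correlation matrix for N ROI time series.

    signals: list of N lists, each holding M BOLD samples (illustrative layout).
    """
    centered = []
    for y in signals:
        mean = sum(y) / len(y)
        centered.append([v - mean for v in y])
    # Euclidean norm of each centered series (the square-root terms of Eq. 1).
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    n = len(signals)
    fc = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            fc[i][j] = dot / (norms[i] * norms[j])
    return fc

# Toy example with N = 3 ROIs and M = 4 time points.
toy = [[1.0, 2.0, 3.0, 4.0],
       [2.0, 4.0, 6.0, 8.0],
       [4.0, 3.0, 2.0, 1.0]]
B = pearson_fc(toy)
```
</preformat>
<p>In this toy example the first two signals are perfectly correlated and the third is perfectly anti-correlated with the first, so the corresponding entries of the matrix lie at the two ends of the interval [&#x02212;1, 1].</p>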
<p>For each subject, we regard each brain FCN as an undirected graph <italic>G</italic> &#x0003D; {<italic>V, E</italic>}, where <italic>V</italic> &#x0003D; {<italic>v</italic><sub>1</sub>, &#x022EF; &#x02009;, <italic>v</italic><sub><italic>N</italic></sub>} is a set of <italic>N</italic> nodes/ROIs, <italic>E</italic> is the set of edges, and <italic>b</italic><sub><italic>ij</italic></sub>&#x02208;<italic>B</italic> denotes the functional connectivity between a pair of nodes <italic>v</italic><sub><italic>i</italic></sub> and <italic>v</italic><sub><italic>j</italic></sub>. Since spectral GCNs work on adjacency matrices by updating and aggregating node features (Bruna et al., <xref ref-type="bibr" rid="B7">2013</xref>), it is essential to generate an adjacency matrix <italic>A</italic> and a node feature matrix <italic>X</italic> from each graph <italic>G</italic>.</p>
<p>To reduce the influence of noisy/redundant information, we propose to construct a K-Nearest Neighbor (KNN) graph based on each densely connected functional connectivity matrix. Specifically, a KNN graph is generated by keeping, for each node, only the top <italic>k</italic> edges ranked by functional connectivity strength (i.e., PC coefficient). Then, the topology of the graph <italic>G</italic> can be described by the adjacency matrix <inline-formula><mml:math id="M7"><mml:mi>A</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where <italic>a</italic><sub><italic>ij</italic></sub> &#x0003D; 1 if there exists an edge between the <italic>i</italic>-th and the <italic>j</italic>-th ROIs, and <italic>a</italic><sub><italic>ij</italic></sub> &#x0003D; 0 otherwise. In addition, the features of each node are defined by the functional connection weights of the edges connected to it, i.e., by the corresponding row of the functional connectivity matrix. Thus, the node features of the graph <italic>G</italic> can be represented by the node feature matrix <italic>X</italic> &#x0003D; <italic>B</italic>.</p>
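<p>The KNN graph construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name <monospace>knn_adjacency</monospace> is hypothetical, edges are ranked by the raw PC coefficient (ranking by absolute value would be an equally plausible reading of "connectivity strength"), self-connections are excluded, and the result is symmetrized so that the graph remains undirected.</p>
<preformat>
```python
def knn_adjacency(fc, k):
    """Binary adjacency that keeps, for each node, its k strongest connections."""
    n = len(fc)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Assumption: strength is the raw PC coefficient (not its absolute value).
        top = sorted(others, key=lambda j: fc[i][j], reverse=True)[:k]
        for j in top:
            # Assumption: symmetrize so the graph stays undirected.
            adj[i][j] = 1
            adj[j][i] = 1
    return adj

# Toy functional connectivity matrix with N = 4 ROIs.
fc = [[1.0, 0.8, 0.1, -0.3],
      [0.8, 1.0, 0.5, 0.2],
      [0.1, 0.5, 1.0, 0.6],
      [-0.3, 0.2, 0.6, 1.0]]
A = knn_adjacency(fc, k=1)
```
</preformat>
<p>Because of the symmetrization step, a node may end up with more than <italic>k</italic> neighbors; keeping only mutual neighbors would be an alternative design choice.</p>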
<p>(<bold>ii</bold>) <bold>Graph Feature Learning</bold>. In GCN models, the convolution operation on the graph is defined as the multiplication of filters and signals in the Fourier domain. Specifically, a GCN model learns new node representations by calculating the weighted sum of the feature vectors of each central node and its neighboring nodes. Mathematically, the simplest spectral GCN layer (Kipf and Welling, <xref ref-type="bibr" rid="B31">2016</xref>) can be formulated as:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M8"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:msup><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>H</italic><sup><italic>l</italic></sup> is the matrix of activations in the <italic>l</italic>-th layer, and <italic>W</italic><sup><italic>l</italic></sup> is a layer-specific trainable weight matrix.</p>
<p>Here, <inline-formula><mml:math id="M9"><mml:mover accent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msup><mml:mi>A</mml:mi><mml:msup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:math></inline-formula> is the normalized adjacency matrix with self loops, and &#x003C3;(&#x000B7;) is an activation function, such as <italic>ReLU</italic>(&#x000B7;) &#x0003D; <italic>max</italic>(0, &#x000B7;). Furthermore, <italic>D</italic> is the diagonal degree matrix, with the <italic>i</italic>-th diagonal element defined as <inline-formula><mml:math id="M10"><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x02260;</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p>
<p>In the GCN module of our AMNI framework, we stack two graph convolutional layers with the adjacency matrix <italic>A</italic> and the node feature matrix <italic>X</italic> as inputs. The output of this two-layer GCN module is calculated as:</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M11"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>Z</mml:mi><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>L</mml:mi><mml:mi>U</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>L</mml:mi><mml:mi>U</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mi>X</mml:mi><mml:msup><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Note that each of the two graph convolutional layers has 64 neurons.</p>
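<p>To make the two-layer propagation rule concrete, the following plain-Python sketch evaluates it on a toy graph. It is a sketch under assumptions rather than the AMNI implementation: the function names, the toy weights, and the layer widths are hypothetical, and the normalization follows the renormalization of Kipf and Welling, i.e., self loops are added to the adjacency matrix and the degrees are computed from the result.</p>
<preformat>
```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetric normalization D^(-1/2) (A + I) D^(-1/2).

    Assumption: self loops are added here and degrees are taken from A + I.
    """
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def relu(m):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in m]

def gcn_forward(adj, x, w0, w1):
    """Two stacked graph convolutions with ReLU activations."""
    a_norm = normalize_adjacency(adj)
    h = relu(matmul(matmul(a_norm, x), w0))
    return relu(matmul(matmul(a_norm, h), w1))

# Toy graph: 3 nodes on a path, 3 input features, 2 hidden and 2 output units.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W0 = [[0.5, -0.5], [0.5, 0.5], [-0.5, 0.5]]
W1 = [[1.0, 0.0], [0.0, 1.0]]
Z = gcn_forward(A, X, W0, W1)
```
</preformat>
<p>Each row of <monospace>Z</monospace> is the learned representation of one node; in practice the weight matrices would be trained by backpropagation rather than fixed by hand.</p>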
<p>Given that this is a graph classification task, we employ a simple graph pooling strategy (Lee et al., <xref ref-type="bibr" rid="B34">2019</xref>) to generate graph-level FCN representations. To be specific, we employ both global average pooling and global max pooling that aggregate node features to generate new feature representations. The output feature of the graph pooling layer is as follows:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>F</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>||</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo class="qopname">max</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>N</italic> is the number of ROIs, <italic>z</italic><sub><italic>i</italic></sub> is the feature vector of the <italic>i</italic>-th ROI obtained by the graph convolution operation, and || denotes concatenation.</p>
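<p>The readout of Equation (4) can be illustrated as follows. The function name <monospace>graph_readout</monospace> and the toy node features are hypothetical; the sketch only shows how the mean-pooled and max-pooled vectors are concatenated.</p>
<preformat>
```python
def graph_readout(z):
    """Concatenate global average pooling and global max pooling over nodes."""
    n = len(z)
    columns = list(zip(*z))  # one tuple per feature dimension
    mean_part = [sum(col) / n for col in columns]
    max_part = [max(col) for col in columns]
    return mean_part + max_part

# Toy node features: N = 3 nodes with 2 features each.
Z = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]
g_F = graph_readout(Z)
```
</preformat>
<p>The output has twice the dimensionality of the node features, so 64-dimensional node features would yield a 128-dimensional graph-level vector.</p>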
<p>By stacking multiple graph convolution layers and graph pooling layers, GCN can learn higher-order node features from neighboring nodes. In addition, GCN propagates information on a graph structure and gradually aggregates the information of neighboring nodes, which allows us to effectively capture the complex dependencies among ROIs.</p>
</sec>
<sec>
<title>3.2.2. CNN for Structural MRI Feature Learning</title>
<p>In recent years, convolutional neural networks (CNNs) have achieved strong performance in image recognition and classification (Simonyan and Zisserman, <xref ref-type="bibr" rid="B53">2014</xref>; He et al., <xref ref-type="bibr" rid="B23">2016</xref>). Due to the 3D nature of structural MR images (sMRI), it is important to learn feature representations along all three dimensions of volumetric medical data. Considering that 3D convolutional kernels can encode richer spatial information, we adopt a 3D CNN model to extract feature representations of T1-weighted MRIs.</p>
<p>In the AMNI framework, the CNN module consists of four convolution blocks and two fully-connected (FC) layers for local to global sMRI feature extraction. To be specific, each convolution block consists of one convolutional layer, one batch normalization layer, one activation function and one max pooling layer. To capture local patterns, 3D convolution is achieved by convolving a 3D kernel over 3D feature cubes. Formally, the <italic>j</italic>-th feature map in the <italic>i</italic>-th layer, denoted as <italic>v</italic><sub><italic>i, j</italic></sub>, is given by</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>*</mml:mo><mml:msub><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>W</italic><sub><italic>i, j</italic></sub> and <italic>b</italic><sub><italic>i, j</italic></sub> are the kernel weights and the bias for the <italic>j</italic>-th feature map, respectively, <italic>V</italic><sub><italic>i</italic>&#x02212;1</sub> is the set of input feature maps connected to the current layer from the (<italic>i</italic>&#x02212;1)-th layer, &#x0002A; is the convolution operation, and <italic>f</italic> is the non-linear activation function. The size of each convolution filter is 3 &#x000D7; 3 &#x000D7; 3, and the numbers of convolution filters in the four blocks are set to 16, 32, 64, and 128, respectively. In addition, max pooling is applied over each 2 &#x000D7; 2 &#x000D7; 2 region, which reduces the spatial size of the feature maps and the number of parameters, and ReLU is used as the activation function. Batch normalization is used to promote faster convergence and better generalization of the trained network.</p>
<p>For the pooling layer, we use the Global Average Pooling (GAP) operation (Lin et al., <xref ref-type="bibr" rid="B37">2013</xref>), which performs downsampling by computing the mean over the height, width, and depth dimensions of the input. The formula for GAP is as follows:</p>
<disp-formula id="E6"><label>(6)</label><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle="false"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>h</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mstyle displaystyle="false"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>w</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>W</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mstyle displaystyle="false"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>D</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>W</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M15"><mml:msubsup><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the value at position (<italic>h, w, d</italic>) of the <italic>j</italic>-th input feature map, <italic>H</italic>, <italic>W</italic>, and <italic>D</italic> are the height, width, and depth, respectively, and <italic>g</italic><sub><italic>j</italic></sub> is the value obtained from the <italic>j</italic>-th input feature map through GAP. Thus, the sMRI feature <italic>g</italic><sub><italic>S</italic></sub> generated by the CNN is given by:</p>
<disp-formula id="E7"><label>(7)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mspace width="0.3em" class="thinspace"/><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>c</italic> is the number of input feature maps. It can be seen that the GAP layer converts a 4D tensor to a 1-dimensional feature vector, thus significantly reducing the number of network parameters.</p>
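<p>A small Python sketch of Equations (6) and (7) is given below. The nested-list layout (channel, height, width, depth), the function name <monospace>global_average_pool</monospace>, and the toy values are assumptions made for illustration only.</p>
<preformat>
```python
def global_average_pool(feature_maps):
    """Average each H x W x D feature map down to a single value."""
    pooled = []
    for volume in feature_maps:  # one volume per channel
        values = [v for plane in volume for row in plane for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

# Two channels (c = 2), each a 2 x 2 x 2 volume.
maps = [
    [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]],
    [[[0.0, 0.0], [0.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]],
]
g_S = global_average_pool(maps)
```
</preformat>
<p>The result has one entry per channel, regardless of the spatial size of the feature maps.</p>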
<p>The two fully-connected layers have 128 and 64 neurons, respectively. To avoid overfitting, we employ the dropout technique (Srivastava et al., <xref ref-type="bibr" rid="B54">2014</xref>), with a probability of 0.5 after each fully-connected layer. More detailed information about the CNN architecture can be found in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Architecture of the CNN module in the proposed AMNI framework.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Layer</bold></th>
<th valign="top" align="center"><bold>Kernel size</bold></th>
<th valign="top" align="center"><bold>Stride</bold></th>
<th valign="top" align="center"><bold>Output size</bold></th>
<th valign="top" align="center"><bold>Feature volumes</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Input</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">121 &#x000D7;145 &#x000D7;121</td>
<td valign="top" align="center">1</td>
</tr>
<tr>
<td valign="top" align="left">C1</td>
<td valign="top" align="center">3 &#x000D7;3 &#x000D7;3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">121 &#x000D7;145 &#x000D7;121</td>
<td valign="top" align="center">16</td>
</tr>
<tr>
<td valign="top" align="left">M1</td>
<td valign="top" align="center">2 &#x000D7;2 &#x000D7;2</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">60 &#x000D7;72 &#x000D7;60</td>
<td valign="top" align="center">16</td>
</tr>
<tr>
<td valign="top" align="left">C2</td>
<td valign="top" align="center">3 &#x000D7;3 &#x000D7;3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">60 &#x000D7;72 &#x000D7;60</td>
<td valign="top" align="center">32</td>
</tr>
<tr>
<td valign="top" align="left">M2</td>
<td valign="top" align="center">2 &#x000D7;2 &#x000D7;2</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">30 &#x000D7;36 &#x000D7;30</td>
<td valign="top" align="center">32</td>
</tr>
<tr>
<td valign="top" align="left">C3</td>
<td valign="top" align="center">3 &#x000D7;3 &#x000D7;3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">30 &#x000D7;36 &#x000D7;30</td>
<td valign="top" align="center">64</td>
</tr>
<tr>
<td valign="top" align="left">M3</td>
<td valign="top" align="center">2 &#x000D7;2 &#x000D7;2</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">15 &#x000D7;18 &#x000D7;15</td>
<td valign="top" align="center">64</td>
</tr>
<tr>
<td valign="top" align="left">C4</td>
<td valign="top" align="center">3 &#x000D7;3 &#x000D7;3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">15 &#x000D7;18 &#x000D7;15</td>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">M4</td>
<td valign="top" align="center">2 &#x000D7;2 &#x000D7;2</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">7 &#x000D7;9 &#x000D7;7</td>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">GAP</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">1 &#x000D7;1 &#x000D7;1</td>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">FC</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">1 &#x000D7;1 &#x000D7;1</td>
<td valign="top" align="center">64</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Cn, the n-th convolutional layer; Mn, the n-th max pooling layer; GAP, global average pooling; FC, fully-connected layer</italic>.</p>
</table-wrap-foot>
</table-wrap>
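<p>The output sizes listed in Table 2 can be checked with a short calculation. The sketch below assumes that each 3 &#x000D7; 3 &#x000D7; 3 convolution uses stride 1 with a padding of 1, so that it preserves the spatial size (consistent with the C1 to C4 rows), and that each 2 &#x000D7; 2 &#x000D7; 2 max pooling with stride 2 rounds odd sizes down. The function name <monospace>conv_block_sizes</monospace> is hypothetical.</p>
<preformat>
```python
def conv_block_sizes(size, num_blocks=4):
    """Spatial size after each 2x2x2, stride-2 max pooling step."""
    sizes = []
    for _ in range(num_blocks):
        # Assumption: the 3x3x3 convolution uses stride 1 and padding 1,
        # so it leaves the spatial size unchanged.
        size = tuple(s // 2 for s in size)  # pooling floors odd sizes
        sizes.append(size)
    return sizes

sizes = conv_block_sizes((121, 145, 121))
```
</preformat>
<p>Under these assumptions the four pooling steps reproduce the M1 to M4 rows of the table, ending at 7 &#x000D7; 9 &#x000D7; 7 before global average pooling.</p>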
</sec>
<sec>
<title>3.2.3. Feature Adaptation Module</title>
<p>Due to the heterogeneous nature of multimodal data, it is necessary to reduce the discrepancy between feature representations of different modalities before feature fusion. Inspired by existing studies on domain adaptation (Tzeng et al., <xref ref-type="bibr" rid="B57">2014</xref>), we employ a cross-modal loss based on maximum mean discrepancy (MMD) (Gretton et al., <xref ref-type="bibr" rid="B20">2012</xref>) to re-calibrate channel-wise features extracted from sMRI and fMRI. Denote <italic>G</italic><sub><italic>F</italic></sub> and <italic>G</italic><sub><italic>S</italic></sub> as feature representations of fMRI and sMRI, respectively. The cross-modal MMD loss <italic>L</italic><sub><italic>M</italic></sub> is formulated as follows:</p>
<disp-formula id="E8"><label>(8)</label><mml:math id="M17"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mi>L</mml:mi><mml:mi>M</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi>M</mml:mi><mml:mi>M</mml:mi><mml:mi>D</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>G</mml:mi><mml:mi>F</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>G</mml:mi><mml:mi>S</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mrow><mml:mo>&#x0007C;</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi>G</mml:mi><mml:mi>F</mml:mi></mml:msub><mml:mo>&#x0007C;</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi>F</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mi>G</mml:mi><mml:mi>F</mml:mi></mml:msub></mml:mrow></mml:munder><mml:mi>&#x003D5;</mml:mi></mml:mstyle><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi>F</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi>G</mml:mi><mml:mi>S</mml:mi></mml:msub><mml:mo>&#x0007C;</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi>S</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mi>G</mml:mi><mml:mi>S</mml:mi></mml:msub></mml:mrow></mml:munder><mml:mi>&#x003D5;</mml:mi></mml:mstyle><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi>S</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:mo>&#x0007C;</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003D5;(&#x000B7;) denotes the feature map associated with the kernel map, and <italic>g</italic><sub><italic>F</italic></sub> and <italic>g</italic><sub><italic>S</italic></sub> are elements in <italic>G</italic><sub><italic>F</italic></sub> and <italic>G</italic><sub><italic>S</italic></sub>, respectively. During model training, the cross-modal MMD loss will be used as a regularization term to penalize heterogeneity of the features between the two modalities.</p>
<p>As shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, this cross-modal MMD loss is applied to features from two fully-connected layers in the proposed CNN and GCN modules. This would enable the feature adaptation module to learn shared and aligned information across modalities by minimizing the distribution difference between two feature representations.</p>
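<p>For intuition, Equation (8) can be evaluated in closed form when the feature map &#x003D5; is taken to be the identity, which corresponds to a linear kernel. This choice is an assumption made only for this sketch, since the kernel used in AMNI is not specified here; the function name <monospace>linear_mmd</monospace> and the toy features are likewise hypothetical.</p>
<preformat>
```python
import math

def linear_mmd(g_f, g_s):
    """Distance between the mean embeddings of two feature sets.

    Assumption: phi is the identity map (linear kernel).
    """
    dim = len(g_f[0])
    mean_f = [sum(v[d] for v in g_f) / len(g_f) for d in range(dim)]
    mean_s = [sum(v[d] for v in g_s) / len(g_s) for d in range(dim)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mean_f, mean_s)))

G_F = [[1.0, 2.0], [3.0, 4.0]]  # toy fMRI features, one row per subject
G_S = [[2.0, 0.0], [2.0, 2.0]]  # toy sMRI features, one row per subject
L_M = linear_mmd(G_F, G_S)
```
</preformat>
<p>The loss is zero when the two modalities have identical mean embeddings and grows as the means drift apart; a non-linear kernel would additionally compare higher-order statistics.</p>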
</sec>
<sec>
<title>3.2.4. Feature Fusion Module</title>
<p>To enable our AMNI method to capture the complementary information provided by functional and structural MRIs, we also design a feature fusion module for classification/prediction.</p>
<p>Assuming that <italic>F</italic><sub>1</sub> and <italic>F</italic><sub>2</sub> are the two feature representations obtained by the feature adaptation module, we first concatenate them to obtain a new representation. The new representation <italic>F</italic> can be described as follows:</p>
<disp-formula id="E9"><label>(9)</label><mml:math id="M18"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>F</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>After concatenation, the obtained new representation is fed to two fully-connected layers (with 64 and 2 neurons, respectively), and the learned features are further fed into a Softmax layer for classification.</p>
<p>During the training stage, we use the cross-entropy loss function to optimize the parameters in our AMNI model. The classification loss <italic>L</italic><sub><italic>C</italic></sub> is defined as:</p>
<disp-formula id="E10"><label>(10)</label><mml:math id="M19"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>N</italic> here denotes the number of samples, and <italic>y</italic><sub><italic>i</italic></sub> is the true label of the <italic>i</italic>-th sample, with 1 representing an MDD patient and 0 a healthy control. In addition, <italic>p</italic><sub><italic>i</italic></sub> is the predicted probability that the <italic>i</italic>-th sample belongs to the MDD category.</p>
<p>In our model, we aim to minimize not only the classification loss, but also the cross-modal loss to reduce the inter-modality difference. Hence, the total loss function <italic>L</italic> of the proposed AMNI is defined as follows:</p>
<disp-formula id="E11"><label>(11)</label><mml:math id="M20"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mi>&#x003BB;</mml:mi><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003BB; is a hyperparameter that balances the contributions of the two terms in Equation (11).</p>
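<p>Equations (10) and (11) can be combined in a short Python sketch. The function names and toy inputs are hypothetical, the cross-modal term is passed in as a precomputed number, and the default weight of 0.01 mirrors the value reported in the implementation details.</p>
<preformat>
```python
import math

def cross_entropy(labels, probs):
    """Binary cross-entropy averaged over samples."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)

def total_loss(labels, probs, mmd_value, lam=0.01):
    """Classification loss plus the weighted cross-modal loss."""
    return cross_entropy(labels, probs) + lam * mmd_value

labels = [1, 0]     # 1 = MDD patient, 0 = healthy control
probs = [0.5, 0.5]  # predicted probability of the MDD class
loss = total_loss(labels, probs, mmd_value=2.0)
```
</preformat>
<p>A practical implementation would clip the probabilities away from 0 and 1 before taking logarithms, or compute the loss directly from the logits, to avoid numerical problems.</p>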
</sec>
</sec>
<sec>
<title>3.3. Implementation Details</title>
<p>We optimize the proposed AMNI model <italic>via</italic> the Adam (Kingma and Ba, <xref ref-type="bibr" rid="B30">2014</xref>) algorithm, with a learning rate of 0.0001, a weight decay rate of 0.0015, 100 training epochs, and a mini-batch size of 16. The proposed model is implemented in PyTorch (Paszke et al., <xref ref-type="bibr" rid="B44">2017</xref>) and trained on a single GPU (NVIDIA Quadro RTX 6000 with 24 GB memory). The hyperparameter &#x003BB; in Equation (11) is empirically set to 0.01; its influence is investigated experimentally in Section 5.</p>
</sec>
</sec>
<sec id="s4">
<title>4. Experiments</title>
<p>In this section, we introduce experimental settings and several competing methods, present the experimental results, and visualize feature distributions of different methods.</p>
<sec>
<title>4.1. Experimental Settings</title>
<p>We randomly select 80% of the samples as training data and use the remaining 20% as test data. To avoid bias introduced by random partitioning, we repeat the random partition procedure 10 times independently and report the mean and standard deviation of the results. Eight metrics are used to evaluate the performance of different methods in the task of MDD detection (i.e., MDD vs. HC classification), including accuracy (ACC), sensitivity (SEN), specificity (SPE), balanced accuracy (BAC), positive predictive value (PPV), negative predictive value (NPV), F1-Score (F1), and area under the receiver operating characteristic curve (AUC).</p>
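<p>For reference, the seven threshold-based metrics can be derived from the confusion-matrix counts as sketched below, using their standard definitions with MDD as the positive class. The function name and the example counts are hypothetical and do not correspond to any reported result. AUC is omitted because it requires the continuous prediction scores rather than counts.</p>
<preformat>
```python
def classification_metrics(tp, fp, tn, fn):
    """Threshold-based metrics from confusion-matrix counts (MDD = positive)."""
    sen = tp / (tp + fn)  # sensitivity (recall)
    spe = tn / (tn + fp)  # specificity
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    npv = tn / (tn + fn)  # negative predictive value
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "SEN": sen,
        "SPE": spe,
        "BAC": (sen + spe) / 2,
        "PPV": ppv,
        "NPV": npv,
        "F1": 2 * ppv * sen / (ppv + sen),
    }

# Made-up counts for illustration only.
m = classification_metrics(tp=8, fp=2, tn=6, fn=4)
```
</preformat>
<p>In the evaluation protocol described above, such metrics would be computed once per random partition and then summarized by their mean and standard deviation over the 10 repetitions.</p>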
</sec>
<sec>
<title>4.2. Methods for Comparison</title>
<p>In this work, we compare the proposed AMNI method with six traditional machine learning methods and three popular deep learning methods. More details can be found below.</p>
<p>(1) <bold>PCA&#x0002B;SVM-s</bold>: The PCA&#x0002B;SVM-s method only uses sMRI data. The 3D image of the whole brain is down-sampled from 121 &#x000D7; 145 &#x000D7; 121 to 61 &#x000D7; 73 &#x000D7; 61, and further flattened into a vectorized feature representation for each subject. We use principal component analysis (PCA) (Wold et al., <xref ref-type="bibr" rid="B62">1987</xref>) by keeping the top 32 principal components to reduce feature dimension based on the above feature representations of all subjects. Finally, the support vector machine (SVM) with Radial Basis Function (RBF) kernel is employed for classification.</p>
<p>(2) <bold>EC&#x0002B;SVM</bold>: The EC&#x0002B;SVM method uses rs-fMRI data. Similar to our AMNI, we first construct a functional connectivity matrix based on Pearson correlation coefficient for each subject. We then extract eigenvector centralities (EC) (Bonacich, <xref ref-type="bibr" rid="B5">2007</xref>), which measure a node&#x00027;s importance while giving consideration to the importance of its neighbors in the FC network, as features of the FCN and feed these 112-dimensional features into an SVM classifier with RBF kernel for disease detection.</p>
<p>(3) <bold>DC&#x0002B;SVM</bold>: Similar to EC&#x0002B;SVM, the DC&#x0002B;SVM method first constructs an FCN based on Pearson correlation coefficient for each subject, and then extracts degree centrality (DC) (Nieminen, <xref ref-type="bibr" rid="B40">1974</xref>) as FCN features by measuring node importance based on the number of links incident upon a node. The 112-dimensional DC features are finally fed into an SVM for classification.</p>
<p>(4) <bold>CC&#x0002B;SVM</bold>: Similar to EC/DC&#x0002B;SVM, this method extracts the local clustering coefficient (CC) (Wee et al., <xref ref-type="bibr" rid="B61">2012</xref>) to measure the degree of clustering of each node in each FCN. The 112-dimensional CC features are fed into an SVM for classification.</p>
<p>(5) <bold>PCA&#x0002B;SVM-f</bold>: In the PCA&#x0002B;SVM-f method, the upper triangle of a FC matrix is flattened into a vector for each subject after the FC matrix is constructed. Then, we use PCA by keeping the top 32 principal components to reduce feature dimension based on the above feature representations of all subjects. Finally, an SVM is used for classification.</p>
<p>(6) <bold>PP&#x0002B;SVM</bold>: In this method, we integrate rs-fMRI and sMRI features for classification based on SVM. Specifically, we first employ PCA&#x0002B;SVM-s and PCA&#x0002B;SVM-f to extract features from structural and functional MRIs, respectively. Then, we concatenate features of these two modalities for the same subject, followed by an SVM for classification.</p>
<p>(7) <bold>2DCNN</bold>: In this method, we employ the original FC matrix of each subject as input of a CNN model (LeCun et al., <xref ref-type="bibr" rid="B33">1989</xref>). Specifically, this CNN contains three convolutional layers and two fully-connected layers. Each convolutional layer is followed by batch normalization and ReLU activation. The channel numbers for the three convolutional layers are 4, 8, and 8, respectively, and the corresponding convolution kernel sizes are 3 &#x000D7; 3, 5 &#x000D7; 5, and 7 &#x000D7; 7, respectively. The two fully-connected layers contain 4,096 and 2 neurons, respectively.</p>
<p>(8) <bold>ST-GCN</bold>: We also compare our method with the spatio-temporal graph convolutional network (ST-GCN), a state-of-the-art method for modeling spatio-temporal dependency of fMRI data (Gadgil et al., <xref ref-type="bibr" rid="B17">2020</xref>). Specifically, the ST-GCN comprises two layers of spatio-temporal graph convolution (ST-GC) units, global average pooling and a fully connected layer. Note that each ST-GC layer produces 64-channel outputs with the temporal kernel size of 11, a stride of 1, and a dropout rate of 0.5.</p>
<p>(9) <bold>3DCNN&#x0002B;2DCNN</bold>: In this method, we employ 3DCNN and 2DCNN to extract features from sMRI and fMRI, respectively. We then concatenate features learned from 3DCNN and 2DCNN, and feed the concatenated features to a fully-connected layer and the softmax layer for classification.</p>
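<p>To make the handcrafted FCN features used by several of the above baselines concrete, the following sketch computes degree centrality, the local clustering coefficient, and the flattened upper triangle on a toy graph. It assumes a binary, symmetric adjacency matrix without self-loops and excludes the diagonal when flattening; how the FC matrices were binarized for these baselines is not specified here, and eigenvector centrality is omitted. Under these assumptions, a 112-ROI matrix would flatten to 112 &#x000D7; 111 / 2 = 6,216 entries.</p>
<preformat>
```python
def upper_triangle(matrix):
    """Flatten the strict upper triangle (row index below column index)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def degree_centrality(adj):
    """Number of links incident upon each node of a binary graph."""
    return [sum(row) for row in adj]


def clustering_coefficient(adj):
    """Fraction of existing links among the neighbors of each node."""
    n = len(adj)
    coeffs = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] and j != i]
        k = len(nbrs)
        links = sum(adj[a][b] for x, a in enumerate(nbrs) for b in nbrs[x + 1:])
        possible = k * (k - 1) / 2
        # Nodes with fewer than two neighbors get a coefficient of zero.
        coeffs.append(links / possible if k >= 2 else 0.0)
    return coeffs


# Toy graph with edges 0-1, 0-2, 1-2, and 2-3 (illustrative only).
toy_adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
dc = degree_centrality(toy_adj)
cc = clustering_coefficient(toy_adj)
flat = upper_triangle(toy_adj)
```
</preformat>
<p>Each centrality function returns one value per node, which matches the 112-dimensional feature vectors described for a 112-ROI network.</p>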
</sec>
<sec>
<title>4.3. Experimental Results</title>
<p>The quantitative results of the proposed AMNI and nine competing methods in the task of MDD vs. HC classification are reported in <xref ref-type="table" rid="T3">Table 3</xref>. In <xref ref-type="fig" rid="F2">Figures 2A,B</xref>, we also show ROC curves of different methods. From <xref ref-type="table" rid="T3">Table 3</xref> and <xref ref-type="fig" rid="F2">Figures 2A,B</xref>, we have the following interesting observations.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Classification results in terms of &#x0201C;mean (standard deviation)&#x0201D; achieved by ten methods in MDD vs. HC classification, with best results shown in bold.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Data</bold></th>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>BAC</bold></th>
<th valign="top" align="center"><bold>PPV</bold></th>
<th valign="top" align="center"><bold>NPV</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">PCA&#x0002B;SVM-s</td>
<td valign="top" align="center">S</td>
<td valign="top" align="center">0.566 (0.011)</td>
<td valign="top" align="center">0.669 (0.021)</td>
<td valign="top" align="center">0.456 (0.007)</td>
<td valign="top" align="center">0.563 (0.010)</td>
<td valign="top" align="center">0.580 (0.006)</td>
<td valign="top" align="center">0.553 (0.017)</td>
<td valign="top" align="center">0.618 (0.013)</td>
<td valign="top" align="center">0.591 (0.008)</td>
</tr>
<tr>
<td valign="top" align="left">EC&#x0002B;SVM</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">0.560 (0.014)</td>
<td valign="top" align="center">0.651 (0.009)</td>
<td valign="top" align="center">0.462 (0.029)</td>
<td valign="top" align="center">0.557 (0.015)</td>
<td valign="top" align="center">0.577 (0.013)</td>
<td valign="top" align="center">0.539 (0.018)</td>
<td valign="top" align="center">0.609 (0.009)</td>
<td valign="top" align="center">0.586 (0.019)</td>
</tr>
<tr>
<td valign="top" align="left">CC&#x0002B;SVM</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">0.574 (0.007)</td>
<td valign="top" align="center">0.674 (0.018)</td>
<td valign="top" align="center">0.470 (0.014)</td>
<td valign="top" align="center">0.572 (0.006)</td>
<td valign="top" align="center">0.589 (0.005)</td>
<td valign="top" align="center">0.562 (0.011)</td>
<td valign="top" align="center">0.625 (0.009)</td>
<td valign="top" align="center">0.597 (0.014)</td>
</tr>
<tr>
<td valign="top" align="left">DC&#x0002B;SVM</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">0.578 (0.014)</td>
<td valign="top" align="center">0.676 (0.019)</td>
<td valign="top" align="center">0.477 (0.016)</td>
<td valign="top" align="center">0.577 (0.017)</td>
<td valign="top" align="center">0.593 (0.015)</td>
<td valign="top" align="center">0.568 (0.021)</td>
<td valign="top" align="center">0.627 (0.014)</td>
<td valign="top" align="center">0.605 (0.015)</td>
</tr>
<tr>
<td valign="top" align="left">PCA&#x0002B;SVM-f</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">0.570 (0.011)</td>
<td valign="top" align="center">0.653 (0.014)</td>
<td valign="top" align="center">0.483 (0.019)</td>
<td valign="top" align="center">0.568 (0.012)</td>
<td valign="top" align="center">0.588 (0.010)</td>
<td valign="top" align="center">0.554 (0.016)</td>
<td valign="top" align="center">0.614 (0.009)</td>
<td valign="top" align="center">0.602 (0.013)</td>
</tr>
<tr style="border-bottom: thin solid #000000;">
<td valign="top" align="left">PP&#x0002B;SVM</td>
<td valign="top" align="center">SF</td>
<td valign="top" align="center">0.593 (0.026)</td>
<td valign="top" align="center">0.675 (0.022)</td>
<td valign="top" align="center">0.502 (0.036)</td>
<td valign="top" align="center">0.588 (0.027)</td>
<td valign="top" align="center">0.605 (0.026)</td>
<td valign="top" align="center">0.578 (0.030)</td>
<td valign="top" align="center">0.636 (0.022)</td>
<td valign="top" align="center">0.631 (0.027)</td>
</tr> <tr>
<td valign="top" align="left">2DCNN</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">0.613 (0.013)</td>
<td valign="top" align="center">0.670 (0.022)</td>
<td valign="top" align="center">0.551 (0.024)</td>
<td valign="top" align="center">0.611 (0.013)</td>
<td valign="top" align="center">0.628 (0.013)</td>
<td valign="top" align="center">0.599 (0.016)</td>
<td valign="top" align="center">0.643 (0.014)</td>
<td valign="top" align="center">0.645 (0.013)</td>
</tr>
<tr>
<td valign="top" align="left">ST-GCN</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">0.583 (0.022)</td>
<td valign="top" align="center">0.616 (0.027)</td>
<td valign="top" align="center">0.544 (0.026)</td>
<td valign="top" align="center">0.580 (0.022)</td>
<td valign="top" align="center">0.612 (0.015)</td>
<td valign="top" align="center">0.548 (0.037)</td>
<td valign="top" align="center">0.614 (0.018)</td>
<td valign="top" align="center">0.591 (0.008)</td>
</tr>
<tr style="border-bottom: thin solid #000000;">
<td valign="top" align="left">3DCNN&#x0002B;2DCNN</td>
<td valign="top" align="center">SF</td>
<td valign="top" align="center">0.632 (0.028)</td>
<td valign="top" align="center">0.667 (0.022)</td>
<td valign="top" align="center">0.593 (0.043)</td>
<td valign="top" align="center">0.630 (0.029)</td>
<td valign="top" align="center"><bold>0.649 (0.034)</bold></td>
<td valign="top" align="center">0.617 (0.041)</td>
<td valign="top" align="center">0.656 (0.026)</td>
<td valign="top" align="center">0.655 (0.013)</td>
</tr> <tr>
<td valign="top" align="left">AMNI (Ours)</td>
<td valign="top" align="center">SF</td>
<td valign="top" align="center"><bold>0.650 (0.016)</bold></td>
<td valign="top" align="center"><bold>0.694 (0.068)</bold></td>
<td valign="top" align="center"><bold>0.609 (0.056)</bold></td>
<td valign="top" align="center"><bold>0.651 (0.016)</bold></td>
<td valign="top" align="center">0.640 (0.031)</td>
<td valign="top" align="center"><bold>0.667 (0.055)</bold></td>
<td valign="top" align="center"><bold>0.663 (0.021)</bold></td>
<td valign="top" align="center"><bold>0.665 (0.017)</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>S, sMRI; F, fMRI; SF, sMRI&#x0002B;fMRI</italic>.</p>
</table-wrap-foot>
</table-wrap>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>ROC curves and related AUC values achieved by different methods in MDD vs. HC classification. <bold>(A)</bold> AMNI vs. six conventional methods. <bold>(B)</bold> AMNI vs. three deep learning methods. <bold>(C)</bold> AMNI vs. its three variants.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0002.tif"/>
</fig>
<p><italic>First</italic>, our AMNI and two deep learning methods (i.e., 2DCNN and 3DCNN&#x0002B;2DCNN) generally achieve better performance in terms of the eight metrics, compared with the six traditional machine learning methods. For example, the AMNI improves ACC by 5.7 percentage points (0.650 vs. 0.593) over the best traditional machine learning method (i.e., PP&#x0002B;SVM) in MDD detection. These results suggest that deep learning methods that can learn diagnosis-oriented neuroimage features are more effective in MDD detection than traditional machine learning methods that rely on handcrafted features. <italic>Second</italic>, three multimodal methods (i.e., PP&#x0002B;SVM, 3DCNN&#x0002B;2DCNN, and AMNI) generally outperform their single-modality counterparts (i.e., PCA&#x0002B;SVM-s, PCA&#x0002B;SVM-f, and 2DCNN). For instance, both our AMNI and 3DCNN&#x0002B;2DCNN methods that integrate sMRI and fMRI data are superior to 2DCNN, which only uses functional data. This implies that taking advantage of multimodal MRIs (as we do in this work) helps promote the diagnosis performance, thanks to the complementary information provided by functional and structural MRIs. Furthermore, our proposed AMNI achieves the best performance on seven of the eight metrics (all except PPV) among the ten methods in <xref ref-type="table" rid="T3">Table 3</xref>. These results imply that adaptive integration of multimodal neuroimages helps boost the performance of MDD identification.</p>
</sec>
<sec>
<title>4.4. Statistical Significance Analysis</title>
<p>We further test whether the predicted probabilities on the test data differ between our model and each of eight competing methods using a paired-sample <italic>t</italic>-test. Denote <italic>u</italic><sub>1</sub> and <italic>u</italic><sub>2</sub> as the population means of the predicted probabilities from our AMNI and one competing method, respectively. The hypotheses can be expressed as follows:</p>
<disp-formula id="E12"><label>(12)</label><mml:math id="M21"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:msub><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:msub><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>&#x02260;</mml:mo><mml:msub><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>H</italic><sub>0</sub> is the null hypothesis, meaning that our model and the competing method do not differ significantly, and <italic>H</italic><sub>1</sub> is the alternative hypothesis, meaning that our model and the competing method differ significantly. The test statistic for the paired-sample <italic>t</italic>-test is as follows:</p>
<disp-formula id="E13"><label>(13)</label><mml:math id="M22"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mtext>diff</mml:mtext></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mtext>diff</mml:mtext></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M23"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mtext>diff</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the sample mean of the differences, <italic>s</italic><sub>diff</sub> is the sample standard deviation of the differences, and <italic>n</italic> is the sample size (i.e., the number of pairs). The <italic>p</italic>-values that correspond to the test statistic <italic>t</italic> are shown in <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Results of statistical significance analysis between the proposed AMNI and eight competing methods.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Pairwise comparison</bold></th>
<th valign="top" align="center"><bold>p-value</bold></th>
<th valign="top" align="center"><bold>p &#x0003C; 0.05</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">AMNI vs. PCA&#x0002B;SVM-s</td>
<td valign="top" align="center">3.40 &#x000D7;10<sup>&#x02212;4</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. EC&#x0002B;SVM</td>
<td valign="top" align="center">3.93 &#x000D7;10<sup>&#x02212;4</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. CC&#x0002B;SVM</td>
<td valign="top" align="center">3.16 &#x000D7;10<sup>&#x02212;4</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. DC&#x0002B;SVM</td>
<td valign="top" align="center">2.43 &#x000D7;10<sup>&#x02212;4</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. PCA&#x0002B;SVM-f</td>
<td valign="top" align="center">1.01 &#x000D7;10<sup>&#x02212;5</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. PP&#x0002B;SVM</td>
<td valign="top" align="center">2.71 &#x000D7;10<sup>&#x02212;5</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. 2DCNN</td>
<td valign="top" align="center">9.48 &#x000D7;10<sup>&#x02212;3</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
<tr>
<td valign="top" align="left">AMNI vs. 3DCNN&#x0002B;2DCNN</td>
<td valign="top" align="center">1.07 &#x000D7;10<sup>&#x02212;3</sup></td>
<td valign="top" align="center">Yes</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="T4">Table 4</xref>, all obtained <italic>p</italic>-values are less than our chosen significance level (i.e., 0.05). Therefore, <italic>H</italic><sub>0</sub> is rejected, which means that our AMNI method differs significantly from each of the eight competing methods.</p>
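<p>The test statistic in Equation (13) can be sketched as follows. The function name and the example probabilities are hypothetical; converting <italic>t</italic> into a <italic>p</italic>-value additionally requires the Student <italic>t</italic> distribution with <italic>n</italic> &#x02212; 1 degrees of freedom, which is not part of the Python standard library (a statistics package such as SciPy would normally be used for that step).</p>
<preformat>
```python
import math
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """Equation (13): t = mean(diff) / (s_diff / sqrt(n))."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical predicted probabilities of two models on four test subjects.
probs_a = [0.9, 0.8, 0.7, 0.6]
probs_b = [0.8, 0.6, 0.6, 0.4]
t_value = paired_t_statistic(probs_a, probs_b)
```
</preformat>
<p>In this toy example the differences are 0.1, 0.2, 0.1, and 0.2, so the mean difference is 0.15 and <italic>t</italic> is roughly 5.2.</p>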
</sec>
<sec>
<title>4.5. Feature Visualization</title>
<p>In <xref ref-type="fig" rid="F3">Figure 3</xref>, we visualize the data distributions of features derived from two multimodal methods (i.e., PP&#x0002B;SVM and AMNI) <italic>via</italic> t-SNE (Van der Maaten and Hinton, <xref ref-type="bibr" rid="B59">2008</xref>). Note that the features of PP&#x0002B;SVM are generated by concatenating handcrafted features from two modalities, while the features of our AMNI are extracted based on an end-to-end deep learning model (see <xref ref-type="fig" rid="F1">Figure 1</xref>). As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, the features of the two categories (i.e., MDD and HC) generated by our AMNI method are more clearly separated, while the gap between their feature distributions is not evident for the PP&#x0002B;SVM method. This may indicate that our AMNI can learn more discriminative features for MDD detection by explicitly reducing the inter-modality discrepancy, compared with the traditional PP&#x0002B;SVM method.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Visualization of feature distributions from the PP&#x0002B;SVM and the proposed AMNI models via t-SNE (Van der Maaten and Hinton, <xref ref-type="bibr" rid="B59">2008</xref>). The horizontal and vertical axes denote two dimensions after feature mapping. <bold>(A)</bold> Distribution of features derived from PP&#x0002B;SVM. <bold>(B)</bold> Distribution of features derived from AMNI.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0003.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion" id="s5">
<title>5. Discussion</title>
<sec>
<title>5.1. Ablation Study</title>
<p>To evaluate the effectiveness of each component in the proposed AMNI, we further compare AMNI with its three variants: (1) <bold>AMNI-s</bold> that only uses the CNN branch and the feature fusion module of AMNI, without considering functional MRI; (2) <bold>AMNI-f</bold> that only uses the GCN branch and the feature fusion module of AMNI, without considering structural MRI; and (3) <bold>AMNI-w/oMMD</bold> that directly feeds concatenated fMRI and sMRI features (<italic>via</italic> GCN and CNN modules, respectively) into the feature fusion module for classification, without using the proposed feature adaptation module. The experimental results are reported in <xref ref-type="fig" rid="F4">Figures 4</xref>, <xref ref-type="fig" rid="F2">2C</xref>.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Performance of our AMNI and its three variants in the task of MDD vs. HC classification, with best results shown in bold.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0004.tif"/>
</fig>
<p>It can be seen from <xref ref-type="fig" rid="F4">Figure 4</xref> that two multimodal methods (i.e., AMNI-w/oMMD and AMNI) generally outperform the single-modality methods (i.e., AMNI-s and AMNI-f). This further suggests that multimodal data can provide complementary information to help boost the performance of MDD identification. Besides, our AMNI achieves consistently better performance compared with AMNI-w/oMMD, which ignores the heterogeneity between the two modalities. These results further validate the effectiveness of the proposed feature adaptation module in alleviating the inter-modality discrepancy. In addition, <xref ref-type="fig" rid="F2">Figure 2C</xref> suggests that our proposed AMNI achieves good ROC performance and the best AUC value compared with its three variants.</p>
</sec>
<sec>
<title>5.2. Influence of Hyperparameter</title>
<p>The hyperparameter &#x003BB; in Equation (11) is used to tune the contribution of the proposed feature adaptation module for re-calibrating feature distributions of two modalities. We now report the classification accuracy of the proposed AMNI with different values of &#x003BB; in <xref ref-type="fig" rid="F5">Figure 5</xref>. As shown in <xref ref-type="fig" rid="F5">Figure 5</xref>, our AMNI achieves the best performance with &#x003BB; &#x0003D; 0.01, whereas an overly large value (e.g., &#x003BB; &#x0003D; 1) yields worse performance. A possible reason is that focusing too much on reducing the differences between modalities (with a large &#x003BB;) may lose the specific and unique information of each modality, thereby degrading the learning performance.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Accuracy achieved by the proposed AMNI method with different values of &#x003BB; in Equation (11) in the task of MDD vs. HC classification.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0005.tif"/>
</fig>
</sec>
<sec>
<title>5.3. Influence of Graph Construction Strategy</title>
<p>In the main experiment, we build a KNN graph to generate an adjacency matrix for each FCN. To investigate the influence of the use of different graph construction strategies, besides KNN, we also construct a fully-connected graph and a threshold graph to generate the adjacency matrix, respectively. For the fully-connected graph, we directly take <italic>A</italic> &#x0003D; (|<italic>w</italic><sub><italic>ij</italic></sub>|) as the adjacency matrix, which is an edge-weighted graph. For the threshold graph, we generate the adjacency matrix <italic>A</italic> by binarizing the FC matrix <italic>B</italic> to regulate the sparsity of the graph. Thus, the adjacency matrix can be described as <inline-formula><mml:math id="M24"><mml:mi>A</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where <italic>a</italic><sub><italic>ij</italic></sub> &#x0003D; 1 if the connection coefficient between <italic>i</italic>-th and <italic>j</italic>-th ROI is greater than a threshold <italic>q</italic>; and <italic>a</italic><sub><italic>ij</italic></sub> &#x0003D; 0, otherwise. The threshold <italic>q</italic> is set as 0.2 here. The experimental results of our AMNI with three different graph construction strategies are reported in <xref ref-type="fig" rid="F6">Figure 6</xref>.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Results of the proposed AMNI based on three different graph construction methods (i.e., fully-connected graph, threshold graph, and KNN graph) in the task of MDD vs. HC classification, with best results shown in bold.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0006.tif"/>
</fig>
<p>As can be seen from <xref ref-type="fig" rid="F6">Figure 6</xref>, our AMNI model based on KNN graph outperforms its two variants that use fully-connected graph and threshold graph. The underlying reason could be that KNN graph can preserve node-centralized local topology information while removing noisy/redundant information in graph (Ktena et al., <xref ref-type="bibr" rid="B32">2018</xref>; Yao et al., <xref ref-type="bibr" rid="B68">2021</xref>).</p>
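<p>The three graph construction strategies compared above can be sketched on a toy FC matrix as follows. The fully-connected and threshold variants follow the definitions in the text literally (so the diagonal of the threshold graph is 1 whenever the self-correlation exceeds <italic>q</italic>). The KNN variant is only an assumption for illustration: it ranks neighbors by absolute correlation, excludes self-loops, and symmetrizes the result, which may differ from the construction used in the main experiment. All function names and the toy matrix are hypothetical.</p>
<preformat>
```python
def fully_connected_graph(fc):
    """Edge-weighted graph with A = (|w_ij|)."""
    return [[abs(w) for w in row] for row in fc]


def threshold_graph(fc, q=0.2):
    """Binary graph: a_ij = 1 if the connection coefficient exceeds q."""
    return [[1 if w > q else 0 for w in row] for row in fc]


def knn_graph(fc, k):
    """Binary graph linking each node to its k strongest neighbors
    (assumed: ranked by absolute correlation, then symmetrized)."""
    n = len(fc)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: abs(fc[i][j]), reverse=True)[:k]
        for j in nearest:
            adj[i][j] = 1
            adj[j][i] = 1
    return adj


# Toy 3-ROI functional connectivity matrix (illustrative values only).
toy_fc = [
    [1.0, 0.5, -0.3],
    [0.5, 1.0, 0.1],
    [-0.3, 0.1, 1.0],
]
a_full = fully_connected_graph(toy_fc)
a_thr = threshold_graph(toy_fc, q=0.2)
a_knn = knn_graph(toy_fc, k=1)
```
</preformat>
<p>On this toy matrix the threshold graph keeps only the 0-1 edge (plus the diagonal), while the KNN graph with <italic>k</italic> = 1 also keeps the weaker 0-2 link because ROI 2 must connect to its strongest neighbor.</p>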
</sec>
<sec>
<title>5.4. Influence of Network Architecture</title>
<p>To explore the influence of different network architectures of AMNI on the experimental results, we adjust the network depth of the two branches of the AMNI model, respectively. <italic>On the one hand</italic>, with the CNN branch fixed, we vary the number of graph convolutional layers in the GCN branch of AMNI and report the corresponding results in <xref ref-type="table" rid="T5">Table 5</xref>. This table shows that AMNI achieves the overall best performance (e.g., ACC=0.6495 and AUC=0.6648) with two graph convolutional layers in the GCN branch. In addition, as the number of graph convolutional layers increases (see AMNI-G3 and AMNI-G4), the performance degrades. This may be due to the over-smoothing problem (that is, Laplacian smoothing makes the node representations more similar as the number of graph convolutional layers increases; Yang et al., <xref ref-type="bibr" rid="B67">2020</xref>), which may reduce the discriminative capability of the learned features. <italic>On the other hand</italic>, we fix the GCN branch and vary the architecture of the CNN in AMNI for performance evaluation. Specifically, we vary the number of convolutional layers in the CNN within [3, 6] and report the results of AMNI in MDD vs. HC classification in <xref ref-type="table" rid="T5">Table 5</xref>. This table shows that varying the depth of the CNN branch in AMNI yields comparable results, which implies that our AMNI is fairly robust to the depth of the CNN branch. Further, AMNI with five convolutional layers in the CNN branch (i.e., AMNI-C5) achieves better performance in terms of accuracy, sensitivity, balanced accuracy, positive predictive value, and F1-Score.</p>
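<p>The over-smoothing effect mentioned above can be illustrated with a deliberately simplified sketch. Each step replaces a node feature by the average over the node and its neighbors, without learned weights or nonlinearities, so it is not the GCN layer used in AMNI. The toy path graph, the scalar features, and the function names are all hypothetical.</p>
<preformat>
```python
def propagate(adj, feats):
    """One weight-free smoothing step: average each node with its neighbors."""
    n = len(adj)
    out = []
    for i in range(n):
        group = [i] + [j for j in range(n) if adj[i][j] and j != i]
        out.append(sum(feats[j] for j in group) / len(group))
    return out


def spread(feats):
    """Gap between the largest and smallest node feature."""
    return max(feats) - min(feats)


# Toy path graph 0-1-2 with one scalar feature per node.
path_adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
feats = [0.0, 0.0, 3.0]
spreads = []
for _ in range(4):
    feats = propagate(path_adj, feats)
    spreads.append(spread(feats))
```
</preformat>
<p>In this toy setting the spread of node features shrinks with every additional step, which mirrors the intuition that deeper graph convolution stacks make node representations harder to distinguish.</p>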
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Classification results of our AMNI in MDD vs. HC classification with different network depth, with best results shown in bold.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>BAC</bold></th>
<th valign="top" align="center"><bold>PPV</bold></th>
<th valign="top" align="center"><bold>NPV</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">AMNI-G1</td>
<td valign="top" align="center">0.634 (0.014)</td>
<td valign="top" align="center">0.677 (0.065)</td>
<td valign="top" align="center">0.587 (0.054)</td>
<td valign="top" align="center">0.632 (0.019)</td>
<td valign="top" align="center"><bold>0.669 (0.041)</bold></td>
<td valign="top" align="center">0.598 (0.077)</td>
<td valign="top" align="center"><bold>0.669 (0.011)</bold></td>
<td valign="top" align="center">0.627 (0.032)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-G2</td>
<td valign="top" align="center"><bold>0.650 (0.016)</bold></td>
<td valign="top" align="center"><bold>0.694 (0.068)</bold></td>
<td valign="top" align="center"><bold>0.609 (0.056)</bold></td>
<td valign="top" align="center"><bold>0.651 (0.016)</bold></td>
<td valign="top" align="center">0.640 (0.031)</td>
<td valign="top" align="center"><bold>0.667 (0.055)</bold></td>
<td valign="top" align="center">0.663 (0.021)</td>
<td valign="top" align="center"><bold>0.665 (0.017)</bold></td>
</tr>
<tr>
<td valign="top" align="left">AMNI-G3</td>
<td valign="top" align="center">0.595 (0.008)</td>
<td valign="top" align="center">0.629 (0.034)</td>
<td valign="top" align="center">0.559 (0.041)</td>
<td valign="top" align="center">0.594 (0.010)</td>
<td valign="top" align="center">0.600 (0.010)</td>
<td valign="top" align="center">0.590 (0.019)</td>
<td valign="top" align="center">0.614 (0.016)</td>
<td valign="top" align="center">0.605 (0.009)</td>
</tr>
<tr style="border-bottom: thin solid #000000;">
<td valign="top" align="left">AMNI-G4</td>
<td valign="top" align="center">0.587 (0.011)</td>
<td valign="top" align="center">0.618 (0.023)</td>
<td valign="top" align="center">0.554 (0.025)</td>
<td valign="top" align="center">0.586 (0.011)</td>
<td valign="top" align="center">0.610 (0.042)</td>
<td valign="top" align="center">0.561 (0.036)</td>
<td valign="top" align="center">0.613 (0.022)</td>
<td valign="top" align="center">0.599 (0.022)</td>
</tr> <tr>
<td valign="top" align="left">AMNI-C3</td>
<td valign="top" align="center">0.628 (0.005)</td>
<td valign="top" align="center">0.692 (0.045)</td>
<td valign="top" align="center">0.551 (0.057)</td>
<td valign="top" align="center">0.622 (0.007)</td>
<td valign="top" align="center">0.647 (0.014)</td>
<td valign="top" align="center">0.603 (0.012)</td>
<td valign="top" align="center">0.668 (0.013)</td>
<td valign="top" align="center">0.622 (0.007)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-C4</td>
<td valign="top" align="center">0.650 (0.016)</td>
<td valign="top" align="center">0.694 (0.068)</td>
<td valign="top" align="center"><bold>0.609 (0.056)</bold></td>
<td valign="top" align="center">0.651 (0.016)</td>
<td valign="top" align="center">0.640 (0.031)</td>
<td valign="top" align="center"><bold>0.667 (0.055)</bold></td>
<td valign="top" align="center">0.663 (0.021)</td>
<td valign="top" align="center"><bold>0.665 (0.017)</bold></td>
</tr>
<tr>
<td valign="top" align="left">AMNI-C5</td>
<td valign="top" align="center"><bold>0.660 (0.022)</bold></td>
<td valign="top" align="center"><bold>0.742 (0.042)</bold></td>
<td valign="top" align="center">0.565 (0.049)</td>
<td valign="top" align="center"><bold>0.653 (0.023)</bold></td>
<td valign="top" align="center"><bold>0.663 (0.011)</bold></td>
<td valign="top" align="center">0.657 (0.040)</td>
<td valign="top" align="center"><bold>0.700 (0.020)</bold></td>
<td valign="top" align="center">0.653 (0.023)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-C6</td>
<td valign="top" align="center">0.642 (0.014)</td>
<td valign="top" align="center">0.701 (0.046)</td>
<td valign="top" align="center">0.580 (0.041)</td>
<td valign="top" align="center">0.641 (0.017)</td>
<td valign="top" align="center">0.651 (0.029)</td>
<td valign="top" align="center">0.634 (0.053)</td>
<td valign="top" align="center">0.673 (0.008)</td>
<td valign="top" align="center">0.628 (0.018)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Note that AMNI-Gn contains n graph convolutional layers in the GCN module of AMNI, and AMNI-Cn contains n convolutional layers in the CNN module of AMNI</italic>.</p>
</table-wrap-foot>
</table-wrap>
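<p>For readers who want to reproduce the metric columns reported in the tables, the threshold-based metrics can be derived from confusion-matrix counts. The following is a minimal sketch using the standard definitions, not the authors' evaluation code: the function name <monospace>classification_metrics</monospace>, the choice of MDD as the positive class, and the example counts are all assumptions made for illustration. AUC is omitted because it requires continuous prediction scores rather than counts.</p>
<preformat>
```python
def classification_metrics(tp, fn, tn, fp):
    """Return threshold-based metrics from confusion-matrix counts.

    MDD is treated as the positive class (an assumption of this sketch).
    """
    def ratio(num, den):
        # Guard against an empty denominator in a degenerate fold.
        return num / den if den else 0.0

    sen = ratio(tp, tp + fn)  # sensitivity: recall on the positive class
    spe = ratio(tn, tn + fp)  # specificity: recall on the negative class
    ppv = ratio(tp, tp + fp)  # positive predictive value
    npv = ratio(tn, tn + fn)  # negative predictive value
    return {
        "ACC": ratio(tp + tn, tp + fn + tn + fp),
        "SEN": sen,
        "SPE": spe,
        "BAC": (sen + spe) / 2,  # balanced accuracy: mean of SEN and SPE
        "PPV": ppv,
        "NPV": npv,
        "F1": ratio(2 * ppv * sen, ppv + sen),
    }


# Hypothetical counts for one test fold (not taken from the paper).
metrics = classification_metrics(tp=35, fn=15, tn=30, fp=20)
```
</preformat>
<p>Under these definitions, BAC is the mean of SEN and SPE, which agrees with the reported rows (e.g., AMNI-C5 has SEN 0.742 and SPE 0.565, giving BAC 0.653).</p>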
<p>We further examine how the network width of each branch influences the experimental results. <italic>First</italic>, with the CNN branch fixed, we vary the number of neurons in the graph convolutional layers and report the corresponding results of AMNI in <xref ref-type="table" rid="T6">Table 6</xref>. <xref ref-type="table" rid="T6">Table 6</xref> shows that the AMNI model achieves comparable results across different numbers of neurons in the graph convolutional layers, which suggests that our model is not very sensitive to the network width of the GCN branch. <italic>Second</italic>, with the GCN branch fixed, we vary the number of filters in each 3D convolutional layer and also record the results in <xref ref-type="table" rid="T6">Table 6</xref>. As shown in <xref ref-type="table" rid="T6">Table 6</xref>, as the number of filters in the 3D CNN module of AMNI increases, the model (i.e., AMNI-c3 and AMNI-c4) generally achieves better performance. A possible reason is that more filters allow the CNN to capture richer features across the global and local information of sMRI.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Classification results of our AMNI in MDD vs. HC classification with different network width, with best results shown in bold.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>BAC</bold></th>
<th valign="top" align="center"><bold>PPV</bold></th>
<th valign="top" align="center"><bold>NPV</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">AMNI-g40</td>
<td valign="top" align="center">0.620 (0.035)</td>
<td valign="top" align="center">0.626 (0.089)</td>
<td valign="top" align="center"><bold>0.614 (0.097)</bold></td>
<td valign="top" align="center">0.620 (0.035)</td>
<td valign="top" align="center">0.652 (0.039)</td>
<td valign="top" align="center">0.593 (0.040)</td>
<td valign="top" align="center">0.635 (0.049)</td>
<td valign="top" align="center">0.650 (0.036)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-g64</td>
<td valign="top" align="center"><bold>0.650 (0.016)</bold></td>
<td valign="top" align="center">0.694 (0.068)</td>
<td valign="top" align="center">0.609 (0.056)</td>
<td valign="top" align="center"><bold>0.651 (0.016)</bold></td>
<td valign="top" align="center">0.640 (0.031)</td>
<td valign="top" align="center"><bold>0.667 (0.055)</bold></td>
<td valign="top" align="center">0.663 (0.021)</td>
<td valign="top" align="center">0.665 (0.017)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-g88</td>
<td valign="top" align="center">0.626 (0.015)</td>
<td valign="top" align="center"><bold>0.697 (0.048)</bold></td>
<td valign="top" align="center">0.542 (0.052)</td>
<td valign="top" align="center">0.620 (0.015)</td>
<td valign="top" align="center">0.644 (0.016)</td>
<td valign="top" align="center">0.604 (0.023)</td>
<td valign="top" align="center"><bold>0.669 (0.021)</bold></td>
<td valign="top" align="center"><bold>0.667 (0.011)</bold></td>
</tr>
<tr style="border-bottom: thin solid #000000;">
<td valign="top" align="left">AMNI-g112</td>
<td valign="top" align="center">0.631 (0.016)</td>
<td valign="top" align="center">0.647 (0.053)</td>
<td valign="top" align="center">0.612 (0.037)</td>
<td valign="top" align="center">0.629 (0.015)</td>
<td valign="top" align="center"><bold>0.659 (0.015)</bold></td>
<td valign="top" align="center">0.602 (0.024)</td>
<td valign="top" align="center">0.651 (0.029)</td>
<td valign="top" align="center">0.637 (0.037)</td>
</tr> <tr>
<td valign="top" align="left">AMNI-c1</td>
<td valign="top" align="center">0.598 (0.017)</td>
<td valign="top" align="center">0.643 (0.046)</td>
<td valign="top" align="center">0.535 (0.081)</td>
<td valign="top" align="center">0.589 (0.028)</td>
<td valign="top" align="center"><bold>0.643 (0.028)</bold></td>
<td valign="top" align="center">0.535 (0.073)</td>
<td valign="top" align="center">0.642 (0.026)</td>
<td valign="top" align="center">0.607 (0.015)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-c2</td>
<td valign="top" align="center">0.630 (0.020)</td>
<td valign="top" align="center">0.693 (0.080)</td>
<td valign="top" align="center">0.575 (0.096)</td>
<td valign="top" align="center">0.634 (0.016)</td>
<td valign="top" align="center">0.593 (0.033)</td>
<td valign="top" align="center"><bold>0.685 (0.029)</bold></td>
<td valign="top" align="center">0.635 (0.023)</td>
<td valign="top" align="center">0.667 (0.004)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-c3</td>
<td valign="top" align="center"><bold>0.650 (0.016)</bold></td>
<td valign="top" align="center"><bold>0.694 (0.068)</bold></td>
<td valign="top" align="center">0.609 (0.056)</td>
<td valign="top" align="center"><bold>0.651 (0.016)</bold></td>
<td valign="top" align="center">0.640 (0.031)</td>
<td valign="top" align="center">0.667 (0.055)</td>
<td valign="top" align="center"><bold>0.663 (0.021)</bold></td>
<td valign="top" align="center">0.665 (0.017)</td>
</tr>
<tr>
<td valign="top" align="left">AMNI-c4</td>
<td valign="top" align="center">0.641 (0.015)</td>
<td valign="top" align="center">0.654 (0.051)</td>
<td valign="top" align="center"><bold>0.629 (0.030)</bold></td>
<td valign="top" align="center">0.642 (0.015)</td>
<td valign="top" align="center">0.628 (0.030)</td>
<td valign="top" align="center">0.658 (0.044)</td>
<td valign="top" align="center">0.638 (0.028)</td>
<td valign="top" align="center"><bold>0.689 (0.042)</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Note that AMNI-gn contains n neurons in the graph convolutional layers of the GCN module, and the filter sequences in the CNN module of AMNI-c1, AMNI-c2, AMNI-c3, and AMNI-c4 are [4, 8, 16, 32], [8, 16, 32, 64], [16, 32, 64, 128], and [32, 64, 128, 256], respectively</italic>.</p>
</table-wrap-foot>
</table-wrap>
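<p>The width variants compared in Table 6 follow a simple pattern: each CNN variant doubles the number of filters from one 3D convolutional layer to the next, and each successive variant doubles the starting width. The sketch below enumerates these configurations. It is an illustrative configuration helper only; the names <monospace>filter_sequence</monospace>, <monospace>CNN_VARIANTS</monospace>, and <monospace>GCN_VARIANTS</monospace> are hypothetical and do not come from the authors' code.</p>
<preformat>
```python
def filter_sequence(base, num_layers=4):
    """Filters per 3D convolutional layer, doubling at each layer."""
    return [base * 2 ** i for i in range(num_layers)]


# CNN width variants listed in the Table 6 footnote, keyed by variant name.
# AMNI-c1 starts at 4 filters; each later variant doubles the starting width.
CNN_VARIANTS = {
    f"AMNI-c{k}": filter_sequence(4 * 2 ** (k - 1)) for k in range(1, 5)
}

# GCN width variants: number of neurons in the graph convolutional layers.
GCN_VARIANTS = {f"AMNI-g{n}": n for n in (40, 64, 88, 112)}
```
</preformat>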
</sec>
<sec>
<title>5.5. Influence of Multimodality Fusion Strategy</title>
<p>We fuse fMRI and sMRI data at the feature level (see <xref ref-type="fig" rid="F1">Figure 1</xref>) in the main experiments. We further investigate the influence of different fusion strategies by comparing our AMNI (using feature-level fusion) with its variant (called <bold>AMNI_lf</bold>) that uses a decision-level fusion strategy. As shown in <xref ref-type="fig" rid="F7">Figure 7</xref>, in AMNI_lf, the fMRI feature derived from the GCN is fed into two fully connected layers and a Softmax layer for feature abstraction and classification. Similarly, the sMRI feature derived from the CNN is fed into three fully connected layers and a Softmax layer. The outputs of these two branches are then fused <italic>via</italic> a weighted sum operation. We vary the weight ratio between the fMRI and sMRI branches over [<inline-formula><mml:math id="M28"><mml:mfrac><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>8</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>, <inline-formula><mml:math id="M29"><mml:mfrac><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>5</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>5</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>, <inline-formula><mml:math id="M30"><mml:mfrac><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>8</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>] and denote these three methods as AMNI_lf1, AMNI_lf2, and AMNI_lf3, respectively, with the experimental results shown in <xref ref-type="fig" rid="F8">Figure 8</xref>.</p>
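<p>The weighted-sum step of the decision-level variant described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names <monospace>softmax</monospace> and <monospace>late_fusion</monospace>, the class ordering (index 0 for HC, index 1 for MDD), and the example logits are hypothetical. Only the three weight pairs are taken from the text.</p>
<preformat>
```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(fmri_logits, smri_logits, w_fmri, w_smri):
    """Weighted sum of the class probabilities of the two branches."""
    p_fmri = softmax(fmri_logits)
    p_smri = softmax(smri_logits)
    return [w_fmri * a + w_smri * b for a, b in zip(p_fmri, p_smri)]


# (fMRI weight, sMRI weight) for the three variants named in the text.
WEIGHTS = {
    "AMNI_lf1": (0.2, 0.8),
    "AMNI_lf2": (0.5, 0.5),
    "AMNI_lf3": (0.8, 0.2),
}

# Hypothetical branch logits for one subject: index 0 = HC, index 1 = MDD.
fmri_logits = [0.2, 1.1]
smri_logits = [0.9, 0.4]

fused = {
    name: late_fusion(fmri_logits, smri_logits, w_f, w_s)
    for name, (w_f, w_s) in WEIGHTS.items()
}
# Predicted class is the index with the largest fused probability.
predicted = {name: probs.index(max(probs)) for name, probs in fused.items()}
```
</preformat>
<p>Because each weight pair sums to one, the fused vector remains a valid probability distribution, and the predicted label can flip as more weight is placed on one branch.</p>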
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Illustration of the adaptive multimodal neuroimage integration (AMNI) framework based on a decision-level fusion strategy.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0007.tif"/>
</fig>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Experimental results of the late fusion methods and our AMNI method in MDD vs. HC classification. Note that AMNI_lf1, AMNI_lf2, and AMNI_lf3 use a weight ratio between the fMRI and sMRI branches of <inline-formula><mml:math id="M25"><mml:mfrac><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>8</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>, <inline-formula><mml:math id="M26"><mml:mfrac><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>5</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>5</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>, and <inline-formula><mml:math id="M27"><mml:mfrac><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>8</mml:mn></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-16-856175-g0008.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F8">Figure 8</xref>, as the weight of the GCN branch increases, the model achieves better performance in terms of most metrics. However, the results of AMNI using the decision-level fusion method are generally inferior to those of the feature-level fusion method proposed in this article. This suggests that feature-level fusion of functional and structural representations could be more effective.</p>
</sec>
<sec>
<title>5.6. Limitations and Future Work</title>
<p>Several limitations need to be considered. First, we only integrate T1-weighted MRI and functional MRI data for automated MDD diagnosis. Diffusion tensor imaging (DTI) data can be used to examine and quantify the white matter microstructure of the brain, which can further help uncover the neurobiological mechanisms of MDD. Therefore, incorporating DTI data into our multimodal framework is a valuable direction for future work. Second, we use functional connectivity networks to represent rs-fMRI data and treat them as the input of the proposed method. It would be interesting to extract diagnosis-oriented fMRI features, as we do for T1-weighted MRIs, which will also be part of our future work. In addition, a feature adaptation module with a cross-modal MDD loss is designed for reducing inter-modality data heterogeneity. Many other data adaptation methods (Ben-David et al., <xref ref-type="bibr" rid="B4">2007</xref>) can also be incorporated into the proposed AMNI framework for further performance improvement.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="s6">
<title>6. Conclusion</title>
<p>In this article, we propose an adaptive multimodal neuroimage integration (AMNI) framework for automated MDD diagnosis based on functional and structural MRI data. We first employ GCN and CNN to learn feature representations of functional connectivity networks and structural MR images. Then, a feature adaptation module is designed to alleviate inter-modality difference by minimizing the distribution difference between two modalities. Finally, high-level features extracted from functional and structural MRI modalities are integrated and delivered to a classifier for disease detection. Experimental results on 533 subjects with rs-fMRI and T1-weighted sMRI demonstrate the effectiveness of the proposed method in identifying MDD patients from healthy controls.</p>
</sec>
<sec sec-type="data-availability" id="s7">
<title>Data Availability Statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: REST-meta-MDD Consortium Data Sharing.</p>
</sec>
<sec id="s8">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by REST-meta-MDD Consortium Data Sharing. The patients/participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="s9">
<title>Author Contributions</title>
<p>QW and ML designed the study. QW downloaded and analyzed the data, performed experiments, and drafted the manuscript. QW, LL, LQ, and ML revised the manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="funding-information" id="s10">
<title>Funding</title>
<p>QW and LQ were partly supported by National Natural Science Foundation of China (Nos. 62176112, 61976110, and 11931008), Natural Science Foundation of Shandong Province (Nos. ZR2018MF020 and ZR2019YQ27), and Taishan Scholar Program of Shandong Province.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec> </body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alexopoulos</surname> <given-names>G. S.</given-names></name></person-group> (<year>2005</year>). <article-title>Depression in the elderly</article-title>. <source>Lancet</source> <volume>365</volume>, <fpage>1961</fpage>&#x02013;<lpage>1970</lpage>. <pub-id pub-id-type="doi">10.1016/S0140-6736(05)66665-2</pub-id><pub-id pub-id-type="pmid">15936426</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ashburner</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>A fast diffeomorphic image registration algorithm</article-title>. <source>Neuroimage</source> <volume>38</volume>, <fpage>95</fpage>&#x02013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2007.07.007</pub-id><pub-id pub-id-type="pmid">17761438</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bai</surname> <given-names>L.</given-names></name> <name><surname>Cui</surname> <given-names>L.</given-names></name> <name><surname>Jiao</surname> <given-names>Y.</given-names></name> <name><surname>Rossi</surname> <given-names>L.</given-names></name> <name><surname>Hancock</surname> <given-names>E.</given-names></name></person-group> (<year>2020</year>). <article-title>Learning backtrackless aligned-spatial graph convolutional networks for graph classification</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>44</volume>, <fpage>783</fpage>&#x02013;<lpage>798</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2020.3011866</pub-id><pub-id pub-id-type="pmid">32750832</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Ben-David</surname> <given-names>S.</given-names></name> <name><surname>Blitzer</surname> <given-names>J.</given-names></name> <name><surname>Crammer</surname> <given-names>K.</given-names></name> <name><surname>Pereira</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2007</year>). <article-title>Analysis of representations for domain adaptation</article-title>. <source>Adv. Neural Inform. Process. Syst</source>. <volume>19</volume>:<fpage>137</fpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://proceedings.neurips.cc/paper/2006/file/b1b0432ceafb0ce714426e9114852ac7-Paper.pdf">https://proceedings.neurips.cc/paper/2006/file/b1b0432ceafb0ce714426e9114852ac7-Paper.pdf</ext-link></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bonacich</surname> <given-names>P.</given-names></name></person-group> (<year>2007</year>). <article-title>Some unique properties of eigenvector centrality</article-title>. <source>Soc. Netw</source>. <volume>29</volume>, <fpage>555</fpage>&#x02013;<lpage>564</lpage>. <pub-id pub-id-type="doi">10.1016/j.socnet.2007.04.002</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bron</surname> <given-names>E. E.</given-names></name> <name><surname>Smits</surname> <given-names>M.</given-names></name> <name><surname>Van Der Flier</surname> <given-names>W. M.</given-names></name> <name><surname>Vrenken</surname> <given-names>H.</given-names></name> <name><surname>Barkhof</surname> <given-names>F.</given-names></name> <name><surname>Scheltens</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge</article-title>. <source>Neuroimage</source> <volume>111</volume>, <fpage>562</fpage>&#x02013;<lpage>579</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2015.01.048</pub-id><pub-id pub-id-type="pmid">25652394</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Bruna</surname> <given-names>J.</given-names></name> <name><surname>Zaremba</surname> <given-names>W.</given-names></name> <name><surname>Szlam</surname> <given-names>A.</given-names></name> <name><surname>LeCun</surname> <given-names>Y.</given-names></name></person-group> (<year>2013</year>). <source>Spectral networks and locally connected networks on graphs. <italic>arXiv preprint arXiv:1312.6203</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1312.6203">https://arxiv.org/abs/1312.6203</ext-link></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Buch</surname> <given-names>A. M.</given-names></name> <name><surname>Liston</surname> <given-names>C.</given-names></name></person-group> (<year>2021</year>). <article-title>Dissecting diagnostic heterogeneity in depression by integrating neuroimaging and genetics</article-title>. <source>Neuropsychopharmacology</source> <volume>46</volume>, <fpage>156</fpage>&#x02013;<lpage>175</lpage>. <pub-id pub-id-type="doi">10.1038/s41386-020-00789-3</pub-id><pub-id pub-id-type="pmid">32781460</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>B&#x000FC;rger</surname> <given-names>C.</given-names></name> <name><surname>Redlich</surname> <given-names>R.</given-names></name> <name><surname>Grotegerd</surname> <given-names>D.</given-names></name> <name><surname>Meinert</surname> <given-names>S.</given-names></name> <name><surname>Dohm</surname> <given-names>K.</given-names></name> <name><surname>Schneider</surname> <given-names>I.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Differential abnormal pattern of anterior cingulate gyrus activation in unipolar and bipolar depression: an fMRI and pattern classification approach</article-title>. <source>Neuropsychopharmacology</source> <volume>42</volume>, <fpage>1399</fpage>&#x02013;<lpage>1408</lpage>. <pub-id pub-id-type="doi">10.1038/npp.2017.36</pub-id><pub-id pub-id-type="pmid">28205606</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calhoun</surname> <given-names>V. D.</given-names></name> <name><surname>Sui</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>Multimodal fusion of brain imaging data: a key to finding the missing link(s) in complex mental illness</article-title>. <source>Biol. Psychiatry</source> <volume>1</volume>, <fpage>230</fpage>&#x02013;<lpage>244</lpage>. <pub-id pub-id-type="doi">10.1016/j.bpsc.2015.12.005</pub-id><pub-id pub-id-type="pmid">27347565</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chakraborty</surname> <given-names>S.</given-names></name> <name><surname>Aich</surname> <given-names>S.</given-names></name> <name><surname>Kim</surname> <given-names>H.-C.</given-names></name></person-group> (<year>2020</year>). <article-title>Detection of Parkinson&#x00027;s disease from 3T T1 weighted MRI scans using 3D convolutional neural network</article-title>. <source>Diagnostics</source> <volume>10</volume>:<fpage>402</fpage>. <pub-id pub-id-type="doi">10.3390/diagnostics10060402</pub-id><pub-id pub-id-type="pmid">32545609</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Yang</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Alber</surname> <given-names>M.</given-names></name> <name><surname>Chen</surname> <given-names>D. Z.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Combining fully convolutional and recurrent neural networks for 3D biomedical image segmentation,&#x0201D;</article-title> in <source>The 30th Conference on Neural Information Processing Systems</source> (<publisher-loc>Barcelona</publisher-loc>), <fpage>3036</fpage>&#x02013;<lpage>3044</lpage>.<pub-id pub-id-type="pmid">33915526</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cuadra</surname> <given-names>M. B.</given-names></name> <name><surname>Cammoun</surname> <given-names>L.</given-names></name> <name><surname>Butz</surname> <given-names>T.</given-names></name> <name><surname>Cuisenaire</surname> <given-names>O.</given-names></name> <name><surname>Thiran</surname> <given-names>J.-P.</given-names></name></person-group> (<year>2005</year>). <article-title>Comparison and validation of tissue modelization and statistical classification methods in T1-weighted MR brain images</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>24</volume>, <fpage>1548</fpage>&#x02013;<lpage>1565</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2005.857652</pub-id><pub-id pub-id-type="pmid">16350916</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Dvornek</surname> <given-names>N. C.</given-names></name> <name><surname>Ventola</surname> <given-names>P.</given-names></name> <name><surname>Pelphrey</surname> <given-names>K. A.</given-names></name> <name><surname>Duncan</surname> <given-names>J. S.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Identifying Autism from resting-state fMRI using long short-term memory networks,&#x0201D;</article-title> in <source>International Workshop on Machine Learning in Medical Imaging</source> (<publisher-loc>Quebec City, QC</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>362</fpage>&#x02013;<lpage>370</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-67389-9_42</pub-id><pub-id pub-id-type="pmid">29104967</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Foti</surname> <given-names>D.</given-names></name> <name><surname>Carlson</surname> <given-names>J. M.</given-names></name> <name><surname>Sauder</surname> <given-names>C. L.</given-names></name> <name><surname>Proudfit</surname> <given-names>G. H.</given-names></name></person-group> (<year>2014</year>). <article-title>Reward dysfunction in major depression: multimodal neuroimaging evidence for refining the melancholic phenotype</article-title>. <source>Neuroimage</source> <volume>101</volume>, <fpage>50</fpage>&#x02013;<lpage>58</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2014.06.058</pub-id><pub-id pub-id-type="pmid">24996119</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>C. H.</given-names></name> <name><surname>Costafreda</surname> <given-names>S. G.</given-names></name> <name><surname>Sankar</surname> <given-names>A.</given-names></name> <name><surname>Adams</surname> <given-names>T. M.</given-names></name> <name><surname>Rasenick</surname> <given-names>M. M.</given-names></name> <name><surname>Liu</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Multimodal functional and structural neuroimaging investigation of major depressive disorder following treatment with duloxetine</article-title>. <source>BMC Psychiatry</source> <volume>15</volume>:<fpage>82</fpage>. <pub-id pub-id-type="doi">10.1186/s12888-015-0457-2</pub-id><pub-id pub-id-type="pmid">25880400</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gadgil</surname> <given-names>S.</given-names></name> <name><surname>Zhao</surname> <given-names>Q.</given-names></name> <name><surname>Pfefferbaum</surname> <given-names>A.</given-names></name> <name><surname>Sullivan</surname> <given-names>E. V.</given-names></name> <name><surname>Adeli</surname> <given-names>E.</given-names></name> <name><surname>Pohl</surname> <given-names>K. M.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Spatio-temporal graph convolution for resting-state fMRI analysis,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention</source> (<publisher-loc>Springer</publisher-loc>), <fpage>528</fpage>&#x02013;<lpage>538</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-59728-3_52</pub-id><pub-id pub-id-type="pmid">33257918</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>S.</given-names></name> <name><surname>Calhoun</surname> <given-names>V. D.</given-names></name> <name><surname>Sui</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>Machine learning in major depression: from classification to treatment outcome prediction</article-title>. <source>CNS Neurosci. Therap</source>. <volume>24</volume>, <fpage>1037</fpage>&#x02013;<lpage>1052</lpage>. <pub-id pub-id-type="doi">10.1111/cns.13048</pub-id><pub-id pub-id-type="pmid">30136381</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ge</surname> <given-names>R.</given-names></name> <name><surname>Gregory</surname> <given-names>E.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Ainsworth</surname> <given-names>N.</given-names></name> <name><surname>Jian</surname> <given-names>W.</given-names></name> <name><surname>Yang</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Magnetic seizure therapy is associated with functional and structural brain changes in MDD: therapeutic versus side effect correlates</article-title>. <source>J. Affect. Disord</source>. <volume>286</volume>, <fpage>40</fpage>&#x02013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.1016/j.jad.2021.02.051</pub-id><pub-id pub-id-type="pmid">33676262</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Gretton</surname> <given-names>A.</given-names></name> <name><surname>Borgwardt</surname> <given-names>K. M.</given-names></name> <name><surname>Rasch</surname> <given-names>M. J.</given-names></name> <name><surname>Sch&#x000F6;lkopf</surname> <given-names>B.</given-names></name> <name><surname>Smola</surname> <given-names>A.</given-names></name></person-group> (<year>2012</year>). <article-title>A kernel two-sample test</article-title>. <source>J. Mach. Learn. Res</source>. <volume>13</volume>, <fpage>723</fpage>&#x02013;<lpage>773</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.jmlr.org/papers/volume13/gretton12a/gretton12a.pdf">https://www.jmlr.org/papers/volume13/gretton12a/gretton12a.pdf</ext-link></citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guan</surname> <given-names>H.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Domain adaptation for medical image analysis: a survey</article-title>. <source>IEEE Trans. Biomed. Eng</source>. <volume>69</volume>, <fpage>1173</fpage>&#x02013;<lpage>1185</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2021.3117407</pub-id><pub-id pub-id-type="pmid">34606445</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>T.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Xue</surname> <given-names>Y.</given-names></name> <name><surname>Qiao</surname> <given-names>L.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <article-title>Brain function network: higher order vs. more discrimination</article-title>. <source>Front. Neurosci</source>. <volume>2021</volume>:<fpage>1033</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2021.696639</pub-id><pub-id pub-id-type="pmid">34497485</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>He</surname> <given-names>K.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Ren</surname> <given-names>S.</given-names></name> <name><surname>Sun</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Deep residual learning for image recognition,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Las Vegas, NV</publisher-loc>), <fpage>770</fpage>&#x02013;<lpage>778</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2016.90</pub-id><pub-id pub-id-type="pmid">32166560</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hinrichs</surname> <given-names>C.</given-names></name> <name><surname>Singh</surname> <given-names>V.</given-names></name> <name><surname>Xu</surname> <given-names>G.</given-names></name> <name><surname>Johnson</surname> <given-names>S. C.</given-names></name></person-group> (<year>2011</year>). <article-title>Predictive markers for AD in a multi-modality framework: an analysis of MCI progression in the ADNI population</article-title>. <source>Neuroimage</source> <volume>55</volume>, <fpage>574</fpage>&#x02013;<lpage>589</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.10.081</pub-id><pub-id pub-id-type="pmid">21146621</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holtzheimer</surname> <given-names>P. E.</given-names> <suffix>III</suffix></name> <name><surname>Nemeroff</surname> <given-names>C. B.</given-names></name></person-group> (<year>2006</year>). <article-title>Future prospects in depression research</article-title>. <source>Dial. Clin. Neurosci</source>. <volume>8</volume>:<fpage>175</fpage>. <pub-id pub-id-type="doi">10.31887/DCNS.2006.8.2/pholtzheimer</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Honey</surname> <given-names>C. J.</given-names></name> <name><surname>Sporns</surname> <given-names>O.</given-names></name> <name><surname>Cammoun</surname> <given-names>L.</given-names></name> <name><surname>Gigandet</surname> <given-names>X.</given-names></name> <name><surname>Thiran</surname> <given-names>J.-P.</given-names></name> <name><surname>Meuli</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Predicting human resting-state functional connectivity from structural connectivity</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>106</volume>, <fpage>2035</fpage>&#x02013;<lpage>2040</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0811168106</pub-id><pub-id pub-id-type="pmid">19188601</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hosseini-Asl</surname> <given-names>E.</given-names></name> <name><surname>Keynton</surname> <given-names>R.</given-names></name> <name><surname>El-Baz</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Alzheimer&#x00027;s disease diagnostics by adaptation of 3D convolutional network,&#x0201D;</article-title> in <source>IEEE International Conference on Image Processing (ICIP)</source> (<publisher-loc>Phoenix, AZ</publisher-loc>), <fpage>126</fpage>&#x02013;<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1109/ICIP.2016.7532332</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>Y.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Tong</surname> <given-names>T.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name></person-group> (<year>2019</year>). <article-title>Diagnosis of Alzheimer&#x00027;s disease <italic>via</italic> multi-modality 3D convolutional neural network</article-title>. <source>Front. Neurosci</source>. <volume>13</volume>:<fpage>509</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2019.00509</pub-id><pub-id pub-id-type="pmid">31213967</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jenkinson</surname> <given-names>M.</given-names></name> <name><surname>Bannister</surname> <given-names>P.</given-names></name> <name><surname>Brady</surname> <given-names>M.</given-names></name> <name><surname>Smith</surname> <given-names>S.</given-names></name></person-group> (<year>2002</year>). <article-title>Improved optimization for the robust and accurate linear registration and motion correction of brain images</article-title>. <source>Neuroimage</source> <volume>17</volume>, <fpage>825</fpage>&#x02013;<lpage>841</lpage>. <pub-id pub-id-type="doi">10.1006/nimg.2002.1132</pub-id><pub-id pub-id-type="pmid">12377157</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Kingma</surname> <given-names>D. P.</given-names></name> <name><surname>Ba</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <source>Adam: a method for stochastic optimization. <italic>arXiv preprint arXiv:1412.6980</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1412.6980">https://arxiv.org/abs/1412.6980</ext-link></citation>
</ref>
<ref id="B31">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Kipf</surname> <given-names>T. N.</given-names></name> <name><surname>Welling</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <source>Semi-supervised classification with graph convolutional networks. <italic>arXiv preprint arXiv:1609.02907</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1609.02907">https://arxiv.org/abs/1609.02907</ext-link></citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ktena</surname> <given-names>S. I.</given-names></name> <name><surname>Parisot</surname> <given-names>S.</given-names></name> <name><surname>Ferrante</surname> <given-names>E.</given-names></name> <name><surname>Rajchl</surname> <given-names>M.</given-names></name> <name><surname>Lee</surname> <given-names>M.</given-names></name> <name><surname>Glocker</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Metric learning with spectral graph convolutions on brain connectivity networks</article-title>. <source>NeuroImage</source> <volume>169</volume>, <fpage>431</fpage>&#x02013;<lpage>442</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2017.12.052</pub-id><pub-id pub-id-type="pmid">29278772</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>LeCun</surname> <given-names>Y.</given-names></name> <name><surname>Boser</surname> <given-names>B.</given-names></name> <name><surname>Denker</surname> <given-names>J. S.</given-names></name> <name><surname>Henderson</surname> <given-names>D.</given-names></name> <name><surname>Howard</surname> <given-names>R. E.</given-names></name> <name><surname>Hubbard</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>1989</year>). <article-title>Backpropagation applied to handwritten zip code recognition</article-title>. <source>Neural Comput</source>. <volume>1</volume>, <fpage>541</fpage>&#x02013;<lpage>551</lpage>. <pub-id pub-id-type="doi">10.1162/neco.1989.1.4.541</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>J.</given-names></name> <name><surname>Lee</surname> <given-names>I.</given-names></name> <name><surname>Kang</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Self-attention graph pooling,&#x0201D;</article-title> in <source>Proceedings of the 36th International Conference on Machine Learning</source> (<publisher-loc>Long Beach, CA</publisher-loc>: <publisher-name>PMLR</publisher-name>), <fpage>3734</fpage>&#x02013;<lpage>3743</lpage>.</citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>J.-G.</given-names></name> <name><surname>Jun</surname> <given-names>S.</given-names></name> <name><surname>Cho</surname> <given-names>Y.-W.</given-names></name> <name><surname>Lee</surname> <given-names>H.</given-names></name> <name><surname>Kim</surname> <given-names>G. B.</given-names></name> <name><surname>Seo</surname> <given-names>J. B.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Deep learning in medical imaging: general overview</article-title>. <source>Korean J. Radiol</source>. <volume>18</volume>, <fpage>570</fpage>&#x02013;<lpage>584</lpage>. <pub-id pub-id-type="doi">10.3348/kjr.2017.18.4.570</pub-id><pub-id pub-id-type="pmid">28670152</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>M.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Kang</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Lu</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Depression recognition method based on regional homogeneity features from emotional response fMRI using deep convolutional neural network,&#x0201D;</article-title> in 2021 <italic>3rd International Conference on Intelligent Medicine and Image Processing</italic> (Tianjin), <fpage>45</fpage>&#x02013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1145/3468945.3468953</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>M.</given-names></name> <name><surname>Chen</surname> <given-names>Q.</given-names></name> <name><surname>Yan</surname> <given-names>S.</given-names></name></person-group> (<year>2013</year>). <source>Network in network. <italic>arXiv preprint arXiv:1312.4400</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1312.4400">https://arxiv.org/abs/1312.4400</ext-link></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <article-title>Sparsity score: a novel graph-preserving feature selection method</article-title>. <source>Int. J. Pattern Recogn. Artif. Intell</source>. <volume>28</volume>:<fpage>1450009</fpage>. <pub-id pub-id-type="doi">10.1142/S0218001414500098</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maglanoc</surname> <given-names>L. A.</given-names></name> <name><surname>Kaufmann</surname> <given-names>T.</given-names></name> <name><surname>Jonassen</surname> <given-names>R.</given-names></name> <name><surname>Hilland</surname> <given-names>E.</given-names></name> <name><surname>Beck</surname> <given-names>D.</given-names></name> <name><surname>Landr&#x000F8;</surname> <given-names>N. I.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Multimodal fusion of structural and functional brain imaging in depression using linked independent component analysis</article-title>. <source>Hum. Brain Mapp</source>. <volume>41</volume>, <fpage>241</fpage>&#x02013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1002/hbm.24802</pub-id><pub-id pub-id-type="pmid">31571370</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nieminen</surname> <given-names>J.</given-names></name></person-group> (<year>1974</year>). <article-title>On the centrality in a graph</article-title>. <source>Scand. J. Psychol</source>. <volume>15</volume>, <fpage>332</fpage>&#x02013;<lpage>336</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9450.1974.tb00598.x</pub-id><pub-id pub-id-type="pmid">4453827</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Otte</surname> <given-names>C.</given-names></name> <name><surname>Gold</surname> <given-names>S. M.</given-names></name> <name><surname>Penninx</surname> <given-names>B. W.</given-names></name> <name><surname>Pariante</surname> <given-names>C. M.</given-names></name> <name><surname>Etkin</surname> <given-names>A.</given-names></name> <name><surname>Fava</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Major depressive disorder</article-title>. <source>Nat. Rev. Dis. Primers</source> <volume>2</volume>, <fpage>1</fpage>&#x02013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1038/nrdp.2016.65</pub-id><pub-id pub-id-type="pmid">27629598</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papakostas</surname> <given-names>G. I.</given-names></name></person-group> (<year>2009</year>). <article-title>Managing partial response or nonresponse: switching, augmentation, and combination strategies for major depressive disorder</article-title>. <source>J. Clin. Psychiatry</source> <volume>70</volume>, <fpage>16</fpage>&#x02013;<lpage>25</lpage>. <pub-id pub-id-type="doi">10.4088/JCP.8133su1c.03</pub-id><pub-id pub-id-type="pmid">19922740</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parisot</surname> <given-names>S.</given-names></name> <name><surname>Ktena</surname> <given-names>S. I.</given-names></name> <name><surname>Ferrante</surname> <given-names>E.</given-names></name> <name><surname>Lee</surname> <given-names>M.</given-names></name> <name><surname>Guerrero</surname> <given-names>R.</given-names></name> <name><surname>Glocker</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Disease prediction using graph convolutional networks: application to autism spectrum disorder and Alzheimer&#x00027;s disease</article-title>. <source>Med. Image Anal</source>. <volume>48</volume>, <fpage>117</fpage>&#x02013;<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1016/j.media.2018.06.001</pub-id><pub-id pub-id-type="pmid">29890408</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Paszke</surname> <given-names>A.</given-names></name> <name><surname>Gross</surname> <given-names>S.</given-names></name> <name><surname>Chintala</surname> <given-names>S.</given-names></name> <name><surname>Chanan</surname> <given-names>G.</given-names></name> <name><surname>Yang</surname> <given-names>E.</given-names></name> <name><surname>Devito</surname> <given-names>Z.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>&#x0201C;Automatic differentiation in PyTorch,&#x0201D;</article-title> in 31st Conference on Neural Information Processing Systems (Long Beach, CA), <fpage>1</fpage>&#x02013;<lpage>4</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://openreview.net/pdf?id=BJJsrmfCZ">https://openreview.net/pdf?id=BJJsrmfCZ</ext-link></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pizzagalli</surname> <given-names>D. A.</given-names></name> <name><surname>Iosifescu</surname> <given-names>D.</given-names></name> <name><surname>Hallett</surname> <given-names>L. A.</given-names></name> <name><surname>Ratner</surname> <given-names>K. G.</given-names></name> <name><surname>Fava</surname> <given-names>M.</given-names></name></person-group> (<year>2008</year>). <article-title>Reduced hedonic capacity in major depressive disorder: evidence from a probabilistic reward task</article-title>. <source>J. Psychiatr. Res</source>. <volume>43</volume>, <fpage>76</fpage>&#x02013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1016/j.jpsychires.2008.03.001</pub-id><pub-id pub-id-type="pmid">18433774</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rajalingam</surname> <given-names>B.</given-names></name> <name><surname>Priya</surname> <given-names>R.</given-names></name></person-group> (<year>2018</year>). <article-title>Multimodal medical image fusion based on deep learning neural network for clinical treatment analysis</article-title>. <source>Int. J. ChemTech Res</source>. <volume>11</volume>, <fpage>160</fpage>&#x02013;<lpage>176</lpage>. <pub-id pub-id-type="doi">10.20902/ijctr.2018.110621</pub-id><pub-id pub-id-type="pmid">31527580</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin-Falcone</surname> <given-names>H.</given-names></name> <name><surname>Zanderigo</surname> <given-names>F.</given-names></name> <name><surname>Thapa-Chhetry</surname> <given-names>B.</given-names></name> <name><surname>Lan</surname> <given-names>M.</given-names></name> <name><surname>Miller</surname> <given-names>J. M.</given-names></name> <name><surname>Sublette</surname> <given-names>M. E.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Pattern recognition of magnetic resonance imaging-based gray matter volume measurements classifies bipolar disorder and major depressive disorder</article-title>. <source>J. Affect. Disord</source>. <volume>227</volume>, <fpage>498</fpage>&#x02013;<lpage>505</lpage>. <pub-id pub-id-type="doi">10.1016/j.jad.2017.11.043</pub-id><pub-id pub-id-type="pmid">29156364</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sarraf</surname> <given-names>S.</given-names></name> <name><surname>Tofighi</surname> <given-names>G.</given-names></name></person-group> (<year>2016</year>). <article-title>DeepAD: Alzheimer&#x00027;s disease classification <italic>via</italic> deep convolutional neural networks using MRI and fMRI</article-title>. <source>bioRxiv</source> <volume>2016</volume>:<fpage>070441</fpage>. <pub-id pub-id-type="doi">10.1101/070441</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sato</surname> <given-names>J. R.</given-names></name> <name><surname>Moll</surname> <given-names>J.</given-names></name> <name><surname>Green</surname> <given-names>S.</given-names></name> <name><surname>Deakin</surname> <given-names>J. F.</given-names></name> <name><surname>Thomaz</surname> <given-names>C. E.</given-names></name> <name><surname>Zahn</surname> <given-names>R.</given-names></name></person-group> (<year>2015</year>). <article-title>Machine learning algorithm accurately detects fMRI signature of vulnerability to major depression</article-title>. <source>Psychiatry Res</source>. <volume>233</volume>, <fpage>289</fpage>&#x02013;<lpage>291</lpage>. <pub-id pub-id-type="doi">10.1016/j.pscychresns.2015.07.001</pub-id><pub-id pub-id-type="pmid">26187550</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scheltens</surname> <given-names>P.</given-names></name> <name><surname>Leys</surname> <given-names>D.</given-names></name> <name><surname>Barkhof</surname> <given-names>F.</given-names></name> <name><surname>Huglo</surname> <given-names>D.</given-names></name> <name><surname>Weinstein</surname> <given-names>H.</given-names></name> <name><surname>Vermersch</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>1992</year>). <article-title>Atrophy of medial temporal lobes on MRI in &#x0201C;probable&#x0201D; Alzheimer&#x00027;s disease and normal ageing: diagnostic value and neuropsychological correlates</article-title>. <source>J. Neurol. Neurosurg. Psychiatry</source> <volume>55</volume>, <fpage>967</fpage>&#x02013;<lpage>972</lpage>. <pub-id pub-id-type="doi">10.1136/jnnp.55.10.967</pub-id><pub-id pub-id-type="pmid">22566596</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname> <given-names>D.</given-names></name> <name><surname>Wu</surname> <given-names>G.</given-names></name> <name><surname>Suk</surname> <given-names>H.-I.</given-names></name></person-group> (<year>2017</year>). <article-title>Deep learning in medical image analysis</article-title>. <source>Annu. Rev. Biomed. Eng</source>. <volume>19</volume>, <fpage>221</fpage>&#x02013;<lpage>248</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-bioeng-071516-044442</pub-id><pub-id pub-id-type="pmid">28301734</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shi</surname> <given-names>J.</given-names></name> <name><surname>Xue</surname> <given-names>Z.</given-names></name> <name><surname>Dai</surname> <given-names>Y.</given-names></name> <name><surname>Peng</surname> <given-names>B.</given-names></name> <name><surname>Dong</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Cascaded multi-column RVFL&#x0002B; classifier for single-modal neuroimaging-based diagnosis of Parkinson&#x00027;s disease</article-title>. <source>IEEE Trans. Biomed. Eng</source>. <volume>66</volume>, <fpage>2362</fpage>&#x02013;<lpage>2371</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2018.2889398</pub-id><pub-id pub-id-type="pmid">30582522</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Simonyan</surname> <given-names>K.</given-names></name> <name><surname>Zisserman</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <source>Very deep convolutional networks for large-scale image recognition. <italic>arXiv preprint arXiv:1409.1556</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1409.1556">https://arxiv.org/abs/1409.1556</ext-link></citation>
</ref>
<ref id="B54">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Srivastava</surname> <given-names>N.</given-names></name> <name><surname>Hinton</surname> <given-names>G.</given-names></name> <name><surname>Krizhevsky</surname> <given-names>A.</given-names></name> <name><surname>Sutskever</surname> <given-names>I.</given-names></name> <name><surname>Salakhutdinov</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>Dropout: a simple way to prevent neural networks from overfitting</article-title>. <source>J. Mach. Learn. Res</source>. <volume>15</volume>, <fpage>1929</fpage>&#x02013;<lpage>1958</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf">https://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf</ext-link></citation>
</ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sui</surname> <given-names>J.</given-names></name> <name><surname>He</surname> <given-names>H.</given-names></name> <name><surname>Pearlson</surname> <given-names>G. D.</given-names></name> <name><surname>Adali</surname> <given-names>T.</given-names></name> <name><surname>Kiehl</surname> <given-names>K. A.</given-names></name> <name><surname>Yu</surname> <given-names>Q.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Three-way (N-way) fusion of brain imaging data based on mCCA&#x0002B; jICA and its application to discriminating schizophrenia</article-title>. <source>Neuroimage</source> <volume>66</volume>, <fpage>119</fpage>&#x02013;<lpage>132</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2012.10.051</pub-id><pub-id pub-id-type="pmid">23108278</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>L.</given-names></name> <name><surname>Xue</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Qiao</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Estimating sparse functional connectivity networks via hyperparameter-free learning model</article-title>. <source>Artif. Intell. Med</source>. <volume>111</volume>:<fpage>102004</fpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2020.102004</pub-id><pub-id pub-id-type="pmid">33461688</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Tzeng</surname> <given-names>E.</given-names></name> <name><surname>Hoffman</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>N.</given-names></name> <name><surname>Saenko</surname> <given-names>K.</given-names></name> <name><surname>Darrell</surname> <given-names>T.</given-names></name></person-group> (<year>2014</year>). <source>Deep domain confusion: maximizing for domain invariance. <italic>arXiv preprint arXiv:1412.3474</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1412.3474">https://arxiv.org/abs/1412.3474</ext-link></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van Den Heuvel</surname> <given-names>M. P.</given-names></name> <name><surname>Pol</surname> <given-names>H. E. H.</given-names></name></person-group> (<year>2010</year>). <article-title>Exploring the brain network: a review on resting-state fMRI functional connectivity</article-title>. <source>Eur. Neuropsychopharmacol</source>. <volume>20</volume>, <fpage>519</fpage>&#x02013;<lpage>534</lpage>. <pub-id pub-id-type="doi">10.1016/j.euroneuro.2010.03.008</pub-id><pub-id pub-id-type="pmid">20471808</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Van der Maaten</surname> <given-names>L.</given-names></name> <name><surname>Hinton</surname> <given-names>G.</given-names></name></person-group> (<year>2008</year>). <article-title>Visualizing data using t-SNE</article-title>. <source>J. Mach. Learn. Res</source>. <volume>9</volume>, <fpage>2579</fpage>&#x02013;<lpage>2605</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://jmlr.org/papers/v9/vandermaaten08a.html">http://jmlr.org/papers/v9/vandermaaten08a.html</ext-link></citation>
</ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Lian</surname> <given-names>C.</given-names></name> <name><surname>Yao</surname> <given-names>D.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>Spatial-temporal dependency modeling and network hub detection for functional MRI analysis via convolutional-recurrent network</article-title>. <source>IEEE Trans. Biomed. Eng</source>. <volume>67</volume>, <fpage>2241</fpage>&#x02013;<lpage>2252</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2019.2957921</pub-id><pub-id pub-id-type="pmid">31825859</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wee</surname> <given-names>C.-Y.</given-names></name> <name><surname>Yap</surname> <given-names>P.-T.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Denny</surname> <given-names>K.</given-names></name> <name><surname>Browndyke</surname> <given-names>J. N.</given-names></name> <name><surname>Potter</surname> <given-names>G. G.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Identification of MCI individuals using structural and functional connectivity networks</article-title>. <source>Neuroimage</source> <volume>59</volume>, <fpage>2045</fpage>&#x02013;<lpage>2056</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2011.10.015</pub-id><pub-id pub-id-type="pmid">22019883</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wold</surname> <given-names>S.</given-names></name> <name><surname>Esbensen</surname> <given-names>K.</given-names></name> <name><surname>Geladi</surname> <given-names>P.</given-names></name></person-group> (<year>1987</year>). <article-title>Principal component analysis</article-title>. <source>Chemometr. Intell. Lab. Syst</source>. <volume>2</volume>, <fpage>37</fpage>&#x02013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.1016/0169-7439(87)80084-9</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="book"><person-group person-group-type="author"><collab>World Health Organization</collab></person-group> (<year>2017</year>). <source>Depression and Other Common Mental Disorders: Global Health Estimates</source>. Technical report, World Health Organization.</citation>
</ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Zuo</surname> <given-names>X.</given-names></name> <name><surname>Zang</surname> <given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>DPABI: data processing &#x00026; analysis for (resting-state) brain imaging</article-title>. <source>Neuroinformatics</source> <volume>14</volume>, <fpage>339</fpage>&#x02013;<lpage>351</lpage>. <pub-id pub-id-type="doi">10.1007/s12021-016-9299-4</pub-id><pub-id pub-id-type="pmid">27075850</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>C.</given-names></name> <name><surname>Zang</surname> <given-names>Y.</given-names></name></person-group> (<year>2010</year>). <article-title>DPARSF: a MATLAB toolbox for &#x0201C;pipeline&#x0201D; data analysis of resting-state fMRI</article-title>. <source>Front. Syst. Neurosci</source>. <volume>4</volume>:<fpage>13</fpage>. <pub-id pub-id-type="doi">10.3389/fnsys.2010.00013</pub-id><pub-id pub-id-type="pmid">20577591</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>C.-G.</given-names></name> <name><surname>Chen</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>L.</given-names></name> <name><surname>Castellanos</surname> <given-names>F. X.</given-names></name> <name><surname>Bai</surname> <given-names>T.-J.</given-names></name> <name><surname>Bo</surname> <given-names>Q.-J.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Reduced default mode network functional connectivity in recurrent patients with major depressive disorder: evidence from 25 cohorts</article-title>. <source>bioRxiv</source> <volume>2019</volume>:<fpage>321745</fpage>. <pub-id pub-id-type="doi">10.1073/pnas.1900390116</pub-id><pub-id pub-id-type="pmid">30979801</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>R.</given-names></name> <name><surname>Yao</surname> <given-names>S.</given-names></name> <name><surname>Liu</surname> <given-names>S.</given-names></name> <name><surname>Abdelzaher</surname> <given-names>T.</given-names></name></person-group> (<year>2020</year>). <source>Revisiting over-smoothing in deep GCNs. <italic>arXiv preprint arXiv:2003.13663</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2003.13663">https://arxiv.org/abs/2003.13663</ext-link></citation>
</ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>D.</given-names></name> <name><surname>Sui</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Yang</surname> <given-names>E.</given-names></name> <name><surname>Jiaerken</surname> <given-names>Y.</given-names></name> <name><surname>Luo</surname> <given-names>N.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>A mutual multi-scale triplet graph convolutional network for classification of brain disorders using functional or structural connectivity</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>40</volume>, <fpage>1279</fpage>&#x02013;<lpage>1289</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2021.3051604</pub-id><pub-id pub-id-type="pmid">33444133</pub-id></citation></ref>
<ref id="B69">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>L.</given-names></name> <name><surname>Mao</surname> <given-names>C.</given-names></name> <name><surname>Luo</surname> <given-names>Y.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Graph convolutional networks for text classification,&#x0201D;</article-title> in <source>Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33</source> (<publisher-loc>Honolulu, HI</publisher-loc>: <publisher-name>AAAI</publisher-name>), <fpage>7370</fpage>&#x02013;<lpage>7377</lpage>. <pub-id pub-id-type="doi">10.1609/aaai.v33i01.33017370</pub-id></citation>
</ref>
<ref id="B70">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yue-Hei Ng</surname> <given-names>J.</given-names></name> <name><surname>Hausknecht</surname> <given-names>M.</given-names></name> <name><surname>Vijayanarasimhan</surname> <given-names>S.</given-names></name> <name><surname>Vinyals</surname> <given-names>O.</given-names></name> <name><surname>Monga</surname> <given-names>R.</given-names></name> <name><surname>Toderici</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Beyond short snippets: deep networks for video classification,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Boston, MA</publisher-loc>), <fpage>4694</fpage>&#x02013;<lpage>4702</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2015.7299101</pub-id></citation>
</ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>A survey on deep learning for neuroimaging-based brain disorder analysis</article-title>. <source>Front. Neurosci</source>. <volume>14</volume>:<fpage>779</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2020.00779</pub-id><pub-id pub-id-type="pmid">33117114</pub-id></citation></ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Qiao</surname> <given-names>L.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Modularity-guided functional brain network analysis for early-stage dementia identification</article-title>. <source>Front. Neurosci</source>. <volume>15</volume>:<fpage>720909</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2021.720909</pub-id><pub-id pub-id-type="pmid">34421530</pub-id></citation></ref>
<ref id="B73">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Adeli</surname> <given-names>E.</given-names></name> <name><surname>Chen</surname> <given-names>X.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>Multiview feature learning with multiatlas-based functional connectivity networks for MCI diagnosis</article-title>. <source>IEEE Trans. Cybern</source>. <pub-id pub-id-type="doi">10.1109/TCYB.2020.3016953</pub-id><pub-id pub-id-type="pmid">33306476</pub-id></citation></ref>
<ref id="B74">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>X.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhu</surname> <given-names>X.</given-names></name> <name><surname>Lee</surname> <given-names>S.-W.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Strength and similarity guided group-level brain functional network construction for MCI diagnosis</article-title>. <source>Pattern Recogn</source>. <volume>88</volume>, <fpage>421</fpage>&#x02013;<lpage>430</lpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2018.12.001</pub-id><pub-id pub-id-type="pmid">31579344</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><ext-link ext-link-type="uri" xlink:href="http://rfmri.org/REST-meta-MDD">http://rfmri.org/REST-meta-MDD</ext-link></p></fn>
</fn-group>
</back>
</article> 