<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Aging Neurosci.</journal-id>
<journal-title>Frontiers in Aging Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Aging Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1663-4365</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnagi.2022.871706</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Aging Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Alzheimer&#x00027;s Disease Diagnosis With Brain Structural MRI Using Multiview-Slice Attention and 3D Convolution Neural Network</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Chen</surname> <given-names>Lin</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1250579/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Qiao</surname> <given-names>Hezhe</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1670693/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhu</surname> <given-names>Fan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1759226/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Chongqing Key Laboratory of Big Data and Intelligent Computing, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences</institution>, <addr-line>Chongqing</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>University of Chinese Academy of Sciences</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Peng Xu, University of Electronic Science and Technology of China, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Xianmin Wang, Guangzhou University, China; Dianlong You, Yanshan University, China</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Lin Chen <email>chenlin&#x00040;cigit.ac.cn</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Alzheimer&#x00027;s Disease and Related Dementias, a section of the journal Frontiers in Aging Neuroscience</p></fn></author-notes>
<pub-date pub-type="epub">
<day>26</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>14</volume>
<elocation-id>871706</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>02</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>17</day>
<month>03</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Chen, Qiao and Zhu.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Chen, Qiao and Zhu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>Numerous artificial intelligence (AI) based approaches have been proposed for automatic Alzheimer&#x00027;s disease (AD) prediction with brain structural magnetic resonance imaging (sMRI). Previous studies extract features from the whole brain or from individual slices separately, ignoring the properties of multi-view slices and feature complementarity. For this reason, we present a novel AD diagnosis model based on multiview-slice attention and a 3D convolution neural network (3D-CNN). Specifically, we first extract local slice-level characteristics in various dimensions using multiple sub-networks. We then propose a slice-level attention mechanism that emphasizes specific 2D slices and excludes redundant features. After that, a 3D-CNN is employed to capture global subject-level structural changes. Finally, all 2D and 3D features are fused to obtain more discriminative representations. We conduct experiments on 1,451 subjects from the ADNI-1 and ADNI-2 datasets. Experimental results show the superiority of our model over state-of-the-art approaches for dementia classification. Specifically, our model achieves accuracies of 91.1% and 80.1% on ADNI-1 for AD diagnosis and mild cognitive impairment (MCI) conversion prediction, respectively.</p></abstract>
<kwd-group>
<kwd>Alzheimer&#x00027;s disease (AD)</kwd>
<kwd>disease prognosis</kwd>
<kwd>multi-view-slice attention</kwd>
<kwd>3D convolution neural network</kwd>
<kwd>brain sMRI image</kwd>
</kwd-group>
<counts>
<fig-count count="8"/>
<table-count count="6"/>
<equation-count count="5"/>
<ref-count count="42"/>
<page-count count="13"/>
<word-count count="7205"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Alzheimer&#x00027;s disease (AD) is the most common cause of dementia and leads to progressive and permanent memory loss and brain damage. It is critical to initiate treatment at an early stage to slow the development of AD. As a non-contact diagnostic method, structural magnetic resonance imaging (sMRI) is regarded as a typical imaging biomarker for quantifying the stage of neurodegeneration (Kincses et al., <xref ref-type="bibr" rid="B16">2015</xref>; Bayram et al., <xref ref-type="bibr" rid="B3">2018</xref>; Shi et al., <xref ref-type="bibr" rid="B31">2018</xref>). Based on the examination of brain sMRI images, numerous artificial intelligence (AI) technologies, including conventional voxel-based machine learning methods and deep-learning-based approaches, have been developed to assist cognitive diagnosis (Mart&#x000ED;-Juan et al., <xref ref-type="bibr" rid="B26">2020</xref>; Tanveer et al., <xref ref-type="bibr" rid="B32">2020</xref>; Wu et al., <xref ref-type="bibr" rid="B37">2021a</xref>,<xref ref-type="bibr" rid="B38">b</xref>).</p>
<p>In early attempts, traditional statistical methods based on voxel-based morphometry (VBM) were introduced to measure the brain&#x00027;s morphologic changes. VBM-based studies determine the intrinsic characteristics of specific biomarkers, such as hippocampal volume (Fuse et al., <xref ref-type="bibr" rid="B11">2018</xref>), cortical thickness (Luk et al., <xref ref-type="bibr" rid="B25">2018</xref>), subcortical volumes (Vu et al., <xref ref-type="bibr" rid="B33">2018</xref>), and frequency features with non-subsampled contourlets (Feng et al., <xref ref-type="bibr" rid="B10">2021</xref>), to calculate the regional anatomical volume of the brain. However, most VBM-based approaches rely on domain knowledge and expert experience and need a complex handcrafted feature extraction procedure that is independent of the subsequent classifiers, which can degrade diagnostic performance.</p>
<p>With the advancement of deep learning, especially the successful applications of convolution neural networks (CNNs), a growing body of research in recent years has employed deep learning to analyze MR images by training an end-to-end model without handcrafted features (Zhang et al., <xref ref-type="bibr" rid="B41">2020</xref>; AbdulAzeem et al., <xref ref-type="bibr" rid="B1">2021</xref>; Qiao et al., <xref ref-type="bibr" rid="B28">2021</xref>). Given the 3D volumetric nature of sMRI, a 3D-CNN can be directly applied to capture the structural changes of the whole brain at the subject level (Jin et al., <xref ref-type="bibr" rid="B13">2019</xref>). However, a complete MRI with millions of voxels contains much useless information. Furthermore, it is hard to fully train CNNs with only a few labeled MRI data available at the subject level. Many deep-learning-based methods therefore turn to extracting pre-determined regions-of-interest (ROI) and train the models with 3D patches or 2D slices (Ebrahimighahnavieh et al., <xref ref-type="bibr" rid="B9">2020</xref>). Liu et al. (<xref ref-type="bibr" rid="B24">2020b</xref>) extract multi-scale image patches based on pre-determined anatomical landmarks from sMRI for training an end-to-end CNN. Lian et al. (<xref ref-type="bibr" rid="B18">2020a</xref>,<xref ref-type="bibr" rid="B20">b</xref>) trained multiple classifiers with multilevel discriminative sMRI features from the whole sMRI with a hybrid network to capture local-to-global structural information. Compared with modeling at the subject level, patches or slices carry more local features but lose some global information. In addition, some studies try to exclude irrelevant regions by emphasizing specific brain tissues with the help of segmentation technology. 
Cui and Liu (<xref ref-type="bibr" rid="B7">2019</xref>) and Poloni and Ferrari (<xref ref-type="bibr" rid="B27">2022</xref>) focus on specific biomarkers from specific regions, such as the hippocampus, to capture the structural changes in 3D MR images for AD and mild cognitive impairment (MCI) classification. Chen and Xia (<xref ref-type="bibr" rid="B5">2021</xref>) design a sparse regression module to identify the critical cortical regions, such as the amygdala and posterior temporal lobe, and propose a deep feature extraction module to integrate the features of the landmarked regions for the diagnosis process. However, such methods need extra tissue segmentation operations, which inevitably increase the complexity of the diagnostic model.</p>
<p>Although existing models have achieved outstanding results so far, AD diagnosis remains challenging because of the large number of voxels in 3D MR images and the subtle differences between abnormal and normal brains, i.e., it is vital to extract subtle changes in disease progression from high-dimensional MRI sequence data. Previous studies focus on extracting features from the whole brain or from individual slices separately, ignoring the feature complementarity of different views. As illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>, each slice of the brain sMRI in different views contains a certain amount of local information that could also be valuable for dementia diagnosis. Considering that both the global structural changes of the whole brain and the fine-grained local distinctions of slices could be crucial, this study proposes a novel fusion model for AD classification, named multiview-slice attention and 3D convolution neural network (MSA3D), which integrates multiple slice features and 3D structural information.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The slice-level information in brain sMRI. <bold>(A)</bold> Slice-level features in the axial plane. <bold>(B)</bold> Slice-level features captured in multiple views, including the sagittal, coronal, and axial planes.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0001.tif"/>
</fig>
<p>The main contributions of this study are three-fold:
<list list-type="order">
<list-item><p>We propose an MSA3D model that combines 2D multi-view slice-level features and global 3D subject-level features to fully mine the subtle changes in different views and dimensions.</p></list-item>
<list-item><p>We propose a slice-level attention module to help the CNN focus on specific slices and obtain more discriminative feature representations from abundant voxels.</p></list-item>
<list-item><p>We perform two classification tasks, i.e., AD diagnosis and MCI conversion prediction, on two ADNI datasets. Our model achieves superior diagnostic results compared with other tested models, demonstrating our model&#x00027;s efficacy in aiding dementia prediction.</p></list-item>
</list></p></sec>
<sec id="s2">
<title>2. Materials and Data Preprocessing</title>
<sec>
<title>2.1. Studied Subjects</title>
<p>Following previous studies (Liu et al., <xref ref-type="bibr" rid="B23">2019</xref>; Lian et al., <xref ref-type="bibr" rid="B20">2020b</xref>), we employed two public sMRI data sets, i.e., ADNI-1 and ADNI-2, for the empirical study. Both can be found on the Alzheimer&#x00027;s Disease Neuroimaging Initiative (ADNI) website (Jack et al., <xref ref-type="bibr" rid="B12">2008</xref>). This study employed the ADNI data only for model validation and did not involve any patient interaction or data acquisition. More detailed data acquisition protocols are available at <ext-link ext-link-type="uri" xlink:href="http://adni.loni.usc.edu/">http://adni.loni.usc.edu/</ext-link>. We collected a total of 1,451 subjects from the ADNI database with baseline T1-weighted (T1W) brain MRI scans, which are divided into four categories:
<list list-type="bullet">
<list-item><p>Cognitively Normal (CN): Subjects diagnosed as CN at baseline who showed no cognitive decline.</p></list-item>
<list-item><p>Stable MCI (sMCI): Subjects diagnosed with MCI who remained stable and did not convert to AD at any time-point (0&#x02013;90 months).</p></list-item>
<list-item><p>Progressive MCI (pMCI): Subjects diagnosed with MCI who gradually progressed to AD within 0&#x02013;36 months.</p></list-item>
<list-item><p>Alzheimer&#x00027;s disease (AD): Subjects diagnosed with AD at baseline whose condition did not change during the follow-up period.</p></list-item>
</list></p>
<p>To avoid the data leakage problems mentioned in Wen et al. (<xref ref-type="bibr" rid="B35">2020</xref>), we also removed the subjects that exist in both ADNI-1 and ADNI-2. More specifically, the ADNI-1 dataset consists of 808 subjects with 1.5T T1W sMRI brain images, including 183 AD, 229 CN, 167 pMCI, and 229 sMCI. The ADNI-2 dataset has 643 3T T1W sMRI brain images, including 143 AD, 184 CN, 75 pMCI, and 241 sMCI. <xref ref-type="table" rid="T1">Table 1</xref> summarizes the detailed clinical information of the studied subjects, including age, sex, and the scores of the mini-mental state examination (MMSE). In our experiments, these two independent datasets are employed in turn as the training and testing datasets to perform cross-validation. More specifically, we first trained the model on ADNI-1 and evaluated it on ADNI-2. We then reversed the roles, training the model on ADNI-2 and assessing it on ADNI-1.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Detailed clinical information of the studied subjects in ADNI-1 and ADNI-2 (&#x000B1; denotes the SD).</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Dataset</bold></th>
<th valign="top" align="left"><bold>Label</bold></th>
<th valign="top" align="center"><bold>Total number</bold></th>
<th valign="top" align="center"><bold>Age (Years)</bold></th>
<th valign="top" align="center"><bold>Sex (M/F)</bold></th>
<th valign="top" align="center"><bold>MMSE</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ADNI-1</td>
<td valign="top" align="left">CN</td>
<td valign="top" align="center">229</td>
<td valign="top" align="center">76.2 &#x000B1; 5.1</td>
<td valign="top" align="center">119/110</td>
<td valign="top" align="center">29.2 &#x000B1; 1.0</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">sMCI</td>
<td valign="top" align="center">229</td>
<td valign="top" align="center">74.8 &#x000B1; 7.6</td>
<td valign="top" align="center">153/76</td>
<td valign="top" align="center">27.2 &#x000B1; 1.7</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">pMCI</td>
<td valign="top" align="center">167</td>
<td valign="top" align="center">74.9 &#x000B1; 7.2</td>
<td valign="top" align="center">102/65</td>
<td valign="top" align="center">26.9 &#x000B1; 1.7</td>
</tr>
<tr style="border-bottom: thin solid #000000;">
<td/>
<td valign="top" align="left">AD</td>
<td valign="top" align="center">183</td>
<td valign="top" align="center">75.6 &#x000B1; 7.6</td>
<td valign="top" align="center">96/87</td>
<td valign="top" align="center">23.1 &#x000B1; 2.5</td>
</tr> <tr>
<td valign="top" align="left">ADNI-2</td>
<td valign="top" align="left">CN</td>
<td valign="top" align="center">184</td>
<td valign="top" align="center">77.3 &#x000B1; 6.7</td>
<td valign="top" align="center">87/97</td>
<td valign="top" align="center">28.8 &#x000B1; 1.7</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">sMCI</td>
<td valign="top" align="center">241</td>
<td valign="top" align="center">71.3 &#x000B1; 7.5</td>
<td valign="top" align="center">134/107</td>
<td valign="top" align="center">28.3 &#x000B1; 1.5</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">pMCI</td>
<td valign="top" align="center">75</td>
<td valign="top" align="center">71.9 &#x000B1; 7.2</td>
<td valign="top" align="center">40/35</td>
<td valign="top" align="center">27.0 &#x000B1; 1.6</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">AD</td>
<td valign="top" align="center">143</td>
<td valign="top" align="center">75.6 &#x000B1; 7.8</td>
<td valign="top" align="center">85/58</td>
<td valign="top" align="center">21.9 &#x000B1; 3.8</td>
</tr>
</tbody>
</table>
</table-wrap></sec>
<sec>
<title>2.2. Data Preprocessing</title>
<p>The standard preprocessing pipeline was applied to all the T1W brain MRIs as follows. First, all MRIs were reoriented to an axial orientation parallel to the line through the anterior commissure (AC) and posterior commissure (PC), i.e., AC-PC correction. Then the invalid volumes of the sMRI, i.e., the blank regions, were removed, leaving only the brain tissues. Subsequently, the intensity of the brain images was corrected and normalized with the N3 algorithm after skull stripping (Wang et al., <xref ref-type="bibr" rid="B34">2011</xref>). Finally, all the aligned images were resized to the same spatial resolution to facilitate CNN training. The model&#x00027;s inputs are fixed to 91 &#x000D7; 101 &#x000D7; 91 voxels (i.e., a voxel size of 2 <italic>mm</italic> &#x000D7; 2 <italic>mm</italic> &#x000D7; 2 <italic>mm</italic>) in our experiments, following the previous study (Jin et al., <xref ref-type="bibr" rid="B14">2020</xref>).</p></sec></sec>
<sec sec-type="methods" id="s3">
<title>3. Methodology</title>
<p>The overall architecture of our model, presented in <xref ref-type="fig" rid="F2">Figure 2</xref>, is composed of five main parts: the MRI sequence input, the multi-view-slice sub-network (MVSSN), the slices attention module (SAM), the subject-level 3D-CNN (S3D-CNN), and a softmax classifier with a fully connected layer. The following sections provide more details for each module.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Illustration of the proposed multiview-slice attention and 3D convolution neural network (MSA3D) model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0002.tif"/>
</fig>
<sec>
<title>3.1. Multi-View-Slice 2D Sub-Networks</title>
<p>In this subsection, we introduce the MVSSN module for extracting multiview 2D-slice-level features. As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, the inputs of MVSSN consist of the MR slices in three views, i.e., the sagittal, coronal, and axial imaging planes. Since discriminative features may exist in different slices, we employ a 2D-CNN to extract features from each slice. We denote by <italic>x</italic>, <italic>y</italic>, and <italic>z</italic> the sagittal, coronal, and axial MRI planes, respectively. In particular, <inline-formula><mml:math id="M1"><mml:msub><mml:mrow><mml:mstyle class="text"><mml:mtext mathvariant="bold">S</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mo>]</mml:mo></mml:math></inline-formula> denotes the slice cluster in the <italic>x</italic> plane, where <italic>M</italic><sub><italic>x</italic></sub> is the total number of slices in the cluster <bold>S</bold><sub><italic>x</italic></sub>. 
After applying the multiple 2D-CNNs to each slice to generate the feature maps of the different views separately, the input <italic>I</italic> &#x02208; <italic>R</italic><sup><italic>D</italic> &#x000D7; <italic>H</italic> &#x000D7; <italic>W</italic></sup> is transformed into the feature maps <italic>F</italic><sub><italic>x</italic></sub>, <italic>F</italic><sub><italic>y</italic></sub>, <italic>F</italic><sub><italic>z</italic></sub> in three dimensions. For example, each feature map <inline-formula><mml:math id="M2"><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> in the sagittal view is calculated by Equation (1):
<disp-formula id="E1"><label>(1)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
where <inline-formula><mml:math id="M4"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is an independent 2D-based CNN, <inline-formula><mml:math id="M5"><mml:msubsup><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the weight of CNN <inline-formula><mml:math id="M6"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and <italic>i</italic> &#x02208; [1, <italic>M</italic><sub><italic>x</italic></sub>] indexes the <italic>i</italic>th slice in the <italic>x</italic>-direction. Each <inline-formula><mml:math id="M7"><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> contains three CNN blocks, each with a convolutional layer, a batch normalization (BN) layer, a rectified linear unit (ReLU) operator, and a max-pooling layer. Detailed parameters of our 2D-based CNN are listed in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
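<p>The slice bookkeeping described above can be made concrete with a small sketch. It assumes, purely for illustration, that the three axes of the 91 &#x000D7; 101 &#x000D7; 91 input are ordered as sagittal (<italic>x</italic>), coronal (<italic>y</italic>), and axial (<italic>z</italic>); the helper names <monospace>slice_layout</monospace> and <monospace>embedding_shapes</monospace> are hypothetical and are not part of the authors&#x00027; implementation. Under that assumption it reports the slice count <italic>M</italic><sub><italic>k</italic></sub> and the 2D slice size for each view, together with the shape of the per-view embedding when each slice is mapped to an 8-channel vector.</p>
<preformat><![CDATA[
```python
# Illustrative sketch only. The axis order (sagittal, coronal, axial) is an
# assumption made for this example, not something stated by the source.
from typing import Dict, Tuple

VIEWS = ("sagittal", "coronal", "axial")  # the x, y, z planes of the text


def slice_layout(volume_shape: Tuple[int, int, int]) -> Dict[str, tuple]:
    """Return, for each view, (slice count M_k, 2D shape of one slice)."""
    layout = {}
    for axis, view in enumerate(VIEWS):
        # Slicing along `axis` yields volume_shape[axis] slices; each slice
        # keeps the two remaining axes as its height and width.
        slice_shape = tuple(n for i, n in enumerate(volume_shape) if i != axis)
        layout[view] = (volume_shape[axis], slice_shape)
    return layout


def embedding_shapes(volume_shape: Tuple[int, int, int], channels: int = 8) -> Dict[str, tuple]:
    """Shape (M_k, C) of each per-view embedding after the slice sub-networks."""
    return {
        view: (count, channels)
        for view, (count, _) in slice_layout(volume_shape).items()
    }


layout = slice_layout((91, 101, 91))
# One independent 2D sub-network per slice gives this many sub-networks.
total_subnetworks = sum(count for count, _ in layout.values())
```
]]></preformat>
<p>With one independent sub-network per slice, this layout implies 91 &#x0002B; 101 &#x0002B; 91 sub-networks in total, which is one reason the per-slice networks are kept small.</p>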
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>2D-CNN slice sub-network for multi-view slice feature extraction.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0003.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Detailed parameters of our 2D-CNN slice sub-network.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Layer</bold></th>
<th valign="top" align="center"><bold>Kernel</bold></th>
<th valign="top" align="center"><bold>Stride</bold></th>
<th valign="top" align="left"><bold>Activation</bold></th>
<th valign="top" align="center"><bold>Output channels</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Conv2D</td>
<td valign="top" align="center">3 &#x000D7; 3</td>
<td valign="top" align="center">2</td>
<td valign="top" align="left">BatchNorm&#x0002B;ReLU</td>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling2D</td>
<td valign="top" align="center">2 &#x000D7; 2</td>
<td/>
<td/>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">Conv2D</td>
<td valign="top" align="center">3 &#x000D7; 3</td>
<td valign="top" align="center">2</td>
<td valign="top" align="left">BatchNorm&#x0002B;ReLU</td>
<td valign="top" align="center">32</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling2D</td>
<td valign="top" align="center">2 &#x000D7; 2</td>
<td/>
<td/>
<td valign="top" align="center">32</td>
</tr>
<tr>
<td valign="top" align="left">Conv2D</td>
<td valign="top" align="center">3 &#x000D7; 3</td>
<td valign="top" align="center">2</td>
<td valign="top" align="left">BatchNorm&#x0002B;ReLU</td>
<td valign="top" align="center">64</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling2D</td>
<td valign="top" align="center">2 &#x000D7; 2</td>
<td/>
<td/>
<td valign="top" align="center">64</td>
</tr>
<tr>
<td valign="top" align="left">Global-Avg-Pooling2D</td>
<td valign="top" align="center">1 &#x000D7; 1</td>
<td valign="top" align="center">1</td>
<td/>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">Fully connected</td>
<td/>
<td/>
<td/>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">Fully connected</td>
<td/>
<td/>
<td/>
<td valign="top" align="center">8</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>After the Global-Avg-Pooling (GAP) operation, the feature map <inline-formula><mml:math id="M8"><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is pooled into a vector denoted as <inline-formula><mml:math id="M9"><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. Finally, all the feature vectors in the <italic>x</italic> view are concatenated as <inline-formula><mml:math id="M10"><mml:msup><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>I</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. The same convolutional operation is applied to the <italic>y</italic> and <italic>z</italic> views to generate the corresponding feature clusters.</p></sec>
<sec>
<title>3.2. Slices Attention Module</title>
<p>After the multiple slice-level features are extracted by the MVSSN, each vector in <bold>I</bold><sup><italic>k</italic></sup> can be regarded as a class-specific response. However, volumetric MRI data contain many slices, and many of them may not carry the most representative information relevant to dementia (Lian et al., <xref ref-type="bibr" rid="B19">2021</xref>). To address this issue, we propose a SAM that helps the CNN focus on specific features by exploiting the interdependencies among slices.</p>
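<p>The slice-to-slice weighting performed by the SAM, formalized in Equations (2) and (3) of this subsection, can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors&#x00027; implementation: the function names are hypothetical, the toy input uses two channels instead of <italic>C</italic> = 8, and the scalar &#x003B2; of Equation (3) is simply passed in as a number, since how it is set or learned is outside the scope of this sketch. Subtracting the column maximum before exponentiation is a standard numerical-stability step that leaves the normalized scores unchanged.</p>
<preformat><![CDATA[
```python
import math
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two slice embeddings."""
    return sum(a * b for a, b in zip(u, v))


def slice_attention(features: List[Vector]) -> List[List[float]]:
    """a[i][j] = exp(I_i . I_j) / sum_i exp(I_i . I_j), as in Equation (2)."""
    m = len(features)
    scores = [[dot(features[i], features[j]) for j in range(m)] for i in range(m)]
    attention = [[0.0] * m for _ in range(m)]
    for j in range(m):
        column = [scores[i][j] for i in range(m)]
        peak = max(column)  # stability shift; cancels in the ratio
        exps = [math.exp(s - peak) for s in column]
        total = sum(exps)
        for i in range(m):
            attention[i][j] = exps[i] / total
    return attention


def reweight(features: List[Vector], beta: float) -> List[Vector]:
    """I~_j = beta * sum_i a[i][j] * I_i + I_j, as in Equation (3)."""
    attention = slice_attention(features)
    m, c = len(features), len(features[0])
    refined = []
    for j in range(m):
        mixed = [sum(attention[i][j] * features[i][d] for i in range(m)) for d in range(c)]
        refined.append([beta * mixed[d] + features[j][d] for d in range(c)])
    return refined


# Toy input: M_k = 3 slices with 2 channels each (the text uses C = 8).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = slice_attention(toy)
refined = reweight(toy, beta=0.5)
unchanged = reweight(toy, beta=0.0)
```
]]></preformat>
<p>Because the denominator of Equation (2) sums over <italic>i</italic>, the scores are normalized per target slice <italic>j</italic>, and the residual term in Equation (3) means that setting &#x003B2; to zero returns the input features unchanged.</p>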
<p>As shown in <xref ref-type="fig" rid="F2">Figure 2</xref>, let the set of feature embeddings of the <italic>k</italic><sup><italic>th</italic></sup> direction be denoted as <inline-formula><mml:math id="M11"><mml:msup><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>I</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>, where <italic>C</italic> = 8 is the number of feature channels of each slice and <italic>k</italic> &#x02208; {<italic>x, y, z</italic>} indicates the MR plane. By employing an attention mechanism, we can obtain the slice attention <inline-formula><mml:math id="M12"><mml:msup><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>A</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula>, which builds the dynamic correlations between the target diagnosis label and the slice-level features with the following equation:
<disp-formula id="E2"><label>(2)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo class="qopname">exp</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x000B7;</mml:mo><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mo class="qopname">exp</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x000B7;</mml:mo><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
where <inline-formula><mml:math id="M14"><mml:msubsup><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>A</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the score that represents the impact of the <italic>i</italic><sup><italic>th</italic></sup> slice feature on the <italic>j</italic><sup><italic>th</italic></sup> slice in the <italic>k</italic><sup><italic>th</italic></sup> direction. The weighted slice features <inline-formula><mml:math id="M15"><mml:msup><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>I</mml:mtext></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> are then calculated by:
<disp-formula id="E3"><label>(3)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>&#x003B2;</mml:mi><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
where &#x003B2; is a learnable parameter that gradually increases from 0. Note that each output feature map is the attention-weighted sum of the features of all slices in one direction, added to the original slice feature, so that the SAM can adaptively emphasize the most relevant slices to produce a better AD inference.</p>
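<p>Equations (2) and (3) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name <monospace>slice_attention</monospace>, the toy input sizes, and the fixed value of &#x003B2; are assumptions made for the example (in the model, &#x003B2; is learned).</p>
<preformat>
```python
import numpy as np


def slice_attention(I, beta=0.0):
    """Apply Equations (2)-(3) to slice features I of shape (M_k, C).

    Hypothetical helper for illustration; beta is a plain float here,
    whereas the paper treats it as a learnable parameter.
    """
    # Pairwise dot products: scores[i, j] = I_i . I_j
    scores = I @ I.T
    # Softmax over i (axis 0), so every column j sums to 1 as in Equation (2).
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A = A / A.sum(axis=0, keepdims=True)
    # Equation (3): row j of the output is beta * sum_i a_ij * I_i + I_j.
    return beta * (A.T @ I) + I, A


rng = np.random.default_rng(0)
I_x = rng.normal(size=(5, 8))  # toy example: 5 slices, C = 8 channels
out, A = slice_attention(I_x, beta=0.5)
print(out.shape, A.shape)  # (5, 8) (5, 5)
```
</preformat>
<p>With &#x003B2; = 0 the function returns its input unchanged, which matches the description of &#x003B2; starting at 0: the attention term is switched on gradually.</p>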
<p>After the SAM, we fuse the slice features of the three directions by concatenation to form the final slice-level features <inline-formula><mml:math id="M17"><mml:msub><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>F</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>s</mml:mtext></mml:mstyle></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>I</mml:mtext></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>I</mml:mtext></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mstyle class="text" mathvariant="bold"><mml:mtext>I</mml:mtext></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, where <bold>F</bold><sub><bold>s</bold></sub> represents the cascaded weighted features, which capture multiple views of local brain changes along the three directions of the 2D MRI slices.</p></sec>
<sec>
<title>3.3. Subject-Level 3D Neural Network</title>
<p>The brain MRI data can be regarded as 3D data with an input size of <italic>H</italic> &#x000D7; <italic>W</italic> &#x000D7; <italic>D</italic>, where <italic>H</italic> and <italic>W</italic> denote the height and width of the MRI, respectively, and <italic>D</italic> is the length of the image sequence. In order to explore the global structural changes of the brain, all of the convolution and pooling operations are extended from 2D to 3D. The 3D CNN operator is given in Equation (4):
<disp-formula id="E4"><label>(4)</label><mml:math id="M18"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:msubsup><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
where (<italic>x, y, z</italic>) refers to the 3D coordinates in the sMRI data and <inline-formula><mml:math id="M19"><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> is the <italic>i</italic>th feature map of layer <italic>l</italic> &#x02212; 1. <inline-formula><mml:math id="M20"><mml:msubsup><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> is a 3D convolution kernel that slides along all three dimensions; thus, the new <italic>j</italic>th feature map <inline-formula><mml:math id="M21"><mml:msubsup><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>y</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> of layer <italic>l</italic> is generated by 3D convolution across the <inline-formula><mml:math id="M22"><mml:msubsup><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> from layer <italic>l</italic> &#x02212; 1. 
Similar to the 2D-CNN, our 3D-CNN includes four network blocks, each consisting of a 3D convolution layer, a 3D BN layer, a ReLU activation, and a 3D max-pooling layer. Finally, the 3D convolutional feature maps are pooled into a 1D vector using a 3D-GAP layer with a kernel size of 1 &#x000D7; 1 &#x000D7; 1. The resulting vector represents the global subject-level features. Detailed parameters of our 3D-CNN subject-level subnetwork are shown in <xref ref-type="table" rid="T3">Table 3</xref>.</p>
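<p>Equation (4) can be read as a sliding-window sum. The NumPy sketch below evaluates it naively for a single output feature map. It assumes stride 1, no padding, and a sum over all input feature maps <italic>i</italic> (the usual convention, which Equation 4 does not write out explicitly); the function name <monospace>conv3d_single</monospace> is hypothetical.</p>
<preformat>
```python
import numpy as np


def conv3d_single(F_prev, W):
    """Naive evaluation of Equation (4) for one output feature map j.

    F_prev: input feature maps of layer l-1, shape (C_in, X, Y, Z)
    W:      kernels feeding output map j,    shape (C_in, kx, ky, kz)
    Assumes stride 1 and no padding (illustrative choices).
    """
    _, kx, ky, kz = W.shape
    _, X, Y, Z = F_prev.shape
    out = np.zeros((X - kx + 1, Y - ky + 1, Z - kz + 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            for z in range(out.shape[2]):
                window = F_prev[:, x:x + kx, y:y + ky, z:z + kz]
                # Sum over the offsets (delta_x, delta_y, delta_z)
                # and over the input feature maps i.
                out[x, y, z] = np.sum(window * W)
    return out


F_demo = np.ones((2, 4, 4, 4))   # two input maps of size 4 x 4 x 4
W_demo = np.ones((2, 3, 3, 3))   # one 3 x 3 x 3 kernel per input map
u = conv3d_single(F_demo, W_demo)
print(u.shape, u[0, 0, 0])       # (2, 2, 2) 54.0
```
</preformat>
<p>The explicit loops only mirror the summation in Equation (4); a framework convolution layer would normally be used instead.</p>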
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Detailed parameters of our 3D-CNN subject sub-network.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Layer name</bold></th>
<th valign="top" align="center"><bold>Kernel</bold></th>
<th valign="top" align="center"><bold>Stride</bold></th>
<th valign="top" align="left"><bold>Activation</bold></th>
<th valign="top" align="center"><bold>Output channels</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Conv3D</td>
<td valign="top" align="center">3 &#x000D7; 3 &#x000D7; 3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="left">BatchNorm3d&#x0002B;ReLU</td>
<td valign="top" align="center">32</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling3D</td>
<td valign="top" align="center">3 &#x000D7; 3 &#x000D7; 3</td>
<td valign="top" align="center">2</td>
<td/>
<td valign="top" align="center">32</td>
</tr>
<tr>
<td valign="top" align="left">Conv3D</td>
<td valign="top" align="center">3 &#x000D7; 3 &#x000D7; 3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="left">BatchNorm3d&#x0002B;ReLU</td>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling3D</td>
<td valign="top" align="center">3 &#x000D7; 3 &#x000D7; 3</td>
<td valign="top" align="center">2</td>
<td/>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">Conv3D</td>
<td valign="top" align="center">3 &#x000D7; 3 &#x000D7; 3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="left">BatchNorm3d&#x0002B;ReLU</td>
<td valign="top" align="center">256</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling3D</td>
<td valign="top" align="center">3 &#x000D7; 3 &#x000D7; 3</td>
<td valign="top" align="center">2</td>
<td/>
<td valign="top" align="center">256</td>
</tr>
<tr>
<td valign="top" align="left">Conv3D</td>
<td valign="top" align="center">2 &#x000D7; 2 &#x000D7; 2</td>
<td valign="top" align="center">2</td>
<td valign="top" align="left">BatchNorm3d&#x0002B;ReLU</td>
<td valign="top" align="center">512</td>
</tr>
<tr>
<td valign="top" align="left">MaxPooling3D</td>
<td valign="top" align="center">5 &#x000D7; 5 &#x000D7; 5</td>
<td valign="top" align="center">2</td>
<td/>
<td valign="top" align="center">512</td>
</tr>
<tr>
<td valign="top" align="left">Global-Avg-Pooling3D</td>
<td valign="top" align="center">1 &#x000D7; 1 &#x000D7; 1</td>
<td/>
<td/>
<td valign="top" align="center">512</td>
</tr>
</tbody>
</table>
</table-wrap></sec>
<sec>
<title>3.4. Fully Connected Layer and Loss for Classification</title>
<p>To exploit both the slice-level and subject-level features generated by the 2D- and 3D-CNNs, a fully connected (FC) layer is employed to concatenate all the 2D and 3D feature maps, followed by a final FC layer and a softmax classifier, which outputs the prediction probability of the diagnostic labels. The cross-entropy (CE) loss, which is widely adopted as the training loss function for image classification (Liu et al., <xref ref-type="bibr" rid="B21">2021</xref>), is given as follows:
<disp-formula id="E5"><label>(5)</label><mml:math id="M23"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>c</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle class="text" mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow></mml:munder></mml:mstyle><mml:mtext class="textrm" mathvariant="normal">I</mml:mtext><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>c</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">P(</mml:mtext><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>c</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:mstyle class="text" mathvariant="bold"><mml:mtext>W</mml:mtext></mml:mstyle><mml:mo>)</mml:mo></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
where I{&#x000B7;} &#x0003D; 1 if {&#x000B7;} is true, otherwise I{&#x000B7;} &#x0003D; 0. Here <italic>C</italic> denotes the number of classes, <italic>N</italic> is the total number of training subjects, and <italic>X</italic><sub><italic>i</italic></sub> denotes the <italic>i</italic>th sample, with the corresponding label <italic>Y</italic><sub><italic>i</italic></sub>, in the training dataset <bold>X</bold>, with <italic>i</italic> &#x02208; [1, <italic>N</italic>]. <inline-formula><mml:math id="M24"><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">P(</mml:mtext></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>c</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:mstyle class="text" mathvariant="bold"><mml:mtext>W</mml:mtext></mml:mstyle><mml:mo>)</mml:mo></mml:math></inline-formula> is the probability that the input sample <italic>X</italic><sub><italic>i</italic></sub> is correctly classified as <inline-formula><mml:math id="M25"><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> by the network with weights <bold>W</bold>.</p></sec>
<sec>
<title>3.5. Complexity Analysis</title>
<p>We further analyze the complexity of the proposed model by considering its two subnetwork branches separately. For the global subject-level 3D-CNN model, the computational complexity of a 3D-CNN layer is <inline-formula><mml:math id="M26"><mml:mi>O</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, where <italic>K</italic><sub><italic>global</italic></sub> is the 3D-CNN kernel size and <italic>D</italic><sub><italic>x</italic></sub>, <italic>D</italic><sub><italic>y</italic></sub>, <italic>D</italic><sub><italic>z</italic></sub> are the feature map dimensions of the layer. 
For the slice-level 2D-CNN model, since the 2D feature maps are fused across the three directions, the time complexity of a 2D-CNN layer is <inline-formula><mml:math id="M27"><mml:mi>O</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>y</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>z</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, where <italic>M</italic><sub><italic>x</italic></sub>, 
<italic>M</italic><sub><italic>y</italic></sub>, <italic>M</italic><sub><italic>z</italic></sub> denote the total numbers of slices in the three MR planes, respectively, and <italic>K</italic><sub><italic>slice</italic></sub> is the 2D-CNN kernel size.</p></sec></sec>
<sec id="s4">
<title>4. Experimental Results</title>
<sec>
<title>4.1. Competing Methods</title>
<p>We first compare our proposed MSA3D method with several diagnosis approaches that we reproduced and evaluated on the same training and testing datasets, including (1) a conventional statistical method based on voxel-based morphometry (VBM) with an SVM [denoted as Voxel&#x0002B;SVM, proposed by Ashburner and Friston (<xref ref-type="bibr" rid="B2">2000</xref>)], (2) a method using 3D-CNN features [denoted as 3D-CNN, proposed by Wen et al. (<xref ref-type="bibr" rid="B35">2020</xref>)], (3) a method using multi-slice 2D features, i.e., the features extracted from all the slices in three directions (denoted as Multi-Slice), and (4) a method using 3D-CNN with 3D patch-level features (denoted as Multi-Patch).</p>
<list list-type="order">
<list-item><p>Voxel&#x0002B;SVM: As a conventional statistical model, Voxel&#x0002B;SVM performs sMRI analyses at the voxel level (Ashburner and Friston, <xref ref-type="bibr" rid="B2">2000</xref>). Using a non-linear image registration approach, we first normalized all MRIs to the automated anatomical labeling (AAL) template. Then, we segmented the gray matter (GM) from the sMRI data. Finally, we mapped the density of GM tissue into one vector and used a support vector machine (SVM) as the classifier for AD diagnosis.</p></list-item>
<list-item><p>3D convolution neural network: As an important part of MSA3D, 3D-CNN can extract global subject-level changes of sMRI for dementia diagnosis (Wen et al., <xref ref-type="bibr" rid="B35">2020</xref>). Thus, it can be regarded as the baseline model in our study. In this model, we only give the 3D MRI data as the input for training the 3D-CNN.</p></list-item>
<list-item><p>Multi-Slice: As another essential component of MSA3D, the multi-slice model focuses on the local slice-level features, which consist of all the features extracted by the 2D-CNN from the 2D slices in the sagittal, coronal, and axial MR planes.</p></list-item>
<list-item><p>Multi-Patch: In this method, multiple 3D patches are partitioned from the whole brain according to the landmarks defined in Zhang et al. (<xref ref-type="bibr" rid="B39">2016</xref>) and Liu et al. (<xref ref-type="bibr" rid="B24">2020b</xref>) to extract region-of-interest (ROI) features, and a 3D-CNN is then trained as the feature extractor for each patch. Finally, all the ROI-based features are cascaded to obtain the final embedded feature for the entire sMRI.</p></list-item>
</list></sec>
<sec>
<title>4.2. Experimental Setting</title>
<p>All the tested models are implemented in Python with PyTorch and run on one NVIDIA GTX 1080 Ti GPU (11 GB). During the training stage, the batch size is set to 12 for all models for a fair comparison. Stochastic gradient descent (SGD) with an initial learning rate of 0.01 and a weight decay of 0.02 is adopted as the optimization approach, along with an early stopping mechanism to avoid over-fitting. The following five criteria are calculated to evaluate the performance of the tested models: accuracy (ACC), specificity (SPE), sensitivity (SEN), the area under the ROC curve (AUC), and F1-score (F1).</p></sec>
<sec>
<title>4.3. Results on ADNI-2</title>
<p>We first present the comparison results of two classification tasks (i.e., AD vs. NC and pMCI vs. sMCI) on ADNI-2 in <xref ref-type="table" rid="T4">Table 4</xref> and <xref ref-type="fig" rid="F4">Figure 4</xref>, with the tested methods trained on ADNI-1. As can be seen from <xref ref-type="table" rid="T4">Table 4</xref>, Multi-Patch performs better than Multi-Slice on most metrics, especially on the challenging pMCI vs. sMCI task. The results indicate that local discriminative features are important for MCI prediction, and that 2D slice-level features alone may not be a good option for CNNs. In addition, 3D-CNN achieved the second-best results on both the AD and MCI prediction tasks. We can also find that the deep-learning-based models generally perform better than the conventional Voxel&#x0002B;SVM method. The main reason is that the deep-learning-based technique can achieve better feature extraction with an end-to-end framework. In general, our model consistently yields better performance than the tested methods. For example, according to <xref ref-type="table" rid="T4">Table 4</xref>, compared with the 3D-CNN baseline our model improves ACC and AUC by 3.9 and 1.7 percentage points for classifying AD/NC, and by 3.2 and 6.8 percentage points for determining pMCI/sMCI; compared with Multi-Slice, the improvements are 7.3 and 5.6 percentage points for AD/NC, and 7.3 and 16.9 percentage points for pMCI/sMCI. This result shows that, by fusing the 2D and 3D information through two CNN branches, our model can capture more discriminative changes in both multiview 2D slices and 3D whole-brain volumes during the progression of AD and MCI conversion. As a result, our model improves on all the metrics compared with the other methods.</p>
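<p>As a rough sketch, the margins between MSA3D and each competing method can be recomputed directly from the ACC and AUC columns of <xref ref-type="table" rid="T4">Table 4</xref>. The values below are copied from that table; the dictionary layout and the helper name <monospace>margin_points</monospace> are assumptions made for this example.</p>
<preformat>
```python
# (ACC, AUC) pairs copied from Table 4 (trained on ADNI-1, tested on ADNI-2).
table4 = {
    "AD vs. CN": {
        "Voxel+SVM": (0.759, 0.729),
        "3D-CNN": (0.872, 0.933),
        "Multi-Slice": (0.838, 0.894),
        "Multi-Patch": (0.841, 0.924),
        "MSA3D": (0.911, 0.950),
    },
    "pMCI vs. sMCI": {
        "Voxel+SVM": (0.736, 0.609),
        "3D-CNN": (0.769, 0.721),
        "Multi-Slice": (0.728, 0.620),
        "Multi-Patch": (0.722, 0.698),
        "MSA3D": (0.801, 0.789),
    },
}


def margin_points(task, other):
    """Return the (ACC, AUC) gain of MSA3D over `other`, in percentage points."""
    ours = table4[task]["MSA3D"]
    theirs = table4[task][other]
    return tuple(round(100 * (a - b), 1) for a, b in zip(ours, theirs))


for task, rows in table4.items():
    for method in rows:
        if method != "MSA3D":
            print(task, method, margin_points(task, method))
```
</preformat>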
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Classification results of AD vs. CN and MCI conversion on ADNI-2.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center" colspan="5" style="border-bottom: thin solid #000000;"><bold>AD vs. CN</bold></th>
<th valign="top" align="center" colspan="5" style="border-bottom: thin solid #000000;"><bold>pMCI vs. sMCI</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Voxel&#x0002B;SVM</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.677</td>
<td valign="top" align="center">0.810</td>
<td valign="top" align="center">0.729</td>
<td valign="top" align="center">0.705</td>
<td valign="top" align="center">0.736</td>
<td valign="top" align="center">0.107</td>
<td valign="top" align="center">0.769</td>
<td valign="top" align="center">0.609</td>
<td valign="top" align="center">0.162</td>
</tr>
<tr>
<td valign="top" align="left">3D-CNN</td>
<td valign="top" align="center">0.872</td>
<td valign="top" align="center">0.874</td>
<td valign="top" align="center">0.839</td>
<td valign="top" align="center">0.933</td>
<td valign="top" align="center">0.856</td>
<td valign="top" align="center">0.769</td>
<td valign="top" align="center">0.427</td>
<td valign="top" align="center">0.831</td>
<td valign="top" align="center">0.721</td>
<td valign="top" align="center">0.467</td>
</tr>
<tr>
<td valign="top" align="left">Multi-Slice</td>
<td valign="top" align="center">0.838</td>
<td valign="top" align="center">0.755</td>
<td valign="top" align="center">0.826</td>
<td valign="top" align="center">0.894</td>
<td valign="top" align="center">0.813</td>
<td valign="top" align="center">0.728</td>
<td valign="top" align="center">0.267</td>
<td valign="top" align="center">0.792</td>
<td valign="top" align="center">0.620</td>
<td valign="top" align="center">0.317</td>
</tr>
<tr>
<td valign="top" align="left">Multi-Patch</td>
<td valign="top" align="center">0.841</td>
<td valign="top" align="center">0.790</td>
<td valign="top" align="center">0.844</td>
<td valign="top" align="center">0.924</td>
<td valign="top" align="center">0.803</td>
<td valign="top" align="center">0.722</td>
<td valign="top" align="center">0.373</td>
<td valign="top" align="center">0.821</td>
<td valign="top" align="center">0.698</td>
<td valign="top" align="center">0.438</td>
</tr>
<tr>
<td valign="top" align="left">MSA3D</td>
<td valign="top" align="center"><bold>0.911</bold></td>
<td valign="top" align="center"><bold>0.888</bold></td>
<td valign="top" align="center"><bold>0.914</bold></td>
<td valign="top" align="center"><bold>0.950</bold></td>
<td valign="top" align="center"><bold>0.898</bold></td>
<td valign="top" align="center"><bold>0.801</bold></td>
<td valign="top" align="center"><bold>0.520</bold></td>
<td valign="top" align="center"><bold>0.856</bold></td>
<td valign="top" align="center"><bold>0.789</bold></td>
<td valign="top" align="center"><bold>0.553</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>All the models are trained on ADNI-1. The best results are highlighted in bold</italic>.</p>
</table-wrap-foot>
</table-wrap>
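<p>The ACC, SEN, SPE, and F1 columns follow the usual confusion-matrix definitions. The sketch below is a minimal illustration, assuming binary labels in which 1 marks the positive class (taken here to be AD or pMCI, which is an assumption) and toy predictions that are not from the study; AUC is omitted because it requires continuous scores rather than hard labels. The function name <monospace>binary_metrics</monospace> is hypothetical.</p>
<preformat>
```python
def binary_metrics(y_true, y_pred):
    """Return ACC, SEN, SPE and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Assumes at least one negative sample and at least one true positive,
    # so that none of the denominators below is zero.
    sen = tp / (tp + fn)          # sensitivity (recall of the positive class)
    spe = tn / (tn + fp)          # specificity (recall of the negative class)
    pre = tp / (tp + fp)          # precision, needed for F1
    return {
        "ACC": (tp + tn) / len(pairs),
        "SEN": sen,
        "SPE": spe,
        "F1": 2 * pre * sen / (pre + sen),
    }


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```
</preformat>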
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Comparison results in terms of ROC curves. The models are trained on ADNI-1 and tested on ADNI-2. <bold>(A)</bold> AD vs. NC. <bold>(B)</bold> pMCI vs. sMCI.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0004.tif"/>
</fig>
</sec>
<sec>
<title>4.4. Results on ADNI-1</title>
<p>In order to further investigate the effectiveness of the tested models, we also perform a cross-validation on the ADNI datasets, i.e., we trained the models on ADNI-2 and tested them on ADNI-1. Note that, because of the lack of sufficient pMCI samples in ADNI-2 (75 in ADNI-2 vs. 167 in ADNI-1), we only conduct the AD diagnosis experiments on ADNI-1. The comparison results are summarized in <xref ref-type="table" rid="T5">Table 5</xref> and <xref ref-type="fig" rid="F5">Figure 5</xref>, from which we can observe similar results to those of the models tested on ADNI-2. Our model still produces the best values in terms of all the metrics compared with the other methods.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Classification results of AD vs. CN on ADNI-1.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Methods</bold></th>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Voxel&#x0002B;SVM</td>
<td valign="top" align="center">0.754</td>
<td valign="top" align="center">0.728</td>
<td valign="top" align="center">0.781</td>
<td valign="top" align="center">0.774</td>
<td valign="top" align="center">0.741</td>
</tr>
<tr>
<td valign="top" align="left">3D-CNN</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">0.738</td>
<td valign="top" align="center">0.813</td>
<td valign="top" align="center">0.905</td>
<td valign="top" align="center">0.796</td>
</tr>
<tr>
<td valign="top" align="left">Multi-Slice</td>
<td valign="top" align="center">0.774</td>
<td valign="top" align="center">0.776</td>
<td valign="top" align="center">0.812</td>
<td valign="top" align="center">0.832</td>
<td valign="top" align="center">0.753</td>
</tr>
<tr>
<td valign="top" align="left">Multi-Patch</td>
<td valign="top" align="center">0.808</td>
<td valign="top" align="center">0.710</td>
<td valign="top" align="center">0.793</td>
<td valign="top" align="center">0.890</td>
<td valign="top" align="center">0.767</td>
</tr>
<tr>
<td valign="top" align="left">MSA3D</td>
<td valign="top" align="center"><bold>0.864</bold></td>
<td valign="top" align="center"><bold>0.858</bold></td>
<td valign="top" align="center"><bold>0.884</bold></td>
<td valign="top" align="center"><bold>0.912</bold></td>
<td valign="top" align="center"><bold>0.849</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>All the models are trained on ADNI-2. The best results are highlighted in bold</italic>.</p>
</table-wrap-foot>
</table-wrap>
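<p>For reference, the metrics reported in <xref ref-type="table" rid="T5">Table 5</xref> follow the standard confusion-matrix definitions. The sketch below is illustrative only: it assumes that AD is treated as the positive class, the function name is hypothetical, and the counts are made up rather than taken from the study. The last line reproduces the F1 gap between MSA3D and the 3D-CNN from the table values.</p>
<preformat>
```python
def binary_metrics(tp, fp, tn, fn):
    """Return ACC, SEN, SPE, and F1 from confusion-matrix counts.

    Assumption: the positive class is AD and the negative class is CN.
    """
    acc = (tp + tn) / (tp + fp + tn + fn)  # accuracy
    sen = tp / (tp + fn)                   # sensitivity (recall)
    spe = tn / (tn + fp)                   # specificity
    precision = tp / (tp + fp)
    f1 = 2 * precision * sen / (precision + sen)
    return {"ACC": acc, "SEN": sen, "SPE": spe, "F1": f1}


# Hypothetical counts for illustration only; they are not the study data.
example = binary_metrics(tp=80, fp=10, tn=90, fn=20)

# F1 gap between MSA3D (0.849) and 3D-CNN (0.796), both taken from Table 5.
f1_gap = round(0.849 - 0.796, 3)
```
</preformat>
<p>Reporting the gap as an absolute difference in F1 avoids ambiguity between percentage points and relative improvement.</p>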
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Comparisons of ROC curves. The models are trained on ADNI-2 and tested on ADNI-1.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0005.tif"/>
</fig>
<p>Meanwhile, the performance of all models drops clearly when they are trained on ADNI-2, and the AUC improvement of our model over the 3D-CNN is relatively small (0.912 vs. 0.905). The main reason is that ADNI-1 and ADNI-2 were collected using 1.5 T and 3.0 T MRI scanners, respectively. The field strength of a 3.0 T magnet is twice that of a 1.5 T magnet, and brain parenchymal volume can be overestimated at 1.5 T (Chu et al., <xref ref-type="bibr" rid="B6">2016</xref>). This variation in image quality between scanners directly affects the diagnostic models. Nevertheless, our model still outperforms the 3D-CNN baseline by 0.053 in F1 (0.849 vs. 0.796) in this scenario. These findings suggest the efficacy and reliability of the proposed model.</p></sec>
<sec>
<title>4.5. Comparison With Other Methods in Literature</title>
<p>In this section, we briefly compare our MSA3D method with previous studies reported in the literature for AD diagnosis using the ADNI database. The state-of-the-art methods used for comparison are:
<list list-type="order">
<list-item><p>Conventional statistical methods: an SVM trained with voxel-based features (VBF; Salvatore et al., <xref ref-type="bibr" rid="B29">2015</xref>); landmark-based morphometric features extracted from local patches (LBM; Zhang et al., <xref ref-type="bibr" rid="B39">2016</xref>); and an SVM trained with landmark-based features (SVM-Landmark; Zhang et al., <xref ref-type="bibr" rid="B40">2017</xref>).</p></list-item>
<list-item><p>Deep-learning-based methods: a 3D-CNN based on whole-brain sMRI data (Whole-3DCNN; Korolev et al., <xref ref-type="bibr" rid="B17">2017</xref>); a multi-layer perceptron &#x0002B; recurrent neural network using longitudinal sMRI features (MLP-RNN; Cui et al., <xref ref-type="bibr" rid="B8">2018</xref>); a 3D-CNN based on multi-modality inputs including sMRI, PET, and MD-DTI data (Multi-3DCNN; Khvostikov et al., <xref ref-type="bibr" rid="B15">2018</xref>); a 3D-DenseNet based on 3D-patch features extracted from the hippocampal areas (3D-DenseNet; Liu et al., <xref ref-type="bibr" rid="B22">2020a</xref>); and a hierarchical fully convolutional network based on 3D-patch and region features extracted with prior landmarks (wH-FCN; Lian et al., <xref ref-type="bibr" rid="B20">2020b</xref>).</p></list-item>
</list></p>
<p>From <xref ref-type="table" rid="T6">Table 6</xref>, we can draw the following conclusions. (1) Deep-learning-based methods, especially the CNN-based models, perform much better than most of the conventional statistical methods in terms of ACC. The main reason is that features learned by a CNN have more representation power than handcrafted features. (2) Local features, including ROI-based, landmark-based, and hippocampal-segmentation features, are also essential for improving dementia prediction, which indicates that local changes in whole-brain images provide valuable clues for AD diagnosis. However, most of these models need predefined landmarks or segmentation regions, and informative ROIs can be hard to obtain because of the local differences between subjects. (3) Different from existing deep-learning-based models (Korolev et al., <xref ref-type="bibr" rid="B17">2017</xref>; Khvostikov et al., <xref ref-type="bibr" rid="B15">2018</xref>; Lian et al., <xref ref-type="bibr" rid="B20">2020b</xref>; Liu et al., <xref ref-type="bibr" rid="B22">2020a</xref>), our proposed model extracts discriminative features from both the local 2D-slice level and the 3D-subject level of the sMRI data using a 2D-slice attention network and a 3D-CNN. It achieves the best ACC and SEN values on the AD vs. CN task and the best SPE and AUC values on the pMCI vs. sMCI task.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Performance comparison of our model with other state-of-the-art studies reported in the literature using the ADNI database for prediction of AD vs. CN and pMCI vs. sMCI.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="left"><bold>Test subjects</bold></th>
<th valign="top" align="center" colspan="4" style="border-bottom: thin solid #000000;"><bold>AD vs. CN</bold></th>
<th valign="top" align="center" colspan="4" style="border-bottom: thin solid #000000;"><bold>pMCI vs. sMCI</bold></th>
</tr>
<tr>
<th/>
<th/>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
<th valign="top" align="center"><bold>ACC</bold></th>
<th valign="top" align="center"><bold>SEN</bold></th>
<th valign="top" align="center"><bold>SPE</bold></th>
<th valign="top" align="center"><bold>AUC</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">VBF</td>
<td valign="top" align="left">137AD&#x0002B;76sMCI&#x0002B;<break/> 134pMCI&#x0002B;162CN</td>
<td valign="top" align="center">0.760</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.660</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">SVM-Landmark</td>
<td valign="top" align="left">154 AD&#x0002B;346 MCI<break/> &#x0002B;207 CN</td>
<td valign="top" align="center">0.822</td>
<td valign="top" align="center">0.774</td>
<td valign="top" align="center">0.861</td>
<td valign="top" align="center">0.881</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">LBM</td>
<td valign="top" align="left">385AD&#x0002B;465sMCI&#x0002B;<break/> 205pMCI&#x0002B;429CN</td>
<td valign="top" align="center">0.822</td>
<td valign="top" align="center">0.774</td>
<td valign="top" align="center">0.861</td>
<td valign="top" align="center">0.881</td>
<td valign="top" align="center">0.686</td>
<td valign="top" align="center">0.395</td>
<td valign="top" align="center">0.732</td>
<td valign="top" align="center">0.636</td>
</tr>
<tr>
<td valign="top" align="left">MLP-RNN</td>
<td valign="top" align="left">198AD&#x0002B;229CN</td>
<td valign="top" align="center">0.897</td>
<td valign="top" align="center">0.868</td>
<td valign="top" align="center">0.925</td>
<td valign="top" align="center">0.921</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">Whole-3DCNN</td>
<td valign="top" align="left">50AD&#x0002B;77sMCI&#x0002B;<break/> 43pMCI&#x0002B;61CN</td>
<td valign="top" align="center">0.800</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.870</td>
<td valign="top" align="center">0.520</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.520</td>
</tr>
<tr>
<td valign="top" align="left">Multi-3DCNN</td>
<td valign="top" align="left">48AD&#x0002B;58CN</td>
<td valign="top" align="center">0.850</td>
<td valign="top" align="center">0.880</td>
<td valign="top" align="center">0.900</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">3D-DenseNet</td>
<td valign="top" align="left">97AD&#x0002B;233MCI<break/> &#x0002B;119CN</td>
<td valign="top" align="center">0.889</td>
<td valign="top" align="center">0.866</td>
<td valign="top" align="center">0.808</td>
<td valign="top" align="center">0.925</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">wH-FCN</td>
<td valign="top" align="left">385AD&#x0002B;465sMCI<break/> &#x0002B;205pMCI&#x0002B;429CN</td>
<td valign="top" align="center">0.903</td>
<td valign="top" align="center">0.824</td>
<td valign="top" align="center"><bold>0.965</bold></td>
<td valign="top" align="center"><bold>0.951</bold></td>
<td valign="top" align="center"><bold>0.809</bold></td>
<td valign="top" align="center"><bold>0.526</bold></td>
<td valign="top" align="center">0.854</td>
<td valign="top" align="center">0.781</td>
</tr>
<tr>
<td valign="top" align="left">Our model</td>
<td valign="top" align="left">326AD&#x0002B;470sMCI<break/> &#x0002B;242pMCI&#x0002B;413CN</td>
<td valign="top" align="center"><bold>0.911</bold></td>
<td valign="top" align="center"><bold>0.888</bold></td>
<td valign="top" align="center">0.914</td>
<td valign="top" align="center">0.950</td>
<td valign="top" align="center">0.801</td>
<td valign="top" align="center">0.520</td>
<td valign="top" align="center"><bold>0.856</bold></td>
<td valign="top" align="center"><bold>0.789</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The best results are highlighted in bold</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>It is noteworthy that our model does not need any predefined landmarks or extra localization modules (e.g., hippocampus segmentation), yet it achieves better or at least comparable diagnostic results relative to existing deep-learning-based AD diagnosis methods. For example, compared with the second-best wH-FCN model, which extracts features from multiple 3D patches with hierarchical landmark proposals, MSA3D generates better ACC and SEN values and almost the same AUC value (0.950 vs. 0.951) on the AD vs. CN task. On the pMCI vs. sMCI task, our model performs slightly worse than wH-FCN in terms of ACC and SEN. A possible reason is that wH-FCN adopts more prior knowledge to improve the model&#x00027;s recognition capability, i.e., wH-FCN constrains the distances between landmarks and initializes the network parameters of the MCI prediction model from the AD classification task.</p></sec></sec>
<sec sec-type="discussion" id="s5">
<title>5. Discussion</title>
<sec>
<title>5.1. Influence of Features in Different Dimensions</title>
<p>In this section, we investigate the effects of using slice-level features from different views for AD classification. As shown in <xref ref-type="fig" rid="F6">Figure 6</xref>, the model using features from the axial plane generates much better results than the models using the sagittal or coronal plane in terms of ACC and SEN. Moreover, after combining the features from all three views, our proposed MSA3D outperforms all the tested models and yields a markedly better SEN value in particular. This result demonstrates that our 2D- and 3D-feature fusion strategy can effectively integrate the multi-view slice features from all directions.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>The influence of feature fusion in different dimensions.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0006.tif"/>
</fig></sec>
<sec>
<title>5.2. Influence of Slice Attention Module</title>
<p>As introduced in Section 3.2, the SAM is employed in our MSA3D model to assist slice-level feature extraction by exploiting the relationships among the slices, i.e., to filter out uninformative slices efficiently. To investigate the effectiveness of the proposed SAM, we conduct an ablation experiment in which the SAM is removed from MSA3D; the resulting variant is denoted MS3D. All models are trained on ADNI-1 and tested on ADNI-2.</p>
<p>The comparison results are illustrated in <xref ref-type="fig" rid="F7">Figure 7</xref>, from which we can infer that: (1) the two variants of our method (i.e., MS3D and MSA3D) consistently perform better than the baseline model (i.e., 3D-CNN), which means that the fusion of 2D-slice-level and 3D-subject-level features provides richer feature representations for AD diagnosis; (2) the SAM further improves slice-level feature extraction, especially on the challenging MCI prediction task, e.g., the proposed MSA3D generally achieves better classification performance than MS3D (ACC of 0.801 vs. 0.772 and SEN of 0.520 vs. 0.440). This indicates that the proposed SAM helps the neural network focus on specific slices and learn more discriminative 2D-slice-level features from abundant slices.</p>
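<p>The general idea of weighting slices by learned importance can be sketched as follows. This is a generic softmax attention pooling over per-slice feature vectors, not the SAM architecture of Section 3.2; the function names, the linear scoring vector, and the toy feature values are all assumptions made for illustration.</p>
<preformat>
```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(slice_features, scoring_vector):
    """Pool per-slice features into one vector using attention weights.

    slice_features: list of per-slice feature vectors (equal length).
    scoring_vector: hypothetical stand-in for learned scoring parameters.
    """
    # One scalar relevance score per slice (dot product with the scorer).
    scores = [
        sum(f * w for f, w in zip(feat, scoring_vector))
        for feat in slice_features
    ]
    weights = softmax(scores)
    dim = len(slice_features[0])
    # Weighted sum of slice features; low-weight slices contribute little.
    pooled = [
        sum(wt * feat[j] for wt, feat in zip(weights, slice_features))
        for j in range(dim)
    ]
    return pooled, weights


# Toy features for three slices; the second slice is the most "informative".
slices = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
pooled, weights = attention_pool(slices, scoring_vector=[1.0, 1.0])
```
</preformat>
<p>In this toy setting, the second slice receives the largest weight, which mirrors the intended effect of down-weighting uninformative slices.</p>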
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Comparison between multi-view slice fusion without SAM (i.e., MS3D) and multi-view slice fusion with SAM (i.e., MSA3D). <bold>(A,B)</bold> Show the classification results for AD vs. CN and pMCI vs. sMCI, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0007.tif"/>
</fig></sec>
<sec>
<title>5.3. Visualization of Slices Features</title>
<p>This section visualizes the attention maps produced by our MSA3D method using the Grad-CAM technique (Selvaraju et al., <xref ref-type="bibr" rid="B30">2020</xref>) for subjects predicted as AD and pMCI. The first, second, and third columns of <xref ref-type="fig" rid="F8">Figure 8</xref> show 2D slices of sMRI in the sagittal, coronal, and horizontal views, respectively. The corresponding model is trained on ADNI-1, and three AD and three pMCI subjects are randomly selected from ADNI-2 for testing.</p>
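<p>The core Grad-CAM computation can be sketched as follows: each feature map is weighted by the spatial average of its gradient with respect to the class score, the weighted maps are summed, and a ReLU is applied. The sketch assumes that the activations and gradients of one convolutional layer have already been extracted from a trained network; the function name and the toy values are hypothetical and unrelated to the experiments in this study.</p>
<preformat>
```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from K feature maps and their gradients.

    activations, gradients: lists of K maps, each an H x W nested list.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weights: global average of the gradient over each map.
    alphas = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]
    cam = [[0.0] * width for _ in range(height)]
    # Weighted sum of the feature maps.
    for alpha, fmap in zip(alphas, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * fmap[i][j]
    # ReLU keeps only the locations that support the target class.
    return [[max(0.0, value) for value in row] for row in cam]


# Toy example with two 2 x 2 feature maps (illustrative values only).
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(acts, grads)
```
</preformat>
<p>In practice, the heatmap is upsampled to the input resolution and overlaid on the slice.</p>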
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Attention maps of our MSA3D method for multiple subjects selected from the ADNI database at different stages of dementia (i.e., AD and pMCI). Each subject&#x00027;s attention map is displayed in three MR planes (i.e., sagittal, coronal, and horizontal), where red and blue colors denote highly and weakly discriminative features in sMRI, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnagi-14-871706-g0008.tif"/>
</fig>
<p>From <xref ref-type="fig" rid="F8">Figure 8</xref>, we can infer that our model identifies discriminative atrophy areas for subjects at different stages of dementia, especially in the brain regions involved in memory and decision making. For example, our model emphasizes atrophy of the frontoparietal cortex, ventricle regions, and hippocampus. These highlighted brain regions are consistent with previous clinical research (Chan et al., <xref ref-type="bibr" rid="B4">2002</xref>; Zhang et al., <xref ref-type="bibr" rid="B42">2021</xref>), which reported them as potentially sensitive markers of neurodegeneration. These results suggest that our proposed model learns discriminative features from brain sMRI for dementia diagnosis.</p></sec>
<sec>
<title>5.4. Limitation and Future Study</title>
<p>While the experimental results suggest that our proposed model performs well in automatic dementia detection, its performance and generalization could be enhanced in the future by addressing the limitations listed below.</p>
<p>First, we take advantage of both 2D-slice and 3D-subject features in an integrated MSA3D model. However, the numerous 2D slices considerably increase the computational complexity. Since not all slices help determine the prediction, we could reduce the complexity by using an online feature selection module (Wu D. et al., <xref ref-type="bibr" rid="B36">2021</xref>) to select the 2D slices dynamically. Second, the distribution differences between ADNI-1 and ADNI-2 (acquired with 1.5 T and 3 T scanners, respectively) were not taken into account, which might have a detrimental impact on the model&#x00027;s performance: the model trained on ADNI-2 and assessed on ADNI-1 performed worse than that trained on ADNI-1 and evaluated on ADNI-2. We could introduce a domain adaptation technique into our model to reduce the domain gap between the ADNI datasets. Finally, to further verify the generalization capacity of the proposed model, we will investigate more deep-learning-based methods and test our model on other AD datasets for more AD-related prediction tasks, such as dementia status estimation.</p></sec></sec>
<sec sec-type="conclusions" id="s6">
<title>6. Conclusion</title>
<p>This study explores a 2D-slice-level and 3D-subject-level fusion model for AI-based AD diagnosis using brain sMRI. In addition, a slice attention module is proposed to adaptively select the most discriminative slice-level features from the brain sMRI data. The effectiveness of our model is validated on ADNI-1 and ADNI-2 for dementia classification. Specifically, when trained on ADNI-1 and tested on ADNI-2, our model achieves ACC values of 91.1% and 80.1% in AD diagnosis and MCI conversion prediction, respectively.</p></sec>
<sec sec-type="data-availability" id="s7">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.</p></sec>
<sec id="s8">
<title>Ethics Statement</title>
<p>Ethical review and approval were not required for the study on human participants because all the data in this study were downloaded from the Alzheimer&#x00027;s Disease Neuroimaging Initiative (ADNI) database.</p></sec>
<sec id="s9">
<title>Author Contributions</title>
<p>LC and HQ implemented and optimized the methods and wrote the manuscript. LC and FZ designed the experiment and algorithm. All authors contributed to the article and approved the submitted version.</p></sec>
<sec sec-type="funding-information" id="s10">
<title>Funding</title>
<p>Publication costs are funded by the National Natural Science Foundation of China under grants 61902370 and 61802360 and by the key cooperation project of the Chongqing Municipal Education Commission (HZ2021008 and HZ2021017).</p></sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p></sec>
</body>
<back>
<ack><p>Data used in the preparation of this article were obtained from the Alzheimer&#x00027;s Disease Neuroimaging Initiative (ADNI) database. The investigators within the ADNI contributed to the design and implementation of ADNI and provided data but did not participate in the analysis or writing of this article. A complete listing of ADNI investigators can be found at: <ext-link ext-link-type="uri" xlink:href="http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf">http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf</ext-link>.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>AbdulAzeem</surname> <given-names>Y. M.</given-names></name> <name><surname>Bahgat</surname> <given-names>W. M.</given-names></name> <name><surname>Badawy</surname> <given-names>M. M.</given-names></name></person-group> (<year>2021</year>). <article-title>A CNN based framework for classification of Alzheimer&#x00027;s disease</article-title>. <source>Neural Comput. Appl</source>. <volume>33</volume>, <fpage>10415</fpage>&#x02013;<lpage>10428</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-021-05799-w</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ashburner</surname> <given-names>J.</given-names></name> <name><surname>Friston</surname> <given-names>K. J.</given-names></name></person-group> (<year>2000</year>). <article-title>Voxel-based morphometry&#x02013;the methods</article-title>. <source>Neuroimage</source> <volume>11</volume>, <fpage>805</fpage>&#x02013;<lpage>821</lpage>. <pub-id pub-id-type="doi">10.1006/nimg.2000.0582</pub-id><pub-id pub-id-type="pmid">10860804</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bayram</surname> <given-names>E.</given-names></name> <name><surname>Caldwell</surname> <given-names>J. Z.</given-names></name> <name><surname>Banks</surname> <given-names>S. J.</given-names></name></person-group> (<year>2018</year>). <article-title>Current understanding of magnetic resonance imaging biomarkers and memory in Alzheimer&#x00027;s disease</article-title>. <source>Alzheimer&#x00027;s Dement. Transl. Res. Clin. Intervent</source>. <volume>4</volume>, <fpage>395</fpage>&#x02013;<lpage>413</lpage>. <pub-id pub-id-type="doi">10.1016/j.trci.2018.04.007</pub-id><pub-id pub-id-type="pmid">30229130</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chan</surname> <given-names>D.</given-names></name> <name><surname>Fox</surname> <given-names>N.</given-names></name> <name><surname>Rossor</surname> <given-names>M.</given-names></name></person-group> (<year>2002</year>). <article-title>Differing patterns of temporal atrophy in Alzheimer&#x00027;s disease and semantic dementia</article-title>. <source>Neurology</source> <volume>58</volume>, <fpage>838</fpage>&#x02013;<lpage>838</lpage>. <pub-id pub-id-type="doi">10.1212/WNL.58.5.838</pub-id><pub-id pub-id-type="pmid">11889267</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Xia</surname> <given-names>Y.</given-names></name></person-group> (<year>2021</year>). <article-title>Iterative sparse and deep learning for accurate diagnosis of Alzheimer&#x00027;s disease</article-title>. <source>Pattern Recogn</source>. <volume>116</volume>:<fpage>107944</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2021.107944</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chu</surname> <given-names>R.</given-names></name> <name><surname>Tauhid</surname> <given-names>S.</given-names></name> <name><surname>Glanz</surname> <given-names>B. I.</given-names></name> <name><surname>Healy</surname> <given-names>B. C.</given-names></name> <name><surname>Kim</surname> <given-names>G.</given-names></name> <name><surname>Oommen</surname> <given-names>V. V.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Whole brain volume measured from 1.5T versus 3T MRI in healthy subjects and patients with multiple sclerosis</article-title>. <source>J. Neuroimaging</source> <volume>26</volume>, <fpage>62</fpage>&#x02013;<lpage>67</lpage>. <pub-id pub-id-type="doi">10.1111/jon.12271</pub-id><pub-id pub-id-type="pmid">26118637</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cui</surname> <given-names>R.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>Hippocampus analysis by combination of 3-d densenet and shapes for Alzheimer&#x00027;s disease diagnosis</article-title>. <source>IEEE J. Biomed. Health Informatics</source> <volume>23</volume>, <fpage>2099</fpage>&#x02013;<lpage>2107</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2018.2882392</pub-id><pub-id pub-id-type="pmid">30475734</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cui</surname> <given-names>R.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>G.</given-names></name></person-group> (<year>2018</year>). <article-title>Longitudinal analysis for Alzheimer&#x00027;s disease diagnosis using RNN</article-title>, in <source>15th IEEE International Symposium on Biomedical Imaging, ISBI 2018</source> (<publisher-loc>Washington, DC</publisher-loc>), <fpage>1398</fpage>&#x02013;<lpage>1401</lpage>. <pub-id pub-id-type="doi">10.1109/ISBI.2018.8363833</pub-id><pub-id pub-id-type="pmid">30763637</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ebrahimighahnavieh</surname> <given-names>M. A.</given-names></name> <name><surname>Luo</surname> <given-names>S.</given-names></name> <name><surname>Chiong</surname> <given-names>R.</given-names></name></person-group> (<year>2020</year>). <article-title>Deep learning to detect Alzheimer&#x00027;s disease from neuroimaging: a systematic literature review</article-title>. <source>Comput. Methods Programs Biomed</source>. <volume>187</volume>:<fpage>105242</fpage>. <pub-id pub-id-type="doi">10.1016/j.cmpb.2019.105242</pub-id><pub-id pub-id-type="pmid">31837630</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Feng</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>S.</given-names></name> <name><surname>Chen</surname> <given-names>L.</given-names></name> <name><surname>Xia</surname> <given-names>J.</given-names></name></person-group> (<year>2021</year>). <article-title>Alzheimer&#x00027;s disease classification using features extracted from nonsubsampled contourlet subband-based individual networks</article-title>. <source>Neurocomputing</source> <volume>421</volume>, <fpage>260</fpage>&#x02013;<lpage>272</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2020.09.012</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Fuse</surname> <given-names>H.</given-names></name> <name><surname>Oishi</surname> <given-names>K.</given-names></name> <name><surname>Maikusa</surname> <given-names>N.</given-names></name> <name><surname>Fukami</surname> <given-names>T.</given-names></name></person-group> (<year>2018</year>). <article-title>Detection of Alzheimer&#x00027;s disease with shape analysis of MRI images</article-title>, in <source>2018 Joint 10th International Conference on Soft Computing and Intelligent Systems (SCIS) and 19th International Symposium on Advanced Intelligent Systems (ISIS)</source> (<publisher-loc>Toyama</publisher-loc>), <fpage>1031</fpage>&#x02013;<lpage>1034</lpage>. <pub-id pub-id-type="doi">10.1109/SCIS-ISIS.2018.00171</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jack</surname> <given-names>C. R.</given-names> <suffix>Jr</suffix></name> <name><surname>Bernstein</surname> <given-names>M. A.</given-names></name> <name><surname>Fox</surname> <given-names>N. C.</given-names></name> <name><surname>Thompson</surname> <given-names>P.</given-names></name> <name><surname>Alexander</surname> <given-names>G.</given-names></name> <name><surname>Harvey</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>The Alzheimer&#x00027;s disease neuroimaging initiative (ADNI): MRI methods</article-title>. <source>J. Magn. Reson. Imaging</source> <volume>27</volume>, <fpage>685</fpage>&#x02013;<lpage>691</lpage>. <pub-id pub-id-type="doi">10.1002/jmri.21049</pub-id><pub-id pub-id-type="pmid">18302232</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Jin</surname> <given-names>D.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Zhao</surname> <given-names>K.</given-names></name> <name><surname>Hu</surname> <given-names>F.</given-names></name> <name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Liu</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Attention-based 3D convolutional network for Alzheimer&#x00027;s disease diagnosis and biomarkers exploration</article-title>, in <source>16th IEEE International Symposium on Biomedical Imaging, ISBI 2019</source> (<publisher-loc>Venice</publisher-loc>), <fpage>1047</fpage>&#x02013;<lpage>1051</lpage>. <pub-id pub-id-type="doi">10.1109/ISBI.2019.8759455</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jin</surname> <given-names>D.</given-names></name> <name><surname>Zhou</surname> <given-names>B.</given-names></name> <name><surname>Han</surname> <given-names>Y.</given-names></name> <name><surname>Ren</surname> <given-names>J.</given-names></name> <name><surname>Han</surname> <given-names>T.</given-names></name> <name><surname>Liu</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Generalizable, reproducible, and neuroscientifically interpretable imaging biomarkers for Alzheimer&#x00027;s disease</article-title>. <source>Adv. Sci</source>. <volume>7</volume>:<fpage>2000675</fpage>. <pub-id pub-id-type="doi">10.1002/advs.202000675</pub-id><pub-id pub-id-type="pmid">32714766</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Khvostikov</surname> <given-names>A. V.</given-names></name> <name><surname>Aderghal</surname> <given-names>K.</given-names></name> <name><surname>Benois-Pineau</surname> <given-names>J.</given-names></name> <name><surname>Krylov</surname> <given-names>A. S.</given-names></name> <name><surname>Catheline</surname> <given-names>G.</given-names></name></person-group> (<year>2018</year>). <article-title>3D CNN-based classification using sMRI and MD-DTI images for Alzheimer disease studies</article-title>. <source>CoRR, abs/1801.05968</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1801.05968</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kincses</surname> <given-names>Z. T.</given-names></name> <name><surname>Kiraly</surname> <given-names>A.</given-names></name> <name><surname>Ver&#x000E9;b</surname> <given-names>D.</given-names></name> <name><surname>V&#x000E9;csei</surname> <given-names>L.</given-names></name></person-group> (<year>2015</year>). <article-title>Structural magnetic resonance imaging markers of Alzheimer&#x00027;s disease and its retranslation to rodent models</article-title>. <source>J. Alzheimer&#x00027;s Dis</source>. <volume>47</volume>, <fpage>277</fpage>&#x02013;<lpage>290</lpage>. <pub-id pub-id-type="doi">10.3233/JAD-143195</pub-id><pub-id pub-id-type="pmid">26401552</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Korolev</surname> <given-names>S.</given-names></name> <name><surname>Safiullin</surname> <given-names>A.</given-names></name> <name><surname>Belyaev</surname> <given-names>M.</given-names></name> <name><surname>Dodonova</surname> <given-names>Y.</given-names></name></person-group> (<year>2017</year>). <article-title>Residual and plain convolutional neural networks for 3D brain MRI classification</article-title>, in <source>14th IEEE International Symposium on Biomedical Imaging, ISBI 2017</source> (<publisher-loc>Melbourne, VIC</publisher-loc>), <fpage>835</fpage>&#x02013;<lpage>838</lpage>. <pub-id pub-id-type="doi">10.1109/ISBI.2017.7950647</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lian</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Pan</surname> <given-names>Y.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2020a</year>). <article-title>Attention-guided hybrid network for dementia diagnosis with structural MR images</article-title>. <source>IEEE Trans. Cybern</source>. <fpage>1</fpage>&#x02013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1109/TCYB.2020.3005859</pub-id><pub-id pub-id-type="pmid">32721906</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lian</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <article-title>Multi-task weakly-supervised attention network for dementia status estimation with structural MRI</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3055772</pub-id><pub-id pub-id-type="pmid">33656999</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lian</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2020b</year>). <article-title>Hierarchical fully convolutional network for joint atrophy localization and Alzheimer&#x00027;s disease diagnosis using structural MRI</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>42</volume>, <fpage>880</fpage>&#x02013;<lpage>893</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2018.2889096</pub-id><pub-id pub-id-type="pmid">30582529</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Chen</surname> <given-names>L.</given-names></name> <name><surname>Du</surname> <given-names>X.</given-names></name> <name><surname>Jin</surname> <given-names>L.</given-names></name> <name><surname>Shang</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Activated gradients for deep neural networks</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3106044</pub-id><pub-id pub-id-type="pmid">34469312</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>F.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>K.</given-names></name> <name><surname>Ma</surname> <given-names>Y.</given-names></name> <name><surname>Shen</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2020a</year>). <article-title>A multi-model deep convolutional neural network for automatic hippocampus segmentation and classification in Alzheimer&#x00027;s disease</article-title>. <source>NeuroImage</source> <volume>208</volume>:<fpage>116459</fpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.116459</pub-id><pub-id pub-id-type="pmid">31837471</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Adeli</surname> <given-names>E.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>Joint classification and regression <italic>via</italic> deep multi-task multi-channel learning for Alzheimer&#x00027;s disease diagnosis</article-title>. <source>IEEE Trans. Biomed. Eng</source>. <volume>66</volume>, <fpage>1195</fpage>&#x02013;<lpage>1206</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2018.2869989</pub-id><pub-id pub-id-type="pmid">30222548</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Lian</surname> <given-names>C.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2020b</year>). <article-title>Weakly supervised deep learning for brain disease prognosis using MRI and incomplete clinical scores</article-title>. <source>IEEE Trans. Cybern</source>. <volume>50</volume>, <fpage>3381</fpage>&#x02013;<lpage>3392</lpage>. <pub-id pub-id-type="doi">10.1109/TCYB.2019.2904186</pub-id><pub-id pub-id-type="pmid">30932861</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luk</surname> <given-names>C. C.</given-names></name> <name><surname>Ishaque</surname> <given-names>A.</given-names></name> <name><surname>Khan</surname> <given-names>M.</given-names></name> <name><surname>Ta</surname> <given-names>D.</given-names></name> <name><surname>Chenji</surname> <given-names>S.</given-names></name> <name><surname>Yang</surname> <given-names>Y.-H.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Alzheimer&#x00027;s disease: 3-dimensional MRI texture for prediction of conversion from mild cognitive impairment</article-title>. <source>Alzheimer&#x00027;s Dement</source>. <volume>10</volume>, <fpage>755</fpage>&#x02013;<lpage>763</lpage>. <pub-id pub-id-type="doi">10.1016/j.dadm.2018.09.002</pub-id><pub-id pub-id-type="pmid">30480081</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mart&#x000ED;-Juan</surname> <given-names>G.</given-names></name> <name><surname>Sanroma-Guell</surname> <given-names>G.</given-names></name> <name><surname>Piella</surname> <given-names>G.</given-names></name></person-group> (<year>2020</year>). <article-title>A survey on machine and statistical learning for longitudinal analysis of neuroimaging data in Alzheimer&#x00027;s disease</article-title>. <source>Comput. Methods Prog. Biomed</source>. <volume>189</volume>:<fpage>105348</fpage>. <pub-id pub-id-type="doi">10.1016/j.cmpb.2020.105348</pub-id><pub-id pub-id-type="pmid">31995745</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poloni</surname> <given-names>K. M.</given-names></name> <name><surname>Ferrari</surname> <given-names>R. J.</given-names></name></person-group> (<year>2022</year>). <article-title>Automated detection, selection and classification of hippocampal landmark points for the diagnosis of Alzheimer&#x00027;s disease</article-title>. <source>Comput. Methods Prog. Biomed</source>. <volume>214</volume>:<fpage>106581</fpage>. <pub-id pub-id-type="doi">10.1016/j.cmpb.2021.106581</pub-id><pub-id pub-id-type="pmid">34923325</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qiao</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>L.</given-names></name> <name><surname>Ye</surname> <given-names>Z.</given-names></name> <name><surname>Zhu</surname> <given-names>F.</given-names></name></person-group> (<year>2021</year>). <article-title>Early Alzheimer&#x00027;s disease diagnosis with the contrastive loss using paired structural MRIs</article-title>. <source>Comput. Methods Prog. Biomed</source>. <volume>208</volume>:<fpage>106282</fpage>. <pub-id pub-id-type="doi">10.1016/j.cmpb.2021.106282</pub-id><pub-id pub-id-type="pmid">34343744</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salvatore</surname> <given-names>C.</given-names></name> <name><surname>Cerasa</surname> <given-names>A.</given-names></name> <name><surname>Battista</surname> <given-names>P.</given-names></name> <name><surname>Gilardi</surname> <given-names>M. C.</given-names></name> <name><surname>Quattrone</surname> <given-names>A.</given-names></name> <name><surname>Castiglioni</surname> <given-names>I.</given-names></name></person-group> (<year>2015</year>). <article-title>Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer&#x00027;s disease: a machine learning approach</article-title>. <source>Front. Neurosci</source>. <volume>9</volume>:<fpage>307</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2015.00307</pub-id><pub-id pub-id-type="pmid">26388719</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Selvaraju</surname> <given-names>R. R.</given-names></name> <name><surname>Cogswell</surname> <given-names>M.</given-names></name> <name><surname>Das</surname> <given-names>A.</given-names></name> <name><surname>Vedantam</surname> <given-names>R.</given-names></name> <name><surname>Parikh</surname> <given-names>D.</given-names></name> <name><surname>Batra</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>Grad-CAM: visual explanations from deep networks <italic>via</italic> gradient-based localization</article-title>. <source>Int. J. Comput. Vis</source>. <volume>128</volume>, <fpage>336</fpage>&#x02013;<lpage>359</lpage>. <pub-id pub-id-type="doi">10.1007/s11263-019-01228-7</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shi</surname> <given-names>J.</given-names></name> <name><surname>Zheng</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name> <name><surname>Ying</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>Multimodal neuroimaging feature learning with multimodal stacked deep polynomial networks for diagnosis of Alzheimer&#x00027;s disease</article-title>. <source>IEEE J. Biomed. Health Inform</source>. <volume>22</volume>, <fpage>173</fpage>&#x02013;<lpage>183</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2017.2655720</pub-id><pub-id pub-id-type="pmid">28113353</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tanveer</surname> <given-names>M.</given-names></name> <name><surname>Richhariya</surname> <given-names>B.</given-names></name> <name><surname>Khan</surname> <given-names>R.</given-names></name> <name><surname>Rashid</surname> <given-names>A.</given-names></name> <name><surname>Khanna</surname> <given-names>P.</given-names></name> <name><surname>Prasad</surname> <given-names>M.</given-names></name> <name><surname>Lin</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <article-title>Machine learning techniques for the diagnosis of Alzheimer&#x00027;s disease: a review</article-title>. <source>ACM Trans. Multim. Comput. Commun. Appl</source>. <volume>16</volume>, <fpage>1</fpage>&#x02013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1145/3344998</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vu</surname> <given-names>T. D.</given-names></name> <name><surname>Ho</surname> <given-names>N.</given-names></name> <name><surname>Yang</surname> <given-names>H.</given-names></name> <name><surname>Kim</surname> <given-names>J.</given-names></name> <name><surname>Song</surname> <given-names>H.</given-names></name></person-group> (<year>2018</year>). <article-title>Non-white matter tissue extraction and deep convolutional neural network for Alzheimer&#x00027;s disease detection</article-title>. <source>Soft Comput</source>. <volume>22</volume>, <fpage>6825</fpage>&#x02013;<lpage>6833</lpage>. <pub-id pub-id-type="doi">10.1007/s00500-018-3421-5</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Nie</surname> <given-names>J.</given-names></name> <name><surname>Yap</surname> <given-names>P.</given-names></name> <name><surname>Shi</surname> <given-names>F.</given-names></name> <name><surname>Guo</surname> <given-names>L.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2011</year>). <article-title>Robust deformable-surface-based skull-stripping for large-scale studies</article-title>, in <source>Medical Image Computing and Computer-Assisted Intervention - MICCAI 2011 - 14th International Conference</source>, Vol. 6893 of Lecture Notes in Computer Science, eds <person-group person-group-type="editor"><name><surname>Fichtinger</surname> <given-names>G.</given-names></name> <name><surname>Martel</surname> <given-names>A. L.</given-names></name> <name><surname>Peters</surname> <given-names>T. M.</given-names></name></person-group> (<publisher-loc>Toronto, ON</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>635</fpage>&#x02013;<lpage>642</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-642-23626-6_78</pub-id><pub-id pub-id-type="pmid">22003753</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wen</surname> <given-names>J.</given-names></name> <name><surname>Thibeau-Sutre</surname> <given-names>E.</given-names></name> <name><surname>Diaz-Melo</surname> <given-names>M.</given-names></name> <name><surname>Samper-Gonz&#x000E1;lez</surname> <given-names>J.</given-names></name> <name><surname>Routier</surname> <given-names>A.</given-names></name> <name><surname>Bottani</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Convolutional neural networks for classification of Alzheimer&#x00027;s disease: overview and reproducible evaluation</article-title>. <source>Med. Image Anal</source>. <volume>63</volume>:<fpage>101694</fpage>. <pub-id pub-id-type="doi">10.1016/j.media.2020.101694</pub-id><pub-id pub-id-type="pmid">32417716</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>D.</given-names></name> <name><surname>He</surname> <given-names>Y.</given-names></name> <name><surname>Luo</surname> <given-names>X.</given-names></name> <name><surname>Zhou</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>A latent factor analysis-based approach to online sparse streaming feature selection</article-title>. <source>IEEE Trans. Syst. Man Cybern. Syst</source>. <fpage>1</fpage>&#x02013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1109/TSMC.2021.3096065</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>E. Q.</given-names></name> <name><surname>Hu</surname> <given-names>D.</given-names></name> <name><surname>Deng</surname> <given-names>P.-Y.</given-names></name> <name><surname>Tang</surname> <given-names>Z.</given-names></name> <name><surname>Cao</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>W.-M.</given-names></name> <etal/></person-group>. (<year>2021a</year>). <article-title>Nonparametric Bayesian prior inducing deep network for automatic detection of cognitive status</article-title>. <source>IEEE Trans. Cybern</source>. <volume>51</volume>, <fpage>5483</fpage>&#x02013;<lpage>5496</lpage>. <pub-id pub-id-type="doi">10.1109/TCYB.2020.2977267</pub-id><pub-id pub-id-type="pmid">32203044</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>E. Q.</given-names></name> <name><surname>Lin</surname> <given-names>C.-T.</given-names></name> <name><surname>Zhu</surname> <given-names>L.-M.</given-names></name> <name><surname>Tang</surname> <given-names>Z. R.</given-names></name> <name><surname>Jie</surname> <given-names>Y.-W.</given-names></name> <name><surname>Zhou</surname> <given-names>G.-R.</given-names></name></person-group> (<year>2021b</year>). <article-title>Fatigue detection of pilots&#x00027; brain through brains cognitive map and multilayer latent incremental learning model</article-title>. <source>IEEE Trans. Cybern</source>. <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1109/TCYB.2021.3068300</pub-id><pub-id pub-id-type="pmid">33961575</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Gao</surname> <given-names>Y.</given-names></name> <name><surname>Gao</surname> <given-names>Y.</given-names></name> <name><surname>Munsell</surname> <given-names>B. C.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2016</year>). <article-title>Detecting anatomical landmarks for fast Alzheimer&#x00027;s disease diagnosis</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>35</volume>, <fpage>2524</fpage>&#x02013;<lpage>2533</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2016.2582386</pub-id><pub-id pub-id-type="pmid">27333602</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>An</surname> <given-names>L.</given-names></name> <name><surname>Gao</surname> <given-names>Y.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>Alzheimer&#x00027;s disease diagnosis using landmark-based features from longitudinal structural MR images</article-title>. <source>IEEE J. Biomed. Health Inform</source>. <volume>21</volume>, <fpage>1607</fpage>&#x02013;<lpage>1616</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2017.2704614</pub-id><pub-id pub-id-type="pmid">28534798</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>T.</given-names></name> <name><surname>Li</surname> <given-names>C.</given-names></name> <name><surname>Li</surname> <given-names>P.</given-names></name> <name><surname>Peng</surname> <given-names>Y.</given-names></name> <name><surname>Kang</surname> <given-names>X.</given-names></name> <name><surname>Jiang</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Separated channel attention convolutional neural network (SC-CNN-attention) to identify ADHD in multi-site RS-fMRI dataset</article-title>. <source>Entropy</source> <volume>22</volume>:<fpage>893</fpage>. <pub-id pub-id-type="doi">10.3390/e22080893</pub-id><pub-id pub-id-type="pmid">33286662</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Qiao</surname> <given-names>L.</given-names></name> <name><surname>Liu</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Modularity-guided functional brain network analysis for early-stage dementia identification</article-title>. <source>Front. Neurosci</source>. <volume>15</volume>:<fpage>720909</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2021.720909</pub-id><pub-id pub-id-type="pmid">34421530</pub-id></citation></ref>
</ref-list>
</back>
</article>