<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Comput. Neurosci.</journal-id>
<journal-title>Frontiers in Computational Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Comput. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5188</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fncom.2023.1113381</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>HC-Net: A hybrid convolutional network for non-human primate brain extraction</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Fei</surname> <given-names>Hong</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2122344/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Qianshan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1256406/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Shang</surname> <given-names>Fangxin</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Xu</surname> <given-names>Wenyi</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Chen</surname> <given-names>Xiaofeng</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Chen</surname> <given-names>Yifei</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Li</surname> <given-names>Haifang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/649557/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>College of Information and Computer, Taiyuan University of Technology</institution>, <addr-line>Taiyuan</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Intelligent Healthcare Unit, Baidu</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Qi Li, Changchun University of Science and Technology, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Jingang Shi, Xi&#x2019;an Jiaotong University, China; Lina Zhu, Central South University, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Haifang Li, <email>lihaifang@tyut.edu.cn</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>09</day>
<month>02</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>17</volume>
<elocation-id>1113381</elocation-id>
<history>
<date date-type="received">
<day>01</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>23</day>
<month>01</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2023 Fei, Wang, Shang, Xu, Chen, Chen and Li.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Fei, Wang, Shang, Xu, Chen, Chen and Li</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Brain extraction (skull stripping) is an essential step in magnetic resonance imaging (MRI) analysis in the brain sciences. However, most current brain extraction methods that achieve satisfactory results on human brains are often challenged by non-human primate brains. Because macaque MRI datasets are small and often acquired with thick slices, traditional deep convolutional neural networks (DCNNs) cannot obtain excellent results on them. To overcome this challenge, this study proposed a symmetrical, end-to-end trainable hybrid convolutional neural network (HC-Net). It makes full use of the spatial information between adjacent slices of the MRI image sequence and combines three consecutive slices from three axes for 3D convolutions, which reduces computational cost and improves accuracy. The HC-Net consists of encoding and decoding structures with 3D convolutions and 2D convolutions in series. The effective combination of 2D and 3D convolutions relieves both the underfitting of 2D convolutions to spatial features and the overfitting of 3D convolutions to small samples. Evaluated on macaque brain data from different sites, HC-Net performed well in both inference time (approximately 13 s per volume) and accuracy (mean Dice coefficient of 95.46%). The HC-Net model also showed good generalization ability and stability in different modes of brain extraction tasks.</p>
</abstract>
<kwd-group>
<kwd>brain extraction</kwd>
<kwd>deep learning</kwd>
<kwd>hybrid convolution network</kwd>
<kwd>hybrid features</kwd>
<kwd>non-human primate MRI</kwd>
</kwd-group>
<counts>
<fig-count count="11"/>
<table-count count="6"/>
<equation-count count="10"/>
<ref-count count="45"/>
<page-count count="11"/>
<word-count count="8348"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>1. Introduction</title>
<p>With the launch of brain science programs in various countries, macaques have become an important non-human primate animal model (<xref ref-type="bibr" rid="B44">Zhang and Shi, 1993</xref>; <xref ref-type="bibr" rid="B21">Li et al., 2019</xref>). Researchers have conducted invasive experiments (such as electrophysiology, biology, histology, and lesion studies) on macaque brains to verify hypotheses that cannot be tested on human brains (<xref ref-type="bibr" rid="B23">Liu et al., 2018</xref>; <xref ref-type="bibr" rid="B3">Cai et al., 2020</xref>). In brain science research, MRI has become an essential medical technology for studying the brain because it is non-invasive, supports repeated acquisition, and provides rich and detailed tissue information. Brain extraction is one of the initial steps of MRI image processing (<xref ref-type="bibr" rid="B12">Esteban et al., 2019</xref>; <xref ref-type="bibr" rid="B35">Tasserie et al., 2020</xref>). Removing non-brain tissues (skull, muscle, eyes, dura mater, external blood vessels, and nerves) improves the accuracy of subsequent brain image processing steps, such as anatomy-based brain registration, meningeal surface reconstruction, brain volume measurement, and tissue recognition (<xref ref-type="bibr" rid="B40">Xi et al., 2019a</xref>,<xref ref-type="bibr" rid="B41">b</xref>; <xref ref-type="bibr" rid="B1">Autio et al., 2020</xref>; <xref ref-type="bibr" rid="B20">Lepage et al., 2021</xref>). However, the performance of existing brain extraction tools is lacking when applied to the macaque brain (<xref ref-type="bibr" rid="B45">Zhao et al., 2018</xref>).</p>
<p>The particularity of the macaque&#x2019;s brain makes brain extraction more challenging than in humans, mainly in the following aspects: (1) An evolutionary distance of 25 million years makes the brain weight of macaques approximately one-tenth that of humans (<xref ref-type="bibr" rid="B28">Nei et al., 2001</xref>; <xref ref-type="bibr" rid="B11">Donahue et al., 2016</xref>). The narrow and prominent frontal lobe and the eyes surrounded by fatty tissue near the brain make the macaque brain difficult to extract with methods designed for humans, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. (2) The species differences between macaques and humans necessitate increased spatial resolution to achieve comparable anatomical resolution. However, smaller voxels mean a lower signal-to-noise ratio. To improve the signal-to-noise ratio, researchers have collected macaque data at higher field strengths (such as 4.7 T, 7 T, 9.4 T, and 11.7 T). However, ultrahigh field strengths increase the heterogeneity of the B0 and B1 fields, which strongly influences tissue contrast and thus reduces data quality (<xref ref-type="bibr" rid="B36">Van de Moortele et al., 2005</xref>). (3) Different macaque data collection sites use specific collection protocols and equipment (<xref ref-type="bibr" rid="B26">Milham et al., 2018</xref>), resulting in significant differences in data quality and characteristics. To address these challenges, researchers have proposed methods for non-human primate data [e.g., the &#x201C;-monkey&#x201D; option in AFNI (<xref ref-type="bibr" rid="B8">Cox, 1996</xref>) and registration-based methods (<xref ref-type="bibr" rid="B24">Lohmeier et al., 2019</xref>; <xref ref-type="bibr" rid="B18">Jung et al., 2021</xref>)]. However, the final results mostly require manual intervention. Therefore, an automatic, rapid, and robust brain extraction method for macaques is highly desirable in non-human primate studies.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Examples of brain magnetic resonance imaging (MRI) images showing the differences in tissue structure between the human and macaque brain. The red regions denote the brain. Compared with the human brain, the macaque brain has a narrow and prominent frontal lobe and eyes surrounded by adipose tissue.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g001.tif"/>
</fig>
<p>In recent years, following the revival of deep neural networks (<xref ref-type="bibr" rid="B33">Seyedhosseini et al., 2013</xref>; <xref ref-type="bibr" rid="B42">Yan et al., 2015</xref>) and the development of parallel computing (<xref ref-type="bibr" rid="B6">Coates et al., 2013</xref>; <xref ref-type="bibr" rid="B32">Schmidhuber, 2015</xref>), DCNNs have shown excellent performance in various computer vision tasks and have been widely used in medical image segmentation of human body tissue (<xref ref-type="bibr" rid="B45">Zhao et al., 2018</xref>; <xref ref-type="bibr" rid="B39">Wang et al., 2021</xref>, <xref ref-type="bibr" rid="B38">2022</xref>; <xref ref-type="bibr" rid="B16">Huang et al., 2022</xref>; <xref ref-type="bibr" rid="B34">Sun et al., 2022</xref>). DCNN-based methods can be divided into 2D (2D convolution kernel) and 3D (3D convolution kernel) methods. Owing to the excellent feature extraction ability of the 2D convolution kernel, 2D methods can extract features quickly from the original volumes. However, the input of a 2D convolutional network is usually a slice cut along the Z-axis, which ignores the spatial information of the volume data and limits the segmentation ability of the model. To overcome this limitation, 2D methods that make use of information from adjacent slices have been introduced (<xref ref-type="bibr" rid="B25">Lucena et al., 2019</xref>; <xref ref-type="bibr" rid="B43">Zhang et al., 2019</xref>). Specifically, adjacent slices cropped from volumetric images are simply stacked and fed into the 2D networks as a 3D segmentation volume. Although adjacent slices are employed, this is still not enough to probe the spatial information along the third dimension, which leads to the underfitting of 2D convolutions to spatial information. 
In particular, macaque brain samples are often acquired with thick slices, and voxel sizes vary and are frequently anisotropic (e.g., 0.60&#x00D7;1.20&#x00D7;0.60 mm, 0.50&#x00D7;0.55&#x00D7;0.55 mm, 0.75&#x00D7;0.75&#x00D7;0.75 mm), which hampers the extraction of brain tissue from macaque MRI images. To make full use of the context information of 3D medical volume data, 3D DCNNs have been applied to brain image segmentation (<xref ref-type="bibr" rid="B17">Hwang et al., 2019</xref>; <xref ref-type="bibr" rid="B7">Coupeau et al., 2022</xref>). <xref ref-type="bibr" rid="B19">Kleesiek et al. (2016)</xref> first proposed an end-to-end 3D DCNN for human brain extraction, and 2D U-Net was later extended to 3D for 3D datasets (<xref ref-type="bibr" rid="B5">&#x00C7;i&#x00E7;ek et al., 2016</xref>). <xref ref-type="bibr" rid="B27">Milletari et al. (2016)</xref> used a 3D FCN and Dice loss to construct the V-Net network for MR image segmentation, which <xref ref-type="bibr" rid="B10">Dolz et al. (2018)</xref> then used to segment subcortical brain tissue. Compared with 2D networks, 3D networks suffer from high computational costs and GPU memory consumption. The high memory consumption limits the depth of the network as well as the filter&#x2019;s field of view. To take advantage of 3D information while reducing the negative impact of 3D networks, <xref ref-type="bibr" rid="B22">Li et al. (2018)</xref> proposed H-DenseUNet, which uses a 2D DenseUNet to extract intra-slice features and a 3D counterpart to extract inter-slice features, and formulated the learning process in an end-to-end manner. The intra-slice representations and inter-slice features were fused by an HFF layer, which improved the segmentation of the liver and tumors. For human brain data, hundreds of training and validation samples are used to ensure accuracy and relieve the overfitting of the model. 
However, compared with the human brain, the small sample sizes of macaque datasets limit the training of an additional 3D network. On the other hand, <xref ref-type="bibr" rid="B29">Prasoon et al. (2013)</xref> sliced along the X, Y, and Z axes and input the slices of the different axes into three 2D FCNs to compensate for the absence of 3D features in training. Similarly, to reduce training time, <xref ref-type="bibr" rid="B4">Chen et al. (2021)</xref> proposed a triple U-Net composed of three U-Net networks, where the input to each network is a frame of slice data containing three adjacent slices. Two auxiliary U-Net networks supplemented and constrained the training of the main U-Net network, which significantly improved the accuracy of whole-brain segmentation. These studies have shown that combining intra-slice representations and inter-slice features is conducive to improving accuracy in medical image segmentation. However, using multiple networks for feature fusion increases the complexity and computational cost of the overall model. Therefore, a single network that fuses 2D and 3D information, reduces the amount of calculation, and simplifies the training process for macaque brain extraction requires further research.</p>
<p>The present work attempts to overcome the above problems and develop a general deep learning-based brain extraction model for non-human primates. To achieve higher accuracy, the model must efficiently extract intra-slice representations and inter-slice features from limited macaque data, and it should also generalize stably to untrained data sites. The main contributions of our research are threefold. First, to overcome the challenge of small sample sizes, the present work increases the amount of data by slicing the volume data along three axes. Second, our research uses 3D convolutions to extract the brain directly from the &#x201C;Data Block&#x201D;, which represents more spatial features and relieves the underfitting of 2D convolutions to spatial features; compared with directly loading 3D volume data, this method also reduces the amount of computation. Finally, this research proposes an end-to-end hybrid convolutional encoder-decoder model (HC-Net) to balance computation and performance. The model better represents the spatial information of the available data and achieves higher extraction accuracy. In addition, it reduces the learning burden and training complexity.</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>2. Materials and methods</title>
<sec id="S2.SS1">
<title>2.1. Dataset</title>
<p>The MRI macaque data are publicly available from the recent NHP data-sharing consortium - the non-human PRIMate Data Exchange (PRIME-DE) (<xref ref-type="bibr" rid="B26">Milham et al., 2018</xref>). This research selected one anatomical T1w image per macaque. Because the number of samples from any individual site is too small, we used the joint data of multiple sites [Newcastle University Medical School (Newcastle), <italic>N</italic> = 5; the University of California, Davis (ucdavis), <italic>N</italic> = 5; Mount Sinai School of Medicine (Phillips) [Mountsinai-P], <italic>N</italic> = 5; Stem Cell and Brain Research Institute (sbri), <italic>N</italic> = 5; University of Minnesota (UMN), <italic>N</italic> = 2; Institute of Neuroscience (Ion), <italic>N</italic> = 5; East China Normal University Chen (ecnu-chen), <italic>N</italic> = 5] for training and testing (Macaque dataset I, <italic>N</italic> = 32). Data of different field strengths [Lyon Neuroscience Research Center (Lyon), 1.5 T, <italic>N</italic> = 4; Mount Sinai School of Medicine Siemens scanner (mountsinai-S), 3 T, <italic>N</italic> = 5; Newcastle University Medical School (Newcastle), 4.7 T, <italic>N</italic> = 5; University of Minnesota (UMN), 7 T, <italic>N</italic> = 2; University of Western Ontario (UWO), 7 T, <italic>N</italic> = 3] were used as an additional dataset (Macaque dataset II, <italic>N</italic> = 19) to further verify the performance of the model. Detailed information about the data alliance can be found at <ext-link ext-link-type="uri" xlink:href="https://fcon_1000.projects.nitrc.org/indi/indiPRIME.html">https://fcon_1000.projects.nitrc.org/indi/indiPRIME.html</ext-link>. Each selected T1w image was segmented manually to create a ground truth mask.</p>
<p>Human T1w MRI images and macaque B0 images were used to extend the proposed model to brain extraction for different species and modalities. The human data used in the present study are publicly available from the Human Connectome Project (HCP) (<xref ref-type="bibr" rid="B37">Van Essen et al., 2012</xref>), WU-Minn 1,200 subjects data release. In this study, the training dataset included 50 T1w subjects, and the test dataset included 17 subjects. The ground truth masks were created from their corresponding brain tissue files. The macaque B0 images were obtained from diffusion-weighted imaging (DWI) data in the UWM dataset of PRIME-DE. To improve image quality, head motion correction, eddy current correction, and gradient direction correction were carried out on the original DWI images. In particular, to reduce the workload of manual brain extraction, this work registered the existing T1w image masks to the B0 images to eliminate most non-brain tissue. After this step, the B0 images still contained non-brain tissues such as the eyeballs and fat, so ground truth masks were made manually for the B0 images. Samples from 25 macaques were used for training, and 10 were used for testing.</p>
</sec>
<sec id="S2.SS2">
<title>2.2. &#x201C;Data Block&#x201D; pre-processing</title>
<p>The macaque MRI image is a three-dimensional volume, and there is context information between consecutive slices. If slices are input into the model independently during training, the dependency between slices is destroyed. Therefore, this research took three consecutive slices as a &#x201C;Data Block&#x201D; to maintain the relationship between slices and to smooth the contour of brain tissue through constraints from adjacent slices. <xref ref-type="fig" rid="F2">Figure 2</xref> shows the construction process of the &#x201C;Data Block&#x201D;. We continuously read three adjacent 2D slices <italic>s</italic>&#x2212;1,<italic>s</italic>,<italic>s</italic> + 1 (<italic>s</italic> &#x2208; [2,<italic>N</italic>&#x2212;1], where <italic>N</italic> is the number of sample slices) as a &#x201C;Data Block&#x201D; in grayscale mode. This research took the label of the middle slice of the block as the block&#x2019;s label. The reading step was set to 1; that is, the <italic>i</italic><sup><italic>th</italic></sup> block was (<italic>s</italic>&#x2212;1,<italic>s</italic>,<italic>s</italic> + 1) and the (<italic>i</italic> + 1)<sup><italic>th</italic></sup> block was (<italic>s</italic>,<italic>s</italic> + 1,<italic>s</italic> + 2).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>The construction process of the &#x201C;Data Block&#x201D;.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g002.tif"/>
</fig>
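As a concrete illustration, the sliding-window grouping described above can be sketched in NumPy; the function and array names here are illustrative, not taken from the authors' code:

```python
import numpy as np

def make_data_blocks(volume, mask):
    """Group three consecutive slices into overlapping "Data Blocks".

    volume, mask: arrays of shape (num_slices, H, W). Each block holds
    slices (s-1, s, s+1); its label is the mask of the middle slice, and
    the reading step is 1, so consecutive blocks overlap by two slices.
    """
    blocks, labels = [], []
    # s ranges over the interior slices (s in [2, N-1] in 1-based indexing)
    for s in range(1, volume.shape[0] - 1):
        blocks.append(volume[s - 1:s + 2])   # shape (3, H, W)
        labels.append(mask[s])               # middle-slice label
    return np.stack(blocks), np.stack(labels)

vol = np.random.rand(10, 256, 256).astype(np.float32)
msk = (vol > 0.5).astype(np.uint8)
x, y = make_data_blocks(vol, msk)
print(x.shape, y.shape)  # (8, 3, 256, 256) (8, 256, 256)
```

A volume of N slices thus yields N&#x2212;2 blocks per axis, which is what makes the reading step of 1 an inexpensive form of data augmentation.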
<p>To increase the training data, each sample was sliced along the coronal, sagittal, and horizontal planes. The final dataset was obtained by combining the slices of the three planes. The initial T1w data are anisotropic. To merge the probability maps of the three axes, we resampled each slice to a size of 256&#x00D7;256 using bilinear interpolation. Furthermore, to reduce the heterogeneity between the data of different sites and improve image quality, the intensity values were normalized to the range between 0 and 1. T1w MRI images of humans and B0 images of macaques were pre-processed in the same way.</p>
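A minimal sketch of this per-slice pre-processing, assuming simple bilinear resampling to 256&#x00D7;256 and min-max intensity normalization (in practice a standard imaging library would be used; the function names are illustrative):

```python
import numpy as np

def resize_bilinear(img, size=256):
    """Resample a 2D slice to (size, size) by bilinear interpolation."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, size)
    xs = np.linspace(0, w - 1, size)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]                  # vertical blend weights
    wx = (xs - x0)[None, :]                  # horizontal blend weights
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def normalize_intensity(img):
    """Min-max normalize intensities into [0, 1]."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)

slice2d = np.random.rand(192, 160).astype(np.float32)
out = normalize_intensity(resize_bilinear(slice2d))
print(out.shape)  # (256, 256), values in [0, 1]
```

Per-slice min-max normalization is one plausible reading of the intensity standardization step; site-wise or volume-wise normalization would follow the same pattern.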
</sec>
<sec id="S2.SS3">
<title>2.3. HC-Net network</title>
<p>Although 2D convolutions have achieved great success in many segmentation tasks, they are incapable of exploring inter-slice information. To this end, this study first uses two 3D convolution blocks to obtain more context information of slices from the original volumes. Then, to reduce the amount of computation and increase the receptive field of the network, the proposed model employs the encoding and decoding structure to construct the HC-Net network, which can be trained with a small dataset. Finally, the skip connection retains the details after each encoding to reduce the loss of bottom features and integrate multiscale information to improve performance.</p>
<p>The network structure of HC-Net is shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. It includes an encoding path and a decoding path. The encoding path comprises five encoders, each composed of two convolution layers. The first and second encoders are 3D convolution modules, with each convolution followed by a normalization operation and a ReLU activation function. With 3D convolution kernels, these encoders can better extract spatial information from the original volumes. The third to fifth encoders are 2D convolution modules, which significantly reduce the number of parameters and the complexity of model training. To achieve this, the <italic>To</italic>_4<italic>D</italic> operation is used to convert tensors between dimensionalities. A max-pooling operation follows each encoder to reduce the image resolution along the encoding path. The decoding path contains four decoders: the first two are 2D modules, and the last two are 3D modules. Each decoder is connected with a transposed convolution and a ReLU activation function for upsampling. After upsampling, each decoder concatenates the feature maps of the corresponding size from the encoding path via skip connections. After the third upsampling, the <italic>To</italic>_5<italic>D</italic> operation converts the feature maps into 5-dimensional tensors, which are concatenated with the corresponding feature maps in the encoding path. The hybrid feature maps are then input into the 3D decoders. After the last decoder, this study employs a 1&#x00D7;1&#x00D7;1 convolution layer to map the final feature maps to a two-class map. Finally, a softmax layer is used to obtain the probability map of the brain tissue.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Architecture of the hybrid convolutional neural network (HC-Net). Given the input &#x201C;Data Block&#x201D;, two 3D encoders are first used to obtain more context information of slices. Then, 2D encoders are utilized to reduce the amount of computation. Finally, the encoding and decoding structure are used to increase the receptive field of the network and to restore the size of the original image.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g003.tif"/>
</fig>
<p>We denote the input training samples as <italic>IN</italic> &#x2208; <italic>R</italic><sup><italic>N</italic>&#x00D7;<italic>C</italic>&#x00D7;<italic>D</italic>&#x00D7;<italic>H</italic>&#x00D7;<italic>W</italic></sup> with ground truth labels <italic>G</italic> &#x2208; <italic>R</italic><sup><italic>N</italic>&#x00D7;<italic>C</italic>&#x00D7;<italic>D</italic>&#x00D7;<italic>H</italic>&#x00D7;<italic>W</italic></sup>, where <italic>N</italic> denotes the batch size of the input training samples, <italic>C</italic> denotes the channel, and <italic>D</italic>&#x00D7;<italic>H</italic>&#x00D7;<italic>W</italic> denotes the size of the samples. <italic>G</italic>(<italic>x</italic>,<italic>y</italic>,<italic>z</italic>) = 0 or 1 indicates that the voxel (<italic>x</italic>,<italic>y</italic>,<italic>z</italic>) is tagged as brain (1) or non-brain (0). Let <italic>IN</italic><sub>3d</sub> &#x2208; <italic>R</italic><sup><italic>N</italic>&#x00D7;1&#x00D7;3&#x00D7;256&#x00D7;256</sup> denote the input of the first encoder and <italic>F</italic><sub>3d</sub> denote a sequence of 3D convolution, batch normalization, and activation operations. The learning process of the 3D convolutions in the encoding path can be described as follows:</p>
<disp-formula id="S2.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">e2</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi mathvariant="normal">F</mml:mi>
<mml:mrow>
<mml:mn>3</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">IN</mml:mtext>
<mml:mrow>
<mml:mn mathvariant="bold">3</mml:mn>
<mml:mi mathvariant="bold">d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">e2</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">R</mml:mtext>
<mml:mrow>
<mml:mi mathvariant="normal">N</mml:mi>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>32</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>64</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>X</italic><sub><italic>e2</italic></sub> represents the features after the second encoder. The <italic>To</italic>_4<italic>D</italic> operation converts the 5-dimensional tensors into 4-dimensional tensors by stacking the batch and depth dimensions so that they can be input into the 2D encoder. First, the depth dimension is recorded, and the channel and depth dimensions are swapped. Second, the data are split along the batch dimension and concatenated along the depth dimension. Finally, the tensor is squeezed into a 4-dimensional tensor. The details of the <italic>To</italic>_4<italic>D</italic> operation are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. The <italic>To</italic>_4<italic>D</italic> operation is as follows:</p>
<disp-formula id="S2.E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:msubsup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">e2</mml:mi>
<mml:msup>
<mml:mi/>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi mathvariant="normal">_</mml:mi>
<mml:mn>4</mml:mn>
<mml:mi>D</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">e2</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">e2</mml:mi>
<mml:msup>
<mml:mi/>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msubsup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">R</mml:mtext>
<mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">N</mml:mi>
</mml:mrow>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>32</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>64</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Details of the To_4D operation and To_5D operation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g004.tif"/>
</fig>
<p>where <inline-formula><mml:math id="INEQ39"><mml:msubsup><mml:mtext mathvariant="bold">X</mml:mtext><mml:mi mathvariant="bold">e2</mml:mi><mml:msup><mml:mi/><mml:mo>&#x2032;</mml:mo></mml:msup></mml:msubsup></mml:math></inline-formula> denotes the input data of the third encoder.</p>
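Read this way, the To_4D/To_5D pair amounts to folding the depth dimension into the batch dimension and unfolding it again, so a 5D block of N samples with depth D becomes N&#x00D7;D independent 2D feature maps. A NumPy sketch under that reading (function names are illustrative, not the authors' implementation):

```python
import numpy as np

def to_4d(x):
    """(N, C, D, H, W) -> (N*D, C, H, W): swap the channel and depth
    dimensions, then stack the depth slices along the batch dimension.
    Returns the reshaped tensor and D so to_5d can undo the operation."""
    n, c, d, h, w = x.shape
    return np.transpose(x, (0, 2, 1, 3, 4)).reshape(n * d, c, h, w), d

def to_5d(x, d):
    """Inverse of to_4d: (N*D, C, H, W) -> (N, C, D, H, W)."""
    nd, c, h, w = x.shape
    return np.transpose(x.reshape(nd // d, d, c, h, w), (0, 2, 1, 3, 4))

# Matches Eq. (1)-(2): X_e2 of shape N x 32 x 3 x 64 x 64 becomes 3N x 32 x 64 x 64.
x = np.random.rand(2, 32, 3, 64, 64).astype(np.float32)
y, d = to_4d(x)
print(y.shape)  # (6, 32, 64, 64)
assert np.array_equal(to_5d(y, d), x)  # lossless round trip
```

Because the round trip is lossless, the 2D encoders and decoders can process each slice independently while the 3D modules on either side still see the restored depth dimension.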
<p><italic>F</italic><sub>2<italic>d</italic></sub> denotes a sequence of 2D convolution, batch normalization, and activation operations, and the training process of the 2D encoders and decoders can be denoted as:</p>
<disp-formula id="S2.E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">d3</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="normal">F</mml:mi>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">e2</mml:mi>
<mml:msup>
<mml:mi/>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msubsup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">d3</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">R</mml:mtext>
<mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">N</mml:mi>
</mml:mrow>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>32</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <bold>X<sub>d3</sub></bold> denotes the feature maps from the third upsampling layer. The 4D tensor is converted into the 5D tensor by splitting the batch size dimension to restore the depth dimension,</p>
<disp-formula id="S2.E4">
<label>(4)</label>
<mml:math id="M4">
<mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">d3</mml:mi>
<mml:msup>
<mml:mi/>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>o</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">_</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">d3</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">d3</mml:mi>
<mml:msup>
<mml:mi/>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msubsup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">R</mml:mtext>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>32</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>D</italic> denotes the depth dimension from the <italic>To</italic>_4<italic>D</italic> operation and <inline-formula><mml:math id="INEQ44"><mml:msubsup><mml:mtext mathvariant="bold">X</mml:mtext><mml:mi mathvariant="bold">d3</mml:mi><mml:msup><mml:mi/><mml:mo>&#x2032;</mml:mo></mml:msup></mml:msubsup></mml:math></inline-formula> denotes the 5D feature volume from the third upsampling layer. The details of the <italic>To</italic>_5<italic>D</italic> operation are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. Notably, after the 2D convolutions in the decoding path, the 3D decoder is trained not only on the features detected by the 2D decoder but also on the 3D context features from the second encoder. The hybrid features from the 2D and 3D convolutions are jointly learned after the third upsampling layer in the decoding path. The hybrid operation can be described as follows:</p>
<disp-formula id="S2.E5">
<label>(5)</label>
<mml:math id="M5">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">h</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">d3</mml:mi>
<mml:msup>
<mml:mi/>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:msup>
<mml:mi mathvariant="bold">e2</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msub>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">h</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi mathvariant="normal">R</mml:mi>
<mml:mrow>
<mml:mi mathvariant="normal">N</mml:mi>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>64</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="normal">X</mml:mi>
<mml:msup>
<mml:mi>e2</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi mathvariant="normal">R</mml:mi>
<mml:mrow>
<mml:mi mathvariant="normal">N</mml:mi>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>32</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
<mml:mo>&#x00D7;</mml:mo>
<mml:mn>128</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>X</italic><sub><italic>h</italic></sub> denotes the hybrid features and <italic>X</italic><sub><italic>e</italic>2&#x2032;</sub> denotes the output of the second encoder before max-pooling. More details of the HC-Net network are given in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
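<p>The <italic>To</italic>_4<italic>D</italic>/<italic>To</italic>_5<italic>D</italic> conversions of Equations (3), (4) and the hybrid fusion of Equation (5) can be sketched in PyTorch. This is a minimal illustration, not the authors' released code; the function names are ours, and since the 32 + 32 channels yield 64 (the &#x201C;Concatenate&#x201D; row of Table 1), the fusion is modeled here as channel concatenation.</p>

```python
import torch

def to_4d(x):
    # Fold the depth dimension into the batch dimension:
    # (N, C, D, H, W) -> (N*D, C, H, W), so 2D convolutions run per slice.
    n, c, d, h, w = x.shape
    return x.permute(0, 2, 1, 3, 4).reshape(n * d, c, h, w)

def to_5d(x, depth):
    # Inverse of to_4d: split the batch dimension to restore depth,
    # (N*D, C, H, W) -> (N, C, D, H, W).
    nd, c, h, w = x.shape
    return x.reshape(nd // depth, depth, c, h, w).permute(0, 2, 1, 3, 4)

# Hybrid fusion after the third upsampling layer: the restored 5D decoder
# features are combined with the second encoder's output (before pooling).
# Channel concatenation (32 + 32 -> 64) matches Table 1's feature sizes.
x_e2 = torch.randn(2, 32, 3, 128, 128)                 # N = 2 toy batch
x_d3 = to_5d(torch.randn(2 * 3, 32, 128, 128), depth=3)
x_h = torch.cat([x_d3, x_e2], dim=1)
assert x_h.shape == (2, 64, 3, 128, 128)
```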
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Architectures of the proposed HC-Net. The feature size column indicates the output size of the current stage. &#x201C;(3<italic>D</italic>,3&#x00D7;3&#x00D7;3,1,1)&#x00D7;2&#x201D; corresponds to the 3D convolution with two convolution kernels of 3&#x00D7;3&#x00D7;3, stride 1, and padding 1. &#x201C;MaxPool3d, 1&#x00D7;2&#x00D7;2,1&#x00D7;2&#x00D7;2&#x201D; corresponds to max-pooling with a sliding window size of 1&#x00D7;2&#x00D7;2 and stride 1&#x00D7;2&#x00D7;2. &#x201C;2<italic>DCT</italic>,4&#x00D7;4,2,1&#x201D; corresponds to 2D transpose convolution with a kernel of 4&#x00D7;4, stride 2, and padding 1.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;"></td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">HC-Net</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Feature size</td>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;"></td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">HC-Net</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Feature size</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Input</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">3&#x00D7;256&#x00D7;256</td>
<td valign="top" align="left">Encode 1</td>
<td valign="top" align="center">(2<italic>D</italic>,3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">128&#x00D7;32&#x00D7;32</td>
</tr>
<tr>
<td valign="top" align="left"><italic>Encoder1</italic></td>
<td valign="top" align="center">(3<italic>D</italic>,3&#x00D7;3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">16&#x00D7;3&#x00D7;256&#x00D7;256</td>
<td valign="top" align="left">Upconv</td>
<td valign="top" align="center">2<italic>DCT</italic>,4&#x00D7;4,2,1</td>
<td valign="top" align="center">64&#x00D7;64&#x00D7;64</td>
</tr>
<tr>
<td valign="top" align="left"><italic>MaxPool3d</italic></td>
<td valign="top" align="center">1&#x00D7;2&#x00D7;2,1&#x00D7;2&#x00D7;2</td>
<td valign="top" align="center">16&#x00D7;3&#x00D7;128&#x00D7;128</td>
<td valign="top" align="left">Concatenate</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">128&#x00D7;64&#x00D7;64</td>
</tr>
<tr>
<td valign="top" align="left"><italic>Encoder2</italic></td>
<td valign="top" align="center">(3<italic>D</italic>,3&#x00D7;3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">32&#x00D7;3&#x00D7;128&#x00D7;128</td>
<td valign="top" align="left">Decoder 2</td>
<td valign="top" align="center">(2<italic>D</italic>,3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">64&#x00D7;64&#x00D7;64</td>
</tr>
<tr>
<td valign="top" align="left"><italic>MaxPool3d</italic></td>
<td valign="top" align="center">1&#x00D7;2&#x00D7;2,1&#x00D7;2&#x00D7;2</td>
<td valign="top" align="center">32&#x00D7;3&#x00D7;64&#x00D7;64</td>
<td valign="top" align="left">Upconv</td>
<td valign="top" align="center">2<italic>DCT</italic>,4&#x00D7;4,2,1</td>
<td valign="top" align="center">32&#x00D7;128&#x00D7;128</td>
</tr>
<tr>
<td valign="top" align="left"><italic>To</italic>_4D</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">3&#x00D7;32&#x00D7;64&#x00D7;64</td>
<td valign="top" align="left">To_5D</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">32&#x00D7;3&#x00D7;128&#x00D7;128</td>
</tr>
<tr>
<td valign="top" align="left"><italic>Encoder3</italic></td>
<td valign="top" align="center">(2<italic>D</italic>,3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">64&#x00D7;64&#x00D7;64</td>
<td valign="top" align="left">Concatenate</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">64&#x00D7;3&#x00D7;128&#x00D7;128</td>
</tr>
<tr>
<td valign="top" align="left">MaxPool2d</td>
<td valign="top" align="center">2&#x00D7;2</td>
<td valign="top" align="center">64&#x00D7;32&#x00D7;32</td>
<td valign="top" align="left">Decoder3</td>
<td valign="top" align="center">(3<italic>D</italic>,3&#x00D7;3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">32&#x00D7;3&#x00D7;128&#x00D7;128</td>
</tr>
<tr>
<td valign="top" align="left">Encoder 4</td>
<td valign="top" align="center">(2<italic>D</italic>,3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">128&#x00D7;32&#x00D7;32</td>
<td valign="top" align="left">Upconv</td>
<td valign="top" align="center">3<italic>DCT</italic>,3&#x00D7;4&#x00D7;4,2,1</td>
<td valign="top" align="center">16&#x00D7;3&#x00D7;128&#x00D7;128</td>
</tr>
<tr>
<td valign="top" align="left">MaxPool2d</td>
<td valign="top" align="center">2&#x00D7;2</td>
<td valign="top" align="center">128&#x00D7;16&#x00D7;16</td>
<td valign="top" align="left">Concatenate</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">32&#x00D7;3&#x00D7;256&#x00D7;256</td>
</tr>
<tr>
<td valign="top" align="left">Encoder 5</td>
<td valign="top" align="center">(2<italic>D</italic>,3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">256&#x00D7;16&#x00D7;16</td>
<td valign="top" align="left">Decoder 4</td>
<td valign="top" align="center">(3<italic>D</italic>,3&#x00D7;3&#x00D7;,1,1)&#x00D7;2(3<italic>D</italic>,3&#x00D7;3&#x00D7;3,1,1)&#x00D7;2</td>
<td valign="top" align="center">16&#x00D7;3&#x00D7;256&#x00D7;256</td>
</tr>
<tr>
<td valign="top" align="left">Upconv</td>
<td valign="top" align="center">2<italic>DCT</italic>,4&#x00D7;4,2,1</td>
<td valign="top" align="center">128&#x00D7;32&#x00D7;32</td>
<td valign="top" align="left">Out layer</td>
<td valign="top" align="center">3<italic>D</italic>,3&#x00D7;3&#x00D7;3,1,1</td>
<td valign="top" align="center">2&#x00D7;3&#x00D7;256&#x00D7;256</td>
</tr>
<tr>
<td valign="top" align="left">Concatenate</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">256&#x00D7;32&#x00D7;32</td>
<td valign="top" align="left">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S2.SS4">
<title>2.4. Loss function and evaluation indicators</title>
<p>The cross-entropy loss function was employed as the loss function in this study to train the networks, which can be described as:</p>
<disp-formula id="S2.E6">
<label>(6)</label>
<mml:math id="M6">
<mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>o</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>,</mml:mo>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:munder>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mi>i</mml:mi>
</mml:munder>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
</mml:msubsup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:msubsup>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <inline-formula><mml:math id="INEQ115"><mml:msubsup><mml:mi>y</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi></mml:msubsup></mml:math></inline-formula> indicates the ground truth label for voxel <italic>i</italic> (brain or non-brain) and <inline-formula><mml:math id="INEQ117"><mml:msubsup><mml:mi>p</mml:mi><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:msubsup></mml:math></inline-formula> denotes the probability of voxel <italic>i</italic> belonging to the brain.</p>
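<p>Equation (6) is the standard per-voxel binary cross-entropy. A minimal sketch follows; the function name is illustrative, and the result coincides with PyTorch's built-in <monospace>binary_cross_entropy</monospace>.</p>

```python
import torch

def voxel_bce(y, p, eps=1e-7):
    # y: ground-truth labels (0 = non-brain, 1 = brain); p: predicted
    # probability of the brain class. Shapes must match; the loss is
    # averaged over all voxels, as in Eq. (6).
    p = p.clamp(eps, 1 - eps)  # guard against log(0)
    return -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()

y = torch.tensor([[1.0, 0.0], [1.0, 1.0]])
p = torch.tensor([[0.9, 0.2], [0.8, 0.7]])
loss = voxel_bce(y, p)
# agrees with torch.nn.functional.binary_cross_entropy(p, y)
assert torch.isclose(loss, torch.nn.functional.binary_cross_entropy(p, y))
```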
<p>In this paper, true positive (<italic>TP</italic>), true negative (<italic>TN</italic>), false positive (<italic>FP</italic>), and false negative (<italic>FN</italic>) were used to mark the comparison between the extraction result and the ground truth. The Dice coefficient (<italic>Dice</italic>), sensitivity (<italic>Sen</italic>), specificity (<italic>Spe</italic>), and volumetric overlap error (<italic>VOE</italic>), which are commonly used to evaluate model performance in medical image segmentation, can be formulated as follows:</p>
<disp-formula id="S2.E7">
<label>(7)</label>
<mml:math id="M7">
<mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E8">
<label>(8)</label>
<mml:math id="M8">
<mml:mrow>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E9">
<label>(9)</label>
<mml:math id="M9">
<mml:mrow>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>y</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E10">
<label>(10)</label>
<mml:math id="M10">
<mml:mrow>
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>O</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>The Dice coefficient measures the similarity between two samples; its value ranges from 0 to 1, and a higher value indicates greater similarity. It represents the ratio of twice the intersection of the two samples to their total area. Sensitivity measures the ability of the extraction algorithm to correctly identify the brain, i.e., the proportion of brain-tissue voxels that are correctly judged as brain tissue. Specificity measures the ability of the brain extraction algorithm to correctly identify non-brain tissue, i.e., the proportion of non-brain voxels that are correctly judged as non-brain tissue. The lower the volumetric overlap error, the higher the sample similarity.</p>
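<p>The four indicators of Equations (7)&#x2013;(10) follow directly from the voxel counts. A minimal sketch with an illustrative function name and toy counts:</p>

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Dice, sensitivity, specificity, and VOE per Eqs. (7)-(10)."""
    dice = 2 * tp / (2 * tp + fp + fn)   # overlap, higher is better
    sen = tp / (tp + fn)                 # brain voxels correctly found
    spe = tn / (fp + tn)                 # non-brain voxels correctly found
    voe = 1 - tp / (fn + tp + fp)        # volumetric overlap error, lower is better
    return dice, sen, spe, voe

# Toy voxel counts for illustration only (not results from the paper).
dice, sen, spe, voe = segmentation_metrics(tp=90, tn=880, fp=10, fn=20)
```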
</sec>
</sec>
<sec id="S3">
<title>3. Experiments and results</title>
<sec id="S3.SS1">
<title>3.1. Implementation details</title>
<p>The HC-Net model was implemented with the PyTorch framework and ran on an NVIDIA RTX 3090 GPU. It was trained end-to-end, which means that the &#x201C;Data Block&#x201D; was provided as input without any other processing or additional network. The initial minimum learning rate was 1&#x00D7;10<sup>&#x2212;4</sup>, the training batch size was 20, and the number of training epochs was 50. To train this network, this research employed the cross-entropy loss function to calculate the loss between the ground-truth labels and the predicted labels of the &#x201C;Data Block&#x201D;. The training time of the HC-Net model was approximately 4 h. For a fair comparison, state-of-the-art methods such as SegNet (<xref ref-type="bibr" rid="B2">Badrinarayanan et al., 2017</xref>), 2D U-Net (<xref ref-type="bibr" rid="B31">Ronneberger et al., 2015</xref>), 3D U-Net (<xref ref-type="bibr" rid="B5">&#x00C7;i&#x00E7;ek et al., 2016</xref>), U<sup>2</sup>-Net (<xref ref-type="bibr" rid="B30">Qin et al., 2020</xref>), and UNet 3+ (<xref ref-type="bibr" rid="B15">Huang et al., 2020</xref>) were trained with the same training data and tested on the same test data in all experiments.</p>
</sec>
<sec id="S3.SS2">
<title>3.2. Comparison with other methods</title>
<p>In this section, we conduct comprehensive experiments to analyze the effectiveness of our proposed method on dataset I.</p>
<p><xref ref-type="fig" rid="F5">Figure 5</xref> shows the training losses of 2D U-Net, 3D U-Net, and HC-Net. The loss converged faster for the HC-Net model than for the 2D U-Net and 3D U-Net models, and the loss of the HC-Net model was also the smallest. <xref ref-type="fig" rid="F6">Figure 6</xref> shows that the Dice coefficients of the HC-Net model were more stable on the validation dataset. As shown in <xref ref-type="fig" rid="F5">Figures 5</xref>, <xref ref-type="fig" rid="F6">6</xref>, the performance of the 3D U-Net model was always lower than that of the 2D U-Net model at the same epoch (higher loss and lower Dice coefficients across 50 epochs), which highlights the effectiveness and efficiency of 2D convolutions; the 3D convolutions consume a large amount of GPU memory, so the 3D network converged slowly. After the 40th epoch, the validation Dice coefficients of the 3D U-Net model were relatively stable, but the values remained below 0.9, indicating weak expressiveness. The HC-Net model showed a higher Dice coefficient, and its performance tended to be stable. These results indicate that our model achieved better performance in macaque brain extraction.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Loss in the training process of hybrid convolutional neural network (HC-Net), 2D U-Net, and 3D U-Net.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g005.tif"/>
</fig>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption><p>Dice coefficients on the validation set.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g006.tif"/>
</fig>
<p>The performance of different brain extraction methods was evaluated using T1w images of 12 samples. These samples were not used for training and came from different sites than the training sets. The extraction results from all test volumes were obtained and compared with the ground truth labels. <xref ref-type="table" rid="T2">Table 2</xref> shows the averages of the indicators. The HC-Net model outperformed the state-of-the-art methods by a clear margin in Dice. All the evaluation indicators of the HC-Net network were higher than those of FSL and AFNI. In terms of the Dice coefficient, the proposed method was approximately 0.30&#x2013;10.48% higher than SegNet, 2D U-Net, 3D U-Net, U<sup>2</sup>-Net, and UNet3+. The experimental results confirmed that our model can robustly handle each example by incorporating the advantages of 3D and 2D convolutions for learning intra-slice and inter-slice feature representations.</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Evaluation results of different methods.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;">Method</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Dice</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Sen</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Spe</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">VOE</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">FSL</td>
<td valign="top" align="center">0.7832</td>
<td valign="top" align="center">0.9372</td>
<td valign="top" align="center">0.9678</td>
<td valign="top" align="center">0.3251</td>
</tr>
<tr>
<td valign="top" align="left">AFNI</td>
<td valign="top" align="center">0.8471</td>
<td valign="top" align="center">0.8180</td>
<td valign="top" align="center">0.9928</td>
<td valign="top" align="center">0.2241</td>
</tr>
<tr>
<td valign="top" align="left">SegNet</td>
<td valign="top" align="center">0.9427</td>
<td valign="top" align="center">0.9652</td>
<td valign="top" align="center">0.9940</td>
<td valign="top" align="center">0.1079</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">0.9454</td>
<td valign="top" align="center">0.9775</td>
<td valign="top" align="center">0.9929</td>
<td valign="top" align="center">0.1030</td>
</tr>
<tr>
<td valign="top" align="left">3D U-Net</td>
<td valign="top" align="center">0.8498</td>
<td valign="top" align="center">0.8234</td>
<td valign="top" align="center">0.9891</td>
<td valign="top" align="center">0.2197</td>
</tr>
<tr>
<td valign="top" align="left">U<sup>2</sup>-Net</td>
<td valign="top" align="center">0.9516</td>
<td valign="top" align="center">0.9787</td>
<td valign="top" align="center">0.9939</td>
<td valign="top" align="center">0.1068</td>
</tr>
<tr>
<td valign="top" align="left">UNet3+</td>
<td valign="top" align="center">0.9430</td>
<td valign="top" align="center">0.9775</td>
<td valign="top" align="center">0.9932</td>
<td valign="top" align="center">0.1121</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">0.9546</td>
<td valign="top" align="center">0.9464</td>
<td valign="top" align="center">0.9973</td>
<td valign="top" align="center">0.0860</td>
</tr>
</tbody>
</table></table-wrap>
<p><xref ref-type="table" rid="T3">Table 3</xref> shows the parameters of each model and the average inference time per macaque brain for each network. Compared with 3D U-Net, SegNet, U<sup>2</sup>-Net, and UNet3+, the proposed model had fewer parameters and a shorter average inference time. Although the U<sup>2</sup>-Net model gave the next best performance in <xref ref-type="table" rid="T2">Table 2</xref>, it had at least 8.13 times more parameters and required 1.32 times the inference time of the HC-Net network. With fewer parameters, the HC-Net model had an advantage in computational efficiency, with the added advantage of being trainable on a smaller dataset without compromising performance. Its parameter count and inference time were higher than those of the 2D U-Net model, which is reasonable because of the 3D convolutions in the HC-Net network. However, the accuracy of HC-Net was higher than that of the 2D U-Net model, which also confirmed that adding context information to our HC-Net model was effective for the extraction of macaque brains.</p>
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>Comparison of weights and testing times of different methods on the test dataset.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;">Model</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Weights (M)</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Inference time (minute)</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">SegNet</td>
<td valign="top" align="center">24.9444</td>
<td valign="top" align="center">1.1573</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">2.4666</td>
<td valign="top" align="center">0.1120</td>
</tr>
<tr>
<td valign="top" align="left">3D U-Net</td>
<td valign="top" align="center">8.0826</td>
<td valign="top" align="center">0.5436</td>
</tr>
<tr>
<td valign="top" align="left">U<sup>2</sup>-Net</td>
<td valign="top" align="center">44.0237</td>
<td valign="top" align="center">0.3020</td>
</tr>
<tr>
<td valign="top" align="left">UNet 3+</td>
<td valign="top" align="center">26.9747</td>
<td valign="top" align="center">6.6886</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">5.4117</td>
<td valign="top" align="center">0.2273</td>
</tr>
</tbody>
</table></table-wrap>
<p>The box plots in <xref ref-type="fig" rid="F7">Figure 7</xref> show that the median Dice coefficient of HC-Net was the highest, with no prominent oscillation on the test dataset, indicating good generalization and stability. Furthermore, <xref ref-type="fig" rid="F8">Figure 8</xref> shows sample brain-extraction results obtained by the present study and other methods on dataset I. The blue areas represent true positives, the red areas false positives, and the green areas false negatives. Compared with FSL, AFNI performed better by using the 3dskullstrip (-monkey) command dedicated to brain extraction. AFNI tended to be conservative in brain extraction and showed lower sensitivity, while FSL had low specificity and retained too many non-brain voxels. The 3D U-Net recognized some of the skull as brain; a possible reason is that, for low-quality, small-sample macaque data, 3D convolution also introduced more noise while retaining more information, resulting in over-segmentation. HC-Net generated a relatively complete mask without extending into the skull or missing parts of the brain tissue.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption><p>Box diagram of different methods.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g007.tif"/>
</fig>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption><p>Comparison of segmentation results of different methods. The blue areas mean true positive (TP); the red areas mean false positive (FP); the green areas mean false negative (FN); and the rest areas mean true negative (TN).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g008.tif"/>
</fig>
</sec>
<sec id="S3.SS3">
<title>3.3. Evaluating the &#x201C;Data Block&#x201D;</title>
<p>This section used different data loading methods on dataset I to verify the effectiveness of the &#x201C;Data Block&#x201D; used in this paper. Experiments were carried out on the HC-Net model and the 2D U-Net model, which have similar numbers of parameters, and the same hyperparameters (learning rate, loss function, etc.) were used in all experiments. This research used three pre-processing methods. The first method was to slice along the Z-axis of the volume data and input one slice at a time (1A1S) into the 2D U-Net model. The second method was to slice along the Z-axis and input three adjacent slices (1A3S) into the HC-Net and 2D U-Net models. To enlarge the dataset, the step was set to 1, that is, (<italic>s</italic>&#x2212;1,<italic>s</italic>,<italic>s</italic> + 1), (<italic>s</italic>,<italic>s</italic> + 1,<italic>s</italic> + 2), where &#x201C;<italic>s</italic>&#x201D; represents the slice number. The third method (3A3S) was the &#x201C;Data Block&#x201D; introduced in Section 2, which adds slices along the X and Y axes to the second method.</p>
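<p>The 1A3S and 3A3S slicing schemes can be sketched as follows. This is a minimal illustration assuming a cubic volume; <monospace>adjacent_triplets</monospace> and <monospace>data_blocks</monospace> are hypothetical names, not the authors' code.</p>

```python
import numpy as np

def adjacent_triplets(volume, axis=0):
    # 1A3S sampling: slide a 3-slice window with step 1 along one axis,
    # yielding the blocks (s-1, s, s+1), (s, s+1, s+2), ...
    vol = np.moveaxis(volume, axis, 0)
    return [vol[i:i + 3] for i in range(vol.shape[0] - 2)]

def data_blocks(volume):
    # 3A3S ("Data Block"): triplets from all three axes,
    # tripling the amount of training data.
    return [t for ax in range(3) for t in adjacent_triplets(volume, ax)]

vol = np.zeros((8, 8, 8))
assert len(adjacent_triplets(vol)) == 6   # 8 slices -> 6 sliding windows
assert len(data_blocks(vol)) == 18        # three axes
```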
<p><xref ref-type="table" rid="T4">Table 4</xref> shows the experimental results. By inputting slices from all three axes, the amount of data was three times that of a single axis. Compared with the first method, the second method increased the Dice coefficient of the 2D U-Net model by 4.09%. Compared with 2D U-Net, inputting adjacent slices into the HC-Net model with the second method dramatically improved the extraction results on the test data: the Dice coefficient, sensitivity, and specificity increased by 11.54, 5.98, and 3.47%, respectively, and the VOE decreased by 16.42%. These results showed that, with the second method, the HC-Net model can extract more features from limited data and improve the extraction results. They also showed that inputting three-axis data into the HC-Net and 2D U-Net models improved the accuracy of brain extraction as the training data increased. At the same time, HC-Net was more sensitive to data with spatial features and could fully exploit them. Therefore, the &#x201C;Data Block&#x201D; pre-processing method was appropriate for the proposed model and further improved its performance.</p>
<table-wrap position="float" id="T4">
<label>TABLE 4</label>
<caption><p>The results of different pre-processing methods.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;">Model</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Data</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Dice</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Sen</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Spe</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">VOE</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">1A1S</td>
<td valign="top" align="center">0.7612</td>
<td valign="top" align="center">0.8335</td>
<td valign="top" align="center">0.9599</td>
<td valign="top" align="center">0.3494</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">1A3S</td>
<td valign="top" align="center">0.8021</td>
<td valign="top" align="center">0.8555</td>
<td valign="top" align="center">0.9564</td>
<td valign="top" align="center">0.3135</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">1A3S</td>
<td valign="top" align="center">0.9175</td>
<td valign="top" align="center">0.9144</td>
<td valign="top" align="center">0.9911</td>
<td valign="top" align="center">0.1493</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">3A3S</td>
<td valign="top" align="center">0.9454</td>
<td valign="top" align="center">0.9775</td>
<td valign="top" align="center">0.9929</td>
<td valign="top" align="center">0.1030</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">3A3S</td>
<td valign="top" align="center">0.9546</td>
<td valign="top" align="center">0.9464</td>
<td valign="top" align="center">0.9973</td>
<td valign="top" align="center">0.0860</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S3.SS4">
<title>3.4. Evaluating the performance of the model on data of different field strengths</title>
<p>The brain signals captured under different field strengths (FS) differ significantly. <xref ref-type="fig" rid="F9">Figure 9</xref> shows macaque data acquired at 1.5 T, 3 T, 4.7 T, 7 T, and 9.4 T. The stronger the field strength, the higher the signal-to-noise ratio, which allows the scanner to image at higher resolution and scan faster. However, ultrahigh field strength also increases the inhomogeneity of the B0 and B1 fields, which strongly influences tissue contrast in the resulting structures. To verify the generalization of the HC-Net model, macaque dataset II, spanning different field strengths, was used as the test dataset. Except for Newcastle (4.7 T), part of whose data participated in training, no other site&#x2019;s data was used in training; the 4.7 T results are therefore for reference only.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption><p>T1w images of macaques under different field strengths.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g009.tif"/>
</fig>
<p><xref ref-type="table" rid="T5">Table 5</xref> shows the brain extraction results of the HC-Net and 2D U-Net models. The Dice coefficient of the HC-Net model was approximately 0.89&#x223C;1.19% higher than that of the 2D U-Net. Importantly, our proposed model outperformed the 2D U-Net model across the different field strengths and generalized better to data with different tissue contrasts.</p>
<table-wrap position="float" id="T5">
<label>TABLE 5</label>
<caption><p>Results of different field strengths using the hybrid convolutional neural network (HC-Net) and 2D U-Net models.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;">Model</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">FS(T)</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Dice</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Sen</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Spe</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">VOE</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">1.5</td>
<td valign="top" align="center">0.9674</td>
<td valign="top" align="center">0.9970</td>
<td valign="top" align="center">0.9973</td>
<td valign="top" align="center">0.0629</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">1.5</td>
<td valign="top" align="center">0.9793</td>
<td valign="top" align="center">0.9788</td>
<td valign="top" align="center">0.9991</td>
<td valign="top" align="center">0.0405</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.9058</td>
<td valign="top" align="center">0.9558</td>
<td valign="top" align="center">0.9933</td>
<td valign="top" align="center">0.1721</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.9125</td>
<td valign="top" align="center">0.8903</td>
<td valign="top" align="center">0.9975</td>
<td valign="top" align="center">0.1595</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">4.7</td>
<td valign="top" align="center">0.9717</td>
<td valign="top" align="center">0.9813</td>
<td valign="top" align="center">0.9939</td>
<td valign="top" align="center">0.0548</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">4.7</td>
<td valign="top" align="center">0.9690</td>
<td valign="top" align="center">0.9488</td>
<td valign="top" align="center">0.9985</td>
<td valign="top" align="center">0.0179</td>
</tr>
<tr>
<td valign="top" align="left">2D U-Net</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0.9415</td>
<td valign="top" align="center">0.9731</td>
<td valign="top" align="center">0.9920</td>
<td valign="top" align="center">0.1104</td>
</tr>
<tr>
<td valign="top" align="left">HC-Net</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0.9504</td>
<td valign="top" align="center">0.9324</td>
<td valign="top" align="center">0.9976</td>
<td valign="top" align="center">0.0941</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S3.SS5">
<title>3.5. Evaluating the performance of the model on different datasets</title>
<p>Here, the human T1w MRI dataset and the macaque B0 dataset were used to evaluate the utility of our proposed model. <xref ref-type="table" rid="T6">Table 6</xref> shows the Dice coefficient, sensitivity, specificity, and VOE of the HC-Net model. On both datasets, the Dice coefficient, sensitivity, and specificity exceeded 98, 98, and 99%, respectively, and the VOE was below 4%.</p>
<table-wrap position="float" id="T6">
<label>TABLE 6</label>
<caption><p>The Dice coefficient, sensitivity, specificity, and VOE of the hybrid convolutional neural network (HC-Net) model on the human dataset and B0 dataset.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;">Dataset</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Dice</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Sen</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Spe</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">VOE</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">humans</td>
<td valign="top" align="center">0.9830</td>
<td valign="top" align="center">0.9868</td>
<td valign="top" align="center">0.9950</td>
<td valign="top" align="center">0.0332</td>
</tr>
<tr>
<td valign="top" align="left">B0 (macaques)</td>
<td valign="top" align="center">0.9841</td>
<td valign="top" align="center">0.9876</td>
<td valign="top" align="center">0.9989</td>
<td valign="top" align="center">0.0310</td>
</tr>
</tbody>
</table></table-wrap>
<p>The proposed model was stable on both datasets. Compared with that of macaques, the human cerebral cortex forms more folds on the brain surface, and this folded, meandering morphology is one of the most difficult aspects of human brain extraction. Our HC-Net model correctly identified the boundary of the human brain, retained more details of the gyri and sulci, and obtained a more complete brain, as shown in <xref ref-type="fig" rid="F10">Figure 10</xref>. AFNI and FSL smoothed the gyri and sulci excessively, losing brain edge details. FSL failed on the macaque B0 images containing eyes, as shown in <xref ref-type="fig" rid="F11">Figure 11</xref>: it did not separate the eyes and, where there was no significant intensity difference between brain and non-brain edges, it missed brain tissue. Compared with FSL, AFNI 3dSkullStrip with parameters customized for macaques performed better, but it still missed brain tissue around the eyes. <xref ref-type="fig" rid="F11">Figure 11</xref> shows two examples from the HC-Net model, both performing outstandingly with little difference from the ground-truth masks.</p>
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption><p>The results of human brain extraction using masks obtained by the ground truth, hybrid convolutional neural network (HC-Net model), AFNI, and FSL.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g010.tif"/>
</fig>
<fig id="F11" position="float">
<label>FIGURE 11</label>
<caption><p>The results of macaque brain extraction using masks obtained by the ground truth, HC-Net model, AFNI, and FSL.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-17-1113381-g011.tif"/>
</fig>
</sec>
</sec>
<sec id="S4" sec-type="discussion">
<title>4. Discussion</title>
<p>The present work demonstrated the feasibility of developing a brain extraction model for macaques with good generalization and low training complexity by concatenating 3D and 2D convolutions. Central to the success of our effort was that the model fully extracted the spatial information in volume data using 3D convolution while reducing computation and parameters using 2D convolution. Our model overcame the overfitting of 3D convolution on small samples and the underfitting of 2D convolution on three-dimensional data. Compared with other cascade modes, our method reduced training complexity. We employed heterogeneous data resources spanning multiple sites, modalities, and species to evaluate the effectiveness of the model. The results showed that the proposed model identifies macaque brains more accurately than the traditional methods. Furthermore, on small datasets it had fewer parameters and better generalization than large-scale models. It is also worth noting that our model had a clear advantage in inference time.</p>
<p>Recent works (<xref ref-type="bibr" rid="B39">Wang et al., 2021</xref>, <xref ref-type="bibr" rid="B38">2022</xref>) show that deep learning on macaque data may not require as many samples as on human data. The reason may be that the folding of the macaque brain surface is far less complex than that of humans (<xref ref-type="bibr" rid="B14">Hopkins et al., 2014</xref>), and the surface edge of macaque brain tissue is relatively smooth (<xref ref-type="bibr" rid="B13">Hopkins, 2018</xref>). At the same time, <xref ref-type="bibr" rid="B9">Croxson et al. (2018)</xref> showed that the similarity between individual macaques is higher than that between human samples, which also makes it feasible to train deep learning models on small macaque datasets.</p>
<p>An essential finding of the current work concerned the order of the 2D and 3D convolutions. The research proved the effectiveness of the serial arrangement of 3D convolution followed by 2D convolution for brain extraction. To test this, we exchanged the 2D and 3D convolution positions to build a new network: in the encoder stage, the network first used two 2D convolution modules and then three 3D convolution modules, and in the decoder stage, it first used two 3D convolution modules and then two 2D convolution modules. This new network reduced training and inference time, because 3D convolution on large feature maps requires more computing power and time. However, after the exchange the network&#x2019;s feature extraction degraded, performing even worse than the 2D U-Net model. The first reason may be that 3D convolution is more challenging to train in the middle of the model. The second reason may be that the spatial information that 3D convolution can learn from a feature map is limited when it is trained after 2D convolution. In the HC-Net model, 3D convolution first extracts the spatial information of the original image; this spatial information is not only input into the 2D encoders but also fused with the decoded information via skip connections. This arrangement makes full use of the spatial information.</p>
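<p>The 3D-then-2D ordering can be illustrated schematically (a PyTorch sketch with a single 3D block and a single 2D block and illustrative channel sizes; it is not the authors&#x2019; full encoder&#x2013;decoder):</p>

```python
import torch
import torch.nn as nn

class Hybrid3DThen2D(nn.Module):
    """Schematic of the 3D-then-2D ordering: a 3D block first extracts
    inter-slice (spatial) features from the 3-slice input; the slice axis
    is then folded into the channels and processed by 2D convolutions.
    Channel sizes are illustrative only."""

    def __init__(self, ch=16):
        super().__init__()
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.conv2d = nn.Sequential(
            nn.Conv2d(ch * 3, ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, kernel_size=1),
        )

    def forward(self, x):              # x: (N, 1, 3, H, W), three adjacent slices
        f = self.conv3d(x)             # (N, ch, 3, H, W): inter-slice features
        n, c, d, h, w = f.shape
        f = f.reshape(n, c * d, h, w)  # fold the slice axis into channels
        return self.conv2d(f)          # (N, 1, H, W): per-slice prediction
```

In the exchanged variant, the 2D block would run first, so the 3D block would see only intra-slice feature maps, consistent with the degradation described above.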
<p>It is worth noting that the proposed HC-Net is not a fully 3D network, although 3D networks usually achieve higher accuracy on large datasets. HC-Net has a smaller network scale, lower memory cost, and less computing time, and it is easier to deploy on platforms with limited memory. The study used data from all three axes to increase the sample size and to smooth the brain contour through constraints from adjacent slices. The experiments show that this data enhancement method helps improve the model&#x2019;s accuracy. However, there are differences between the data along the three axes of a volume, and synthesizing the final probability map requires resampling the data, which loses some information. These data differences and the information loss affect the segmentation results to a certain extent. Future work may consider reducing this loss of information.</p>
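<p>The three-axis synthesis step can be sketched as a simple voxel-wise average (assuming each axis&#x2019;s predictions have already been resampled back onto the original volume grid; the exact fusion rule used in the paper may differ):</p>

```python
import numpy as np

def fuse_axis_probabilities(prob_x, prob_y, prob_z, threshold=0.5):
    """Fuse per-axis probability volumes (already resampled onto the
    original grid) into a final probability map and binary brain mask."""
    fused = (prob_x + prob_y + prob_z) / 3.0
    return fused, fused >= threshold
```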
</sec>
<sec id="S5" sec-type="conclusion">
<title>5. Conclusion</title>
<p>This research proposed an end-to-end trainable hybrid convolutional neural network, HC-Net, for brain extraction from MRI brain volumes of non-human primates. The study presented a new way to extract inter-slice and intra-slice information from volume data by concatenating 3D and 2D convolutions. It reduced computational cost and improved accuracy by combining three consecutive slices from each of the three axes for the 3D convolutions. This architecture addressed both the neglect of volume context by 2D convolution and the overfitting of 3D convolution on small samples. Our model achieved excellent performance on limited macaque data samples, and experiments on human data and macaque B0 data also proved the effectiveness of the proposed HC-Net model.</p>
</sec>
<sec id="S6" sec-type="data-availability">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="S7" sec-type="author-contributions">
<title>Author contributions</title>
<p>HL and QW evaluated and guided the experimental design of this research. HF conceived and designed the experiments and wrote the manuscript. FS and WX researched the data. XC and YC analyzed the results. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec id="S8" sec-type="funding-information">
<title>Funding</title>
<p>This project was supported by the National Natural Science Foundation of China (61976150), the Shanxi Science and Technology Department (YDZJSX2021C005), and the Natural Science Foundation of Shanxi (201801D121135).</p>
</sec>
<sec id="S9" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="S10" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Autio</surname> <given-names>J. A.</given-names></name> <name><surname>Glasser</surname> <given-names>M. F.</given-names></name> <name><surname>Ose</surname> <given-names>T.</given-names></name> <name><surname>Donahue</surname> <given-names>C. J.</given-names></name> <name><surname>Bastiani</surname> <given-names>M.</given-names></name> <name><surname>Ohno</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Towards HCP-Style macaque connectomes: 24-Channel 3T multi-array coil, MRI sequences and preprocessing.</article-title> <source><italic>Neuroimage</italic></source> <volume>215</volume>:<issue>116800</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2020.116800</pub-id> <pub-id pub-id-type="pmid">32276072</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Badrinarayanan</surname> <given-names>V.</given-names></name> <name><surname>Kendall</surname> <given-names>A.</given-names></name> <name><surname>Cipolla</surname> <given-names>R.</given-names></name></person-group> (<year>2017</year>). <article-title>SegNet: a deep convolutional encoder-decoder architecture for image segmentation.</article-title> <source><italic>IEEE Trans. Pattern Anal. Mach. Intell</italic>.</source> <volume>39</volume> <fpage>2481</fpage>&#x2013;<lpage>2495</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2016.2644615</pub-id> <pub-id pub-id-type="pmid">28060704</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cai</surname> <given-names>D. C.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Bo</surname> <given-names>T.</given-names></name> <name><surname>Yan</surname> <given-names>S.</given-names></name> <name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>Z.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>MECP2 duplication causes aberrant GABA pathways, circuits and behaviors in transgenic monkeys: neural mappings to patients with autism.</article-title> <source><italic>J. Neurosci</italic>.</source> <volume>40</volume> <fpage>3799</fpage>&#x2013;<lpage>3814</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.2727-19.2020</pub-id> <pub-id pub-id-type="pmid">32269107</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>X. Y.</given-names></name> <name><surname>Jiang</surname> <given-names>S. F.</given-names></name> <name><surname>Guo</surname> <given-names>L. T.</given-names></name> <name><surname>Chen</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>C. X.</given-names></name></person-group> (<year>2021</year>). <article-title>Whole brain segmentation method from 2.5D brain MRI slice image based on Triple U-Net.</article-title> <source><italic>Vis. Comput.</italic></source> <volume>39</volume> <fpage>255</fpage>&#x2013;<lpage>266</lpage>. <pub-id pub-id-type="doi">10.1007/s00371-021-02326-9</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>&#x00C7;i&#x00E7;ek</surname> <given-names>&#x00D6;</given-names></name> <name><surname>Abdulkadir</surname> <given-names>A.</given-names></name> <name><surname>Lienkamp</surname> <given-names>S. S.</given-names></name> <name><surname>Brox</surname> <given-names>T.</given-names></name> <name><surname>Ronneberger</surname> <given-names>O.</given-names></name></person-group> (<year>2016</year>). &#x201C;<article-title>3D U-Net: learning dense volumetric segmentation from sparse annotation</article-title>,&#x201D; in <source><italic>Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention</italic></source>, (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>424</fpage>&#x2013;<lpage>432</lpage>.</citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coates</surname> <given-names>A.</given-names></name> <name><surname>Huval</surname> <given-names>B.</given-names></name> <name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Wu</surname> <given-names>D.</given-names></name> <name><surname>Catanzaro</surname> <given-names>B.</given-names></name> <name><surname>Andrew</surname> <given-names>N.</given-names></name></person-group> (<year>2013</year>). &#x201C;<article-title>Deep learning with COTS HPC systems</article-title>,&#x201D; in <source><italic>Proceedings of the International Conference on Machine Learning 2013</italic></source>, (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>PMLR</publisher-name>), <fpage>1337</fpage>&#x2013;<lpage>1345</lpage>.</citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coupeau</surname> <given-names>P.</given-names></name> <name><surname>Fasquel</surname> <given-names>J. B.</given-names></name> <name><surname>Mazerand</surname> <given-names>E.</given-names></name> <name><surname>Menei</surname> <given-names>P.</given-names></name> <name><surname>Montero-Menei</surname> <given-names>C. N.</given-names></name> <name><surname>Dinomais</surname> <given-names>M.</given-names></name></person-group> (<year>2022</year>). <article-title>Patch-based 3D U-Net and transfer learning for longitudinal piglet brain segmentation on MRI.</article-title> <source><italic>Comput. Methods Programs Biomed</italic>.</source> <volume>214</volume>:<issue>106563</issue>. <pub-id pub-id-type="doi">10.1016/j.cmpb.2021.106563</pub-id> <pub-id pub-id-type="pmid">34890993</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cox</surname> <given-names>R. W.</given-names></name></person-group> (<year>1996</year>). <article-title>AFNI: software for analysis and visualization of functional magnetic resonance neuroimages.</article-title> <source><italic>Comput. Biomed Res</italic>.</source> <volume>29</volume> <fpage>162</fpage>&#x2013;<lpage>173</lpage>. <pub-id pub-id-type="doi">10.1006/cbmr.1996.0014</pub-id> <pub-id pub-id-type="pmid">8812068</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Croxson</surname> <given-names>P. L.</given-names></name> <name><surname>Forkel</surname> <given-names>S. J.</given-names></name> <name><surname>Cerliani</surname> <given-names>L.</given-names></name> <name><surname>Thiebaut de Schotten</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>Structural variability across the primate brain: a cross-species comparison.</article-title> <source><italic>Cereb. Cortex</italic></source> <volume>28</volume> <fpage>3829</fpage>&#x2013;<lpage>3841</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhx244</pub-id> <pub-id pub-id-type="pmid">29045561</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dolz</surname> <given-names>J.</given-names></name> <name><surname>Desrosiers</surname> <given-names>C.</given-names></name> <name><surname>Ben Ayed</surname> <given-names>I.</given-names></name></person-group> (<year>2018</year>). <article-title>3D fully convolutional networks for subcortical segmentation in MRI: a large-scale study.</article-title> <source><italic>Neuroimage</italic></source> <volume>170</volume> <fpage>456</fpage>&#x2013;<lpage>470</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2017.04.039</pub-id> <pub-id pub-id-type="pmid">28450139</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Donahue</surname> <given-names>C. J.</given-names></name> <name><surname>Sotiropoulos</surname> <given-names>S. N.</given-names></name> <name><surname>Jbabdi</surname> <given-names>S.</given-names></name> <name><surname>Hernandez-Fernandez</surname> <given-names>M.</given-names></name> <name><surname>Behrens</surname> <given-names>T. E.</given-names></name> <name><surname>Dyrby</surname> <given-names>T. B.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Using diffusion tractography to predict cortical connection strength and distance: a quantitative comparison with tracers in the monkey.</article-title> <source><italic>J. Neurosci</italic>.</source> <volume>36</volume> <fpage>6758</fpage>&#x2013;<lpage>6770</lpage>. <pub-id pub-id-type="doi">10.1523/Jneurosci.0493-16.2016</pub-id> <pub-id pub-id-type="pmid">27335406</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Esteban</surname> <given-names>O.</given-names></name> <name><surname>Markiewicz</surname> <given-names>C. J.</given-names></name> <name><surname>Blair</surname> <given-names>R. W.</given-names></name> <name><surname>Moodie</surname> <given-names>C. A.</given-names></name> <name><surname>Isik</surname> <given-names>A. I.</given-names></name> <name><surname>Erramuzpe</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>fMRIPrep: a robust preprocessing pipeline for functional MRI.</article-title> <source><italic>Nat. Methods</italic></source> <volume>16</volume> <fpage>111</fpage>&#x2013;<lpage>116</lpage>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hopkins</surname> <given-names>W. D.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Motor and communicative correlates of the inferior frontal gyrus (Broca&#x2019;s Area) in chimpanzees</article-title>,&#x201D; in <source><italic>Origins of Human Language: Continuities and Discontinuities with Nonhuman Primates</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Pascal</surname> <given-names>P.</given-names></name> <name><surname>Louis-Jean</surname> <given-names>B.</given-names></name> <name><surname>Joe</surname> <given-names>F.</given-names></name></person-group> (<publisher-loc>Bern</publisher-loc>: <publisher-name>Peter Lang</publisher-name>), <fpage>153</fpage>.</citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hopkins</surname> <given-names>W. D.</given-names></name> <name><surname>Meguerditchian</surname> <given-names>A.</given-names></name> <name><surname>Coulon</surname> <given-names>O.</given-names></name> <name><surname>Bogart</surname> <given-names>S.</given-names></name> <name><surname>Mangin</surname> <given-names>J. F.</given-names></name> <name><surname>Sherwood</surname> <given-names>C. C.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Evolution of the central sulcus morphology in primates.</article-title> <source><italic>Brain Behav. Evol</italic>.</source> <volume>84</volume> <fpage>19</fpage>&#x2013;<lpage>30</lpage>. <pub-id pub-id-type="doi">10.1159/000362431</pub-id> <pub-id pub-id-type="pmid">25139259</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>H. M.</given-names></name> <name><surname>Lin</surname> <given-names>L. F.</given-names></name> <name><surname>Tong</surname> <given-names>R. F.</given-names></name> <name><surname>Hu</surname> <given-names>H. J.</given-names></name> <name><surname>Zhang</surname> <given-names>Q. W.</given-names></name> <name><surname>Iwamoto</surname> <given-names>Y.</given-names></name><etal/></person-group> (<year>2020</year>). &#x201C;<article-title>Unet 3+: a full-scale connected unet for medical image segmentation</article-title>,&#x201D; in <source><italic>Proceedings of the 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing</italic></source>, (<publisher-loc>Piscataway</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1055</fpage>&#x2013;<lpage>1059</lpage>.</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>Z.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>N.</given-names></name> <name><surname>Huang</surname> <given-names>X.</given-names></name> <name><surname>Decazes</surname> <given-names>P.</given-names></name> <name><surname>Becker</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2022</year>). <article-title>Multi-scale feature similarity-based weakly supervised lymphoma segmentation in PET/CT images.</article-title> <source><italic>Comput. Biol. Med.</italic></source> <volume>151</volume>:<issue>106230</issue>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2022.106230</pub-id> <pub-id pub-id-type="pmid">36306574</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hwang</surname> <given-names>H.</given-names></name> <name><surname>Rehman</surname> <given-names>H. Z. U.</given-names></name> <name><surname>Lee</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>3D U-Net for skull stripping in brain MRI.</article-title> <source><italic>Appl. Sci. Basel</italic></source> <volume>9</volume>:<issue>569</issue>.</citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jung</surname> <given-names>B.</given-names></name> <name><surname>Taylor</surname> <given-names>P. A.</given-names></name> <name><surname>Seidlitz</surname> <given-names>J.</given-names></name> <name><surname>Sponheim</surname> <given-names>C.</given-names></name> <name><surname>Perkins</surname> <given-names>P.</given-names></name> <name><surname>Ungerleider</surname> <given-names>L. G.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>A comprehensive macaque fMRI pipeline and hierarchical atlas.</article-title> <source><italic>Neuroimage</italic></source> <volume>235</volume>:<issue>117997</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2021.117997</pub-id> <pub-id pub-id-type="pmid">33789138</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kleesiek</surname> <given-names>J.</given-names></name> <name><surname>Urban</surname> <given-names>G.</given-names></name> <name><surname>Hubert</surname> <given-names>A.</given-names></name> <name><surname>Schwarz</surname> <given-names>D.</given-names></name> <name><surname>Maier-Hein</surname> <given-names>K.</given-names></name> <name><surname>Bendszus</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Deep MRI brain extraction: a 3D convolutional neural network for skull stripping.</article-title> <source><italic>Neuroimage</italic></source> <volume>129</volume> <fpage>460</fpage>&#x2013;<lpage>469</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2016.01.024</pub-id> <pub-id pub-id-type="pmid">26808333</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lepage</surname> <given-names>C.</given-names></name> <name><surname>Wagstyl</surname> <given-names>K.</given-names></name> <name><surname>Jung</surname> <given-names>B.</given-names></name> <name><surname>Seidlitz</surname> <given-names>J.</given-names></name> <name><surname>Sponheim</surname> <given-names>C.</given-names></name> <name><surname>Ungerleider</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>CIVET-Macaque: an automated pipeline for MRI-based cortical surface generation and cortical thickness in macaques.</article-title> <source><italic>Neuroimage</italic></source> <volume>227</volume>:<issue>117622</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2020.117622</pub-id> <pub-id pub-id-type="pmid">33301944</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Q.</given-names></name> <name><surname>Xi</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>M.</given-names></name> <name><surname>Liu</surname> <given-names>L.</given-names></name> <name><surname>Tang</surname> <given-names>X.</given-names></name></person-group> (<year>2019</year>). <article-title>Distinct mechanism of audiovisual integration with informative and uninformative sound in a visual detection task: a DCM study.</article-title> <source><italic>Front. Comput. Neurosci</italic>.</source> <volume>13</volume>:<issue>59</issue>. <pub-id pub-id-type="doi">10.3389/fncom.2019.00059</pub-id> <pub-id pub-id-type="pmid">31555115</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>H.</given-names></name> <name><surname>Qi</surname> <given-names>X.</given-names></name> <name><surname>Dou</surname> <given-names>Q.</given-names></name> <name><surname>Fu</surname> <given-names>C. W.</given-names></name> <name><surname>Heng</surname> <given-names>P. A.</given-names></name></person-group> (<year>2018</year>). <article-title>H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes.</article-title> <source><italic>IEEE Trans. Med. Imaging</italic></source> <volume>37</volume> <fpage>2663</fpage>&#x2013;<lpage>2674</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2018.2845918</pub-id> <pub-id pub-id-type="pmid">29994201</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Z.</given-names></name> <name><surname>Cai</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Nie</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>C.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Cloning of macaque monkeys by somatic cell nuclear transfer.</article-title> <source><italic>Cell</italic></source> <volume>172</volume> <fpage>881</fpage>&#x2013;<lpage>887</lpage>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lohmeier</surname> <given-names>J.</given-names></name> <name><surname>Kaneko</surname> <given-names>T.</given-names></name> <name><surname>Hamm</surname> <given-names>B.</given-names></name> <name><surname>Makowski</surname> <given-names>M. R.</given-names></name> <name><surname>Okano</surname> <given-names>H.</given-names></name></person-group> (<year>2019</year>). <article-title>atlasBREX: automated template-derived brain extraction in animal MRI.</article-title> <source><italic>Sci. Rep</italic>.</source> <volume>9</volume> <fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-019-48489-3</pub-id> <pub-id pub-id-type="pmid">31434923</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lucena</surname> <given-names>O.</given-names></name> <name><surname>Souza</surname> <given-names>R.</given-names></name> <name><surname>Rittner</surname> <given-names>L.</given-names></name> <name><surname>Frayne</surname> <given-names>R.</given-names></name> <name><surname>Lotufo</surname> <given-names>R.</given-names></name></person-group> (<year>2019</year>). <article-title>Convolutional neural networks for skull-stripping in brain MR imaging using silver standard masks.</article-title> <source><italic>Artif. Intell. Med</italic>.</source> <volume>98</volume> <fpage>48</fpage>&#x2013;<lpage>58</lpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2019.06.008</pub-id> <pub-id pub-id-type="pmid">31521252</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milham</surname> <given-names>M. P.</given-names></name> <name><surname>Ai</surname> <given-names>L.</given-names></name> <name><surname>Koo</surname> <given-names>B.</given-names></name> <name><surname>Xu</surname> <given-names>T.</given-names></name> <name><surname>Amiez</surname> <given-names>C.</given-names></name> <name><surname>Balezeau</surname> <given-names>F.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>An open resource for non-human primate imaging.</article-title> <source><italic>Neuron</italic></source> <volume>100</volume> <fpage>61</fpage>&#x2013;<lpage>74.e62</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2018.08.039</pub-id> <pub-id pub-id-type="pmid">30269990</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milletari</surname> <given-names>F.</given-names></name> <name><surname>Navab</surname> <given-names>N.</given-names></name> <name><surname>Ahmadi</surname> <given-names>S. A.</given-names></name></person-group> (<year>2016</year>). &#x201C;<article-title>V-Net: fully convolutional neural networks for volumetric medical image segmentation</article-title>,&#x201D; in <source><italic>Proceedings of 2016 Fourth International Conference on 3d Vision (3dv)</italic></source>, (<publisher-loc>Piscataway</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>565</fpage>&#x2013;<lpage>571</lpage>. <pub-id pub-id-type="doi">10.1109/3dv.2016.79</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nei</surname> <given-names>M.</given-names></name> <name><surname>Xu</surname> <given-names>P.</given-names></name> <name><surname>Glazko</surname> <given-names>G.</given-names></name></person-group> (<year>2001</year>). <article-title>Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>98</volume> <fpage>2497</fpage>&#x2013;<lpage>2502</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.051611498</pub-id> <pub-id pub-id-type="pmid">11226267</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Prasoon</surname> <given-names>A.</given-names></name> <name><surname>Petersen</surname> <given-names>K.</given-names></name> <name><surname>Igel</surname> <given-names>C.</given-names></name> <name><surname>Lauze</surname> <given-names>F.</given-names></name> <name><surname>Dam</surname> <given-names>E.</given-names></name> <name><surname>Nielsen</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). &#x201C;<article-title>Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network</article-title>,&#x201D; in <source><italic>Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention</italic></source>, (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>246</fpage>&#x2013;<lpage>253</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-642-40763-5_31</pub-id> <pub-id pub-id-type="pmid">24579147</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qin</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Huang</surname> <given-names>C.</given-names></name> <name><surname>Dehghan</surname> <given-names>M.</given-names></name> <name><surname>Zaiane</surname> <given-names>O. R.</given-names></name> <name><surname>Jagersand</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>U2-Net: going deeper with nested U-structure for salient object detection.</article-title> <source><italic>Pattern Recognit</italic>.</source> <volume>106</volume>:<issue>107404</issue>.</citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ronneberger</surname> <given-names>O.</given-names></name> <name><surname>Fischer</surname> <given-names>P.</given-names></name> <name><surname>Brox</surname> <given-names>T.</given-names></name></person-group> (<year>2015</year>). &#x201C;<article-title>U-net: convolutional networks for biomedical image segmentation</article-title>,&#x201D; in <source><italic>Proceedings of the International Conference on Medical image Computing and Computer-Assisted Intervention</italic></source>, (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>234</fpage>&#x2013;<lpage>241</lpage>.</citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmidhuber</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>Deep learning in neural networks: an overview.</article-title> <source><italic>Neural Netw</italic>.</source> <volume>61</volume> <fpage>85</fpage>&#x2013;<lpage>117</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2014.09.003</pub-id> <pub-id pub-id-type="pmid">25462637</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seyedhosseini</surname> <given-names>M.</given-names></name> <name><surname>Sajjadi</surname> <given-names>M.</given-names></name> <name><surname>Tasdizen</surname> <given-names>T.</given-names></name></person-group> (<year>2013</year>). &#x201C;<article-title>Image segmentation with cascaded hierarchical models and logistic disjunctive normal networks</article-title>,&#x201D; in <source><italic>Proceedings of the 2013 IEEE International Conference on Computer Vision</italic></source>, (<publisher-loc>Sydney, NSW</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>2168</fpage>&#x2013;<lpage>2175</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV.2013.269</pub-id> <pub-id pub-id-type="pmid">25419193</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>Q.</given-names></name> <name><surname>Dai</surname> <given-names>M.</given-names></name> <name><surname>Lan</surname> <given-names>Z.</given-names></name> <name><surname>Cai</surname> <given-names>F.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Yang</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2022</year>). <article-title>UCR-Net: U-shaped context residual network for medical image segmentation.</article-title> <source><italic>Comput. Biol. Med.</italic></source> <volume>151</volume>:<issue>106203</issue>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2022.106203</pub-id> <pub-id pub-id-type="pmid">36306581</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tasserie</surname> <given-names>J.</given-names></name> <name><surname>Grigis</surname> <given-names>A.</given-names></name> <name><surname>Uhrig</surname> <given-names>L.</given-names></name> <name><surname>Dupont</surname> <given-names>M.</given-names></name> <name><surname>Amadon</surname> <given-names>A.</given-names></name> <name><surname>Jarraya</surname> <given-names>B.</given-names></name></person-group> (<year>2020</year>). <article-title>Pypreclin: an automatic pipeline for macaque functional MRI preprocessing.</article-title> <source><italic>Neuroimage</italic></source> <volume>207</volume>:<issue>116353</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.116353</pub-id> <pub-id pub-id-type="pmid">31743789</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van de Moortele</surname> <given-names>P. F.</given-names></name> <name><surname>Akgun</surname> <given-names>C.</given-names></name> <name><surname>Adriany</surname> <given-names>G.</given-names></name> <name><surname>Moeller</surname> <given-names>S.</given-names></name> <name><surname>Ritter</surname> <given-names>J.</given-names></name> <name><surname>Collins</surname> <given-names>C. M.</given-names></name><etal/></person-group> (<year>2005</year>). <article-title>B1 destructive interferences and spatial phase patterns at 7 T with a head transceiver array coil.</article-title> <source><italic>Magn. Reson. Med</italic>.</source> <volume>54</volume> <fpage>1503</fpage>&#x2013;<lpage>1518</lpage>. <pub-id pub-id-type="doi">10.1002/mrm.20708</pub-id> <pub-id pub-id-type="pmid">16270333</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van Essen</surname> <given-names>D. C.</given-names></name> <name><surname>Ugurbil</surname> <given-names>K.</given-names></name> <name><surname>Auerbach</surname> <given-names>E.</given-names></name> <name><surname>Barch</surname> <given-names>D.</given-names></name> <name><surname>Behrens</surname> <given-names>T. E.</given-names></name> <name><surname>Bucholz</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>The Human Connectome Project: a data acquisition perspective.</article-title> <source><italic>Neuroimage</italic></source> <volume>62</volume> <fpage>2222</fpage>&#x2013;<lpage>2231</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2012.02.018</pub-id> <pub-id pub-id-type="pmid">22366334</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Fei</surname> <given-names>H.</given-names></name> <name><surname>Abdu Nasher</surname> <given-names>S. N.</given-names></name> <name><surname>Xia</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name></person-group> (<year>2022</year>). <article-title>A macaque brain extraction model based on U-Net combined with residual structure.</article-title> <source><italic>Brain Sci.</italic></source> <volume>12</volume>:<issue>260</issue>. <pub-id pub-id-type="doi">10.3390/brainsci12020260</pub-id> <pub-id pub-id-type="pmid">35204023</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>X. H.</given-names></name> <name><surname>Cho</surname> <given-names>J. W.</given-names></name> <name><surname>Russ</surname> <given-names>B. E.</given-names></name> <name><surname>Rajamani</surname> <given-names>N.</given-names></name> <name><surname>Omelchenko</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>U-net model for brain extraction: trained on humans for transfer to non-human primates.</article-title> <source><italic>Neuroimage</italic></source> <volume>235</volume>:<issue>118001</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2021.118001</pub-id> <pub-id pub-id-type="pmid">33789137</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xi</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>Q.</given-names></name> <name><surname>Gao</surname> <given-names>N.</given-names></name> <name><surname>He</surname> <given-names>S.</given-names></name> <name><surname>Tang</surname> <given-names>X.</given-names></name></person-group> (<year>2019a</year>). <article-title>Cortical network underlying audiovisual semantic integration and modulation of attention: an fMRI and graph-based study.</article-title> <source><italic>PLoS One</italic>.</source> <volume>14</volume>:<issue>e0221185</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0221185</pub-id> <pub-id pub-id-type="pmid">31442242</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xi</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>Q.</given-names></name> <name><surname>Zhang</surname> <given-names>M.</given-names></name> <name><surname>Liu</surname> <given-names>L.</given-names></name> <name><surname>Li</surname> <given-names>G.</given-names></name> <name><surname>Lin</surname> <given-names>W.</given-names></name><etal/></person-group> (<year>2019b</year>). <article-title>Optimized configuration of functional brain network for processing semantic audio visual stimuli underlying the modulation of attention: a graph-based study.</article-title> <source><italic>Front. Integr. Neurosci</italic>.</source> <volume>13</volume>:<issue>67</issue>. <pub-id pub-id-type="doi">10.3389/fnint.2019.00067</pub-id> <pub-id pub-id-type="pmid">31798426</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>LeCun</surname> <given-names>Y.</given-names></name> <name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Hinton</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>Deep learning.</article-title> <source><italic>Nature</italic></source> <volume>521</volume> <fpage>436</fpage>&#x2013;<lpage>444</lpage>. <pub-id pub-id-type="doi">10.1038/nature14539</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Valcarcel</surname> <given-names>A. M.</given-names></name> <name><surname>Bakshi</surname> <given-names>R.</given-names></name> <name><surname>Chu</surname> <given-names>R.</given-names></name> <name><surname>Bagnato</surname> <given-names>F.</given-names></name> <name><surname>Shinohara</surname> <given-names>R. T.</given-names></name><etal/></person-group> (<year>2019</year>). &#x201C;<article-title>Multiple sclerosis lesion segmentation with tiramisu and 2.5D stacked slices</article-title>,&#x201D; in <source><italic>Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention</italic></source>, (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>338</fpage>&#x2013;<lpage>346</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-32248-9_38</pub-id> <pub-id pub-id-type="pmid">34950934</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.-P.</given-names></name> <name><surname>Shi</surname> <given-names>L.-M.</given-names></name></person-group> (<year>1993</year>). <article-title>Phylogeny of rhesus monkeys (<italic>Macaca mulatta</italic>) as revealed by mitochondrial DNA restriction enzyme analysis.</article-title> <source><italic>Int. J. Primatol.</italic></source> <volume>14</volume> <fpage>587</fpage>&#x2013;<lpage>605</lpage>.</citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>G.</given-names></name> <name><surname>Liu</surname> <given-names>F.</given-names></name> <name><surname>Oler</surname> <given-names>J. A.</given-names></name> <name><surname>Meyerand</surname> <given-names>M. E.</given-names></name> <name><surname>Kalin</surname> <given-names>N. H.</given-names></name> <name><surname>Birn</surname> <given-names>R. M.</given-names></name></person-group> (<year>2018</year>). <article-title>Bayesian convolutional neural network based MRI brain extraction on nonhuman primates.</article-title> <source><italic>Neuroimage</italic></source> <volume>175</volume> <fpage>32</fpage>&#x2013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2018.03.065</pub-id> <pub-id pub-id-type="pmid">29604454</pub-id></citation></ref>
</ref-list>
</back>
</article>