<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioeng. Biotechnol.</journal-id>
<journal-title>Frontiers in Bioengineering and Biotechnology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioeng. Biotechnol.</abbrev-journal-title>
<issn pub-type="epub">2296-4185</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">791424</article-id>
<article-id pub-id-type="doi">10.3389/fbioe.2022.791424</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioengineering and Biotechnology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Study on the Prediction Method of Long-term Benign and Malignant Pulmonary Lesions Based on LSTM</article-title>
<alt-title alt-title-type="left-running-head">Liu et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Prediction Method for Long-Term Pulmonary Lesions</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Liu</surname>
<given-names>Xindong</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1595379/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Mengnan</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Aftab</surname>
<given-names>Rukhma</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1509834/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Faculty of Science</institution>, <institution>Hong Kong Baptist University</institution>, <addr-line>Hong Kong</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>College of Information and Computer</institution>, <institution>Taiyuan University of Technology</institution>, <addr-line>Taiyuan</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1254880/overview">Tinggui Chen</ext-link>, Zhejiang Gongshang University, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/584359/overview">Zhen Li</ext-link>, The University of Hong Kong, Hong Kong SAR, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1523470/overview">Xin Yang</ext-link>, Huazhong University of Science and Technology, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1527275/overview">Minghui Sun</ext-link>, Jilin University, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Rukhma Aftab, <email>18234132492@163.com</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Bionics and Biomimetics, a section of the journal Frontiers in Bioengineering and Biotechnology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>03</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>791424</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>10</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>06</day>
<month>01</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Liu, Wang and Aftab.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Liu, Wang and Aftab</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>To more accurately and comprehensively characterize how lesion characteristics in pulmonary medical images change and develop across different periods, this study predicts the evolution of pulmonary nodules along the longitudinal dimension of time and constructs a benign and malignant prediction model of pulmonary lesions in different periods under multiscale three-dimensional (3D) feature fusion. According to the sequence of computed tomography (CT) images of patients at different stages, 3D interpolation was conducted to generate 3D lung CT images. The 3D features of lung lesions of different sizes were extracted using 3D convolutional neural networks and fused. A time-modulated long short-term memory (LSTM) network was constructed to predict benign and malignant lesions, using this improved LSTM to learn the feature vectors of lung lesions, with their temporal and spatial characteristics, across different periods. Experiments show that the area under the curve of the proposed method is 92.71%, which is higher than that of traditional methods.</p>
</abstract>
<kwd-group>
<kwd>3D CNNs</kwd>
<kwd>time-modulated LSTM</kwd>
<kwd>multiscale three-dimensional feature</kwd>
<kwd>prediction</kwd>
<kwd>characteristics of the fusion</kwd>
<kwd>pulmonary lesions</kwd>
</kwd-group>
<contract-num rid="cn001">61872261 61972274</contract-num>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Because of factors such as smoking, air pollution, and occupational environment, lung cancer has become one of the malignant tumors that most threaten human health and life, and it is the leading cause of cancer death (<xref ref-type="bibr" rid="B40">Taillant et&#x20;al., 2004</xref>; <xref ref-type="bibr" rid="B44">Zhang et&#x20;al., 2018</xref>). Global cancer data show that lung cancer accounted for approximately 2.1 million new cases and 1.8 million deaths worldwide in 2018, the highest morbidity and mortality rates among all cancers. The 5-year survival rate of patients with advanced lung cancer is approximately 16%, but with effective treatment of early-stage disease it can increase approximately four- to fivefold (<xref ref-type="bibr" rid="B32">Nagaratnam et&#x20;al., 2018</xref>). Pulmonary nodules are an early manifestation of lung cancer, and predicting whether they are benign or malignant is very important for radiologists carrying out cancer staging assessment and individualized clinical treatment planning. With the development of medical imaging technology, the number of computed tomography (CT) images of the lungs continues to increase, but the number of experienced physicians is limited, so image data are growing explosively while manual diagnostic capacity falls seriously short. Computer-aided diagnosis technology is therefore urgently needed (<xref ref-type="bibr" rid="B45">Zhang et&#x20;al., 2020a</xref>) to assist physicians in feature extraction and in the benign and malignant prediction of lung nodules.</p>
<p>In clinical diagnosis, lung medical image processing is often limited to a patient's data from a single period: the feature vectors of a slice at one time point are considered in isolation, and global features carrying spatial information along the time axis are ignored. In addition, existing prediction methods, such as medical decision-making systems (<xref ref-type="bibr" rid="B12">Christo et&#x20;al., 2020</xref>) combined with intelligent optimization (<xref ref-type="bibr" rid="B15">Deng et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B48">Zhang et&#x20;al., 2021</xref>), whether multiobjective (<xref ref-type="bibr" rid="B13">Cui et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B7">Cai et&#x20;al., 2021a</xref>) or single-objective (<xref ref-type="bibr" rid="B6">Boudjemaa et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B42">Yang et&#x20;al., 2020</xref>), mostly rely on handcrafted features, even when the factors they consider are comprehensive. Because of the limited expressive power of handcrafted features, the prediction performance of existing methods is often unsatisfactory. At the same time, because the growth and evolution of lung nodules in lung cancer lesion areas are complex (<xref ref-type="bibr" rid="B17">Duffy and Field, 2020</xref>), the same lesion often has different imaging manifestations at different periods, and the imaging data of lesions at different periods contain a large amount of information related to their evolution (development, death). Lung CT images have blurred edges, low gray values, and texture information that is difficult to express, so it is difficult to characterize lung lesions accurately and comprehensively. In recent years, longitudinal prediction methods have been proposed (<xref ref-type="bibr" rid="B38">Santeramo et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B33">Oh et&#x20;al., 2019</xref>), but such methods are rarely applied in pulmonary medicine, and existing intelligent diagnosis mostly uses isolated image fragments that cannot present the entire cycle of the lesion, making it impossible to link the characteristics of lung cancer across periods.</p>
<p>
<xref ref-type="fig" rid="F1">Figure&#x20;1</xref> shows the evolution trend of the sequence of long-course lung lesions examined every 3&#xa0;months in the same patient.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Long-term sequence of lung lesions. </p>
</caption>
<graphic xlink:href="fbioe-10-791424-g001.tif"/>
</fig>
<p>We propose a scheme that uses recent deep learning techniques (<xref ref-type="bibr" rid="B14">Cui et&#x20;al., 2021</xref>) to extract deep features from long-term sequences of lung CT lesion images for early prediction of benign and malignant lung lesions. From the sequence images of the lesions in each period, the temporal and spatial information of the images is fully exploited to extract deep features of the lesions in different periods. Given the characteristics of lung medical images across periods, a long short-term memory model built on the recurrent neural network (RNN) architecture is well suited to longitudinal prediction of lung lesion malignancy and provides reliable help for physicians.</p>
<p>The major contributions of this article are as follows:<list list-type="simple">
<list-item>
<p>1) On the lung lesion image data set, RPN was used to extract the candidate region (<xref ref-type="bibr" rid="B36">Ren et&#x20;al., 2017</xref>), and linear interpolation technology was used to obtain the three-dimensional (3D) structure of the candidate region.</p>
</list-item>
<list-item>
<p>2) We propose a novel method that exploits a deep 3D convolutional neural network (CNN) to extract the deep hidden features of long-duration lung lesions; compared with their 2D counterparts, 3D CNNs can encode richer spatial information and extract more discriminative representations <italic>via</italic> a hierarchical architecture trained with 3D samples.</p>
</list-item>
<list-item>
<p>3) We propose a novel long short-term memory (LSTM) network with time-modulation information that propagates the spatial&#x2013;temporal information between adjacent slices of pulmonary lesions over a long period, captures the corresponding long-term dependencies, and removes the constraint that the inputs must be lung lesion images taken at equal intervals, thereby predicting the next stage of the pulmonary lesion.</p>
</list-item>
</list>
</p>
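The interpolation step in contribution 1 can be illustrated concretely. Below is a minimal NumPy sketch of linear interpolation along the slice axis to obtain an isotropic 3D volume; the function name and spacing values are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def interpolate_z(volume, z_spacing, xy_spacing):
    """Linearly interpolate a CT stack along z so voxels become isotropic.

    volume: (num_slices, H, W) array; z_spacing / xy_spacing are in mm.
    """
    num_slices = volume.shape[0]
    # Physical z positions of the original slices.
    old_z = np.arange(num_slices) * z_spacing
    # New z positions sampled at the in-plane resolution.
    new_z = np.arange(0, old_z[-1] + 1e-6, xy_spacing)
    out = np.empty((len(new_z),) + volume.shape[1:], dtype=np.float32)
    for i, z in enumerate(new_z):
        j = min(int(z // z_spacing), num_slices - 2)
        t = (z - old_z[j]) / z_spacing  # fractional position between slices j, j+1
        out[i] = (1 - t) * volume[j] + t * volume[j + 1]
    return out

# Example: 10 slices at 2.5 mm spacing resampled to 0.5 mm.
vol = np.random.rand(10, 64, 64).astype(np.float32)
iso = interpolate_z(vol, z_spacing=2.5, xy_spacing=0.5)
print(iso.shape)  # (46, 64, 64)
```

In practice a library routine such as `scipy.ndimage.zoom` would do the same resampling; the loop above only makes the per-slice blend explicit.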
</sec>
<sec id="s2">
<title>2 Related Work</title>
<sec id="s2-1">
<title>2.1 Methods of Extracting Medical Image Feature Information</title>
<p>The large amount of information contained in the lesions in each period of medical imaging is important guidance for obtaining accurate prediction results, and accurate prediction results in turn play an important guiding role in doctors&#x2019; diagnoses (<xref ref-type="bibr" rid="B23">Hu et&#x20;al., 2016</xref>). To extract this information from the lesions, <xref ref-type="bibr" rid="B49">Zhao and Du (2016)</xref> used dimensionality reduction and deep learning, respectively, to extract spectral and spatial features and used a CNN to find spatially related features. <xref ref-type="bibr" rid="B5">Bodla et&#x20;al. (2017)</xref> proposed a face recognition method based on deep heterogeneous feature fusion, which concatenates features generated by different deep CNNs (DCNNs) to merge the feature information. <xref ref-type="bibr" rid="B26">Khusnuliawati et&#x20;al. (2017)</xref> proposed using the scale-invariant feature transform and the local extensive binary pattern for multifeature extraction, with the extracted features concatenated and fused in the form of histograms. <xref ref-type="bibr" rid="B41">Xiao et&#x20;al. (2015)</xref> proposed a feature fusion method based on SoftMax regression that performs effective feature fusion by estimating the similarity measure from object to class and the probability that each object belongs to each class. <xref ref-type="bibr" rid="B3">Shi et&#x20;al. (2017)</xref> put forward a new nonlinear metric learning method, which uses a deep-network-based sparse autoencoder feature fusion strategy.</p>
</sec>
<sec id="s2-2">
<title>2.2 Application of Traditional Methods to Time-Series Data</title>
<p>In recent years, scholars have also studied medical time-series data. <xref ref-type="bibr" rid="B34">Onisko et&#x20;al. (2016)</xref> analyzed medical time series using Kaplan&#x2013;Meier estimators, Cox proportional hazards regression, and dynamic Bayesian network modeling. <xref ref-type="bibr" rid="B29">Li and Feng (2015)</xref> predicted the number of future medical appointments by analyzing emergency patients&#x2019; appointment capacity by day and by hour. <xref ref-type="bibr" rid="B11">Cheng et&#x20;al. (2020)</xref> applied a Bayesian nonparametric model based on Gaussian process regression to hospital patient monitoring, using clinical covariates and all the information provided by laboratory tests, and successfully guided medical interventions. Because deep learning has a strong advantage in time-series learning, many scholars have applied it in many fields. <xref ref-type="bibr" rid="B9">Chandra (2015)</xref> proposed using cooperative coevolution with RNNs for time-series prediction. <xref ref-type="bibr" rid="B18">Fragkiadaki et&#x20;al. (2016)</xref> proposed an improved RNN model that captures moving body poses in video for recognition and prediction. <xref ref-type="bibr" rid="B28">Koutn&#xed;k et&#x20;al. (2014)</xref> introduced a modified clockwork RNN architecture, which divides its hidden layer into separate modules so that each module processes its inputs at its own time granularity, improving performance on benchmark tasks and speeding up the&#x20;network.</p>
</sec>
<sec id="s2-3">
<title>2.3 Application of CNNs and LSTMs to Time-Series Data</title>
<p>In recent years, CNNs have been successfully used to detect radiological anomalies in medical images such as plain X-rays. LSTMs are a special type of RNN that can classify, process, and predict time series (<xref ref-type="bibr" rid="B21">Graves, 2012</xref>; <xref ref-type="bibr" rid="B46">Zhang, 2020</xref>). The internal state of the LSTM (also known as the cell state or memory) enables the architecture to retain information over time. The standard LSTM contains memory blocks, each of which contains memory units. A typical memory block consists of three main components: an input gate controlling the flow of input activations into the memory cell, an output gate controlling the flow of output activations, and a forget gate regulating the internal state of the cell. The forget gate adjusts how much information from the internal state of the previous time step is retained. <xref ref-type="bibr" rid="B38">Santeramo et&#x20;al. (2018)</xref> attempted to automate the analysis of longitudinal medical image data by using an LSTM network to analyze the temporal context of a series of chest radiographs. In the field of breast pathological images, <xref ref-type="bibr" rid="B27">Kooi and Karssemeijer (2017)</xref> proposed a region of interest (ROI)&#x2013;based method to compare patches aligned at different time points. Although the latter method is slightly better than single-time-point detection, it depends on specific lesion detection and requires local&#x20;data.</p>
<p>These algorithms are very effective but are rarely applied to long-term lung CT image prediction. So far, most studies have applied CNNs to individual examinations while discarding previously available clinical information. One limitation of traditional LSTMs is that they implicitly assume equally spaced observations, whereas medical examinations are event based, so the sampling is irregular.</p>
<p>LSTMs, and RNNs more generally, do not perform well on time series with irregularly sampled or missing data (<xref ref-type="bibr" rid="B10">Che et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B47">Zhang, 2021</xref>). Previous attempts to apply LSTMs to irregularly sampled data focused on accelerating algorithm convergence or reducing short-term memory effects in settings with high-resolution sampling (<xref ref-type="bibr" rid="B4">Baytas et&#x20;al., 2017</xref>). This project set out to explore the performance of LSTM networks, which have become one of the methods of choice for sequence modeling, especially when combined with CNNs for medical image feature extraction (<xref ref-type="bibr" rid="B16">Donahue et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B20">Grano and Zhang, 2019</xref>). The main advantages of combining CNNs with LSTMs are flexibility and scalability: multiple prior sequences of variable length can be classified with the same network. Longitudinal analysis of images can potentially improve the ability of machine learning algorithms to interpret imaging studies accurately and reliably, thus providing value for medical image processing (<xref ref-type="bibr" rid="B19">Gao et&#x20;al., 2018</xref>).</p>
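As a concrete illustration of the CNN + LSTM combination discussed here, the sketch below (PyTorch) embeds each examination's image with a shared CNN and feeds the resulting variable-length feature sequence to an LSTM classifier. All layer sizes, names, and the padding scheme are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

class CNNLSTMClassifier(nn.Module):
    """Toy CNN+LSTM: a shared CNN embeds each time point's image, and an
    LSTM consumes the resulting variable-length feature sequence."""

    def __init__(self, feat_dim=64, hidden=32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(8 * 16, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # benign vs. malignant

    def forward(self, images, lengths):
        # images: (batch, max_T, 1, H, W); lengths: true number of exams each.
        b, t = images.shape[:2]
        feats = self.cnn(images.flatten(0, 1)).view(b, t, -1)
        # Packing lets one network handle sequences of different lengths.
        packed = pack_padded_sequence(feats, lengths, batch_first=True,
                                      enforce_sorted=False)
        _, (h, _) = self.lstm(packed)
        return self.head(h[-1])  # classify from the last hidden state

model = CNNLSTMClassifier()
x = torch.randn(3, 5, 1, 32, 32)  # three patients, up to five exams each
logits = model(x, lengths=torch.tensor([5, 3, 4]))
print(logits.shape)  # torch.Size([3, 2])
```

The packed-sequence mechanism is what gives the combination its flexibility: patients with different numbers of prior examinations pass through the same weights.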
</sec>
</sec>
<sec id="s3">
<title>3 Methods</title>
<p>In this article, benign and malignant lung lesions are predicted through spatiotemporal feature fusion. For CT sequence images of the same patient from the early stage through diagnosis, a faster region-based CNN (R-CNN) (<xref ref-type="bibr" rid="B39">Shinde et&#x20;al., 2019</xref>) detector was used to generate ROIs (<xref ref-type="bibr" rid="B35">Qiang et&#x20;al., 2015</xref>), temporal and spatial features of the multilayer context around pulmonary nodules were extracted, and a 3D CNN (<xref ref-type="bibr" rid="B8">Cai et&#x20;al., 2021b</xref>) was used to fuse them. Then, the fused spatiotemporal feature vectors of the pulmonary nodules in each period were fed to a time-modulated long short-term memory network to study the variation trends of, and relationships among, the feature vectors across periods. Finally, the time-modulated LSTM (T-LSTM) model was used to predict the evolution trend of lung lesions over a long period and to determine their malignancy. The overall process is shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The framework of our proposed network.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g002.tif"/>
</fig>
<p>The 2D CNN uses the AlexNet network as its baseline. CNN architectures for medical imaging usually contain fewer convolutional layers because of the small data sets and input sizes. Our CNN architecture consists of three convolutional layers and two fully connected layers, where each convolutional layer is followed by a max-pooling layer. In a 2D CNN, the kernel moves in two directions; its input and output data are 3D, and it is mainly used for single-image data. In a 3D CNN, the kernel moves in three directions; its input and output data are 4D, and it is mainly used for 3D image data (magnetic resonance imaging, CT scans).</p>
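The dimensionality difference between 2D and 3D convolutions can be verified directly. A short PyTorch sketch (channel counts and sizes are illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn

# 2D CNN: the kernel slides in two directions; input is (batch, channels, H, W).
conv2d = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
x2d = torch.randn(4, 1, 64, 64)      # a batch of single-channel slices
print(conv2d(x2d).shape)             # torch.Size([4, 8, 64, 64])

# 3D CNN: the kernel also slides along depth; input is (batch, channels, D, H, W).
conv3d = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
x3d = torch.randn(4, 1, 11, 64, 64)  # a batch of 11-slice CT blocks
print(conv3d(x3d).shape)             # torch.Size([4, 8, 11, 64, 64])
```

The extra depth axis is what lets the 3D kernel see inter-slice context rather than one slice at a time.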
<sec id="s3-1">
<title>3.1 Lung CT Sequence Image Preprocessing</title>
<p>In the diagnostic process, the focus of observation and research is pulmonary nodules: focal shadows in the pulmonary parenchyma with a maximum diameter of no more than 30&#xa0;mm, occupying only a small part of the chest CT area. To reduce the interference of other organs and tissues in the diagnostic process and to effectively reduce algorithmic complexity, the lung CT images obtained from the National Lung Screening Trial (NLST) and cooperating hospitals were preprocessed. As the locations of pulmonary nodules were not marked in detail in the data set, we adopted a faster R-CNN detector to detect the target nodules and crop the rectangular area around the ROI center to construct the pulmonary nodule data&#x20;set.</p>
<p>We screened lung CT images of patients followed up for 3&#xa0;years or more in the NLST data set to construct a long-term data set. The NLST data set marks the section number and approximate location of the most prominent pulmonary nodule in each phase sequence, and the lung CT image corresponding to that section number was examined for nodules. ResNet 101 (<xref ref-type="bibr" rid="B22">He et&#x20;al., 2016</xref>) was selected as the backbone network of the faster R-CNN. Bounding boxes were defined with five aspect ratios of 1:3, 1:2, 1:1, 2:1, and 3:1 and four scales of 8&#x20;&#xd7; 8, 16&#x20;&#xd7; 16, 32&#x20;&#xd7; 32, and 64&#x20;&#xd7; 64 to cover blocks of different shapes. It is worth noting that the 1:3 and 3:1 aspect ratios are included because of the presence of pulmonary vascularized lesions, which are critical for the diagnosis of lung cancer.</p>
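The 20 anchor shapes (five aspect ratios &#xd7; four scales) can be enumerated as follows. This sketch assumes the usual Faster R-CNN convention of preserving each square scale's area while applying the aspect ratio; the paper does not state this detail:

```python
import itertools

# Aspect ratios (h:w) and square base scales from the detector configuration.
ratios = [(1, 3), (1, 2), (1, 1), (2, 1), (3, 1)]
scales = [8, 16, 32, 64]

def anchor_shapes():
    """Return (height, width) pairs: the area of the square scale is kept
    constant and the requested aspect ratio is applied."""
    shapes = []
    for (rh, rw), s in itertools.product(ratios, scales):
        area = s * s
        w = (area * rw / rh) ** 0.5
        h = area / w
        shapes.append((round(h, 1), round(w, 1)))
    return shapes

anchors = anchor_shapes()
print(len(anchors))  # 20
```

The extreme 1:3 and 3:1 entries produce long, thin boxes, which is why they help cover elongated vascularized lesions.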
<p>Based on the detected pulmonary nodules, a rectangular region of 30&#x20;&#xd7; 30 or 40&#x20;&#xd7; 40 pixels was defined from the coordinates of the upper-left and lower-right corners of the detailed annotation rectangle. Taking the slice containing the most prominent nodule coordinates as the center, the same rectangular box was cropped from the five slices before it and the five slices after it to construct a 3D block. The same processing was applied to the CT images of every sequence in each data set to establish a long-term pulmonary nodule sequence image data set.</p>
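The block-construction step (center slice plus five slices on either side, cropped to a fixed rectangle) can be sketched with NumPy; the coordinates and volume size below are illustrative:

```python
import numpy as np

def build_block(ct_volume, center_slice, x1, y1, size=40):
    """Crop an 11-slice 3D block around the most prominent nodule slice.

    ct_volume: (num_slices, H, W); (x1, y1) is the upper-left corner of the
    annotated rectangle; size is 30 or 40 depending on the nodule.
    """
    lo, hi = center_slice - 5, center_slice + 5 + 1  # five before, five after
    assert lo >= 0 and hi <= ct_volume.shape[0], "nodule too close to volume edge"
    return ct_volume[lo:hi, y1:y1 + size, x1:x1 + size]

ct = np.random.rand(120, 512, 512).astype(np.float32)
block = build_block(ct, center_slice=60, x1=200, y1=240, size=40)
print(block.shape)  # (11, 40, 40)
```

Each resulting block is one 3D sample for the cnn-30 or cnn-40 channel, depending on its crop size.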
</sec>
<sec id="s3-2">
<title>3.2 Spatiotemporal Feature Extraction</title>
<p>Feature extraction methods for pulmonary lesions can generally be divided into traditional methods and deep learning methods. Generally speaking, traditional feature extraction methods can describe only a specific type of information. Deep learning, such as 2D CNNs, has achieved good results in image feature extraction and can express high-level semantic information about lesions. However, 2D CNN&#x2013;based solutions still cannot make full use of the 3D spatial context of pulmonary nodules to extract benign and malignant information with temporal and spatial characteristics. Therefore, this article proposes a new method that extracts benign and malignant features of pulmonary nodules from CT sequences using 3D CNNs. Compared with a 2D CNN, a 3D CNN can encode more spatial information and extract more discriminative spatial information through a hierarchical structure trained on 3D samples.</p>
<p>Features extracted by DCNNs can represent the inherent semantic information of images (<xref ref-type="bibr" rid="B25">Kamnitsas et&#x20;al., 2016</xref>). With the emergence of deep neural networks in computer vision, 3D CNNs have developed rapidly in the past few years. Although 3D medical data are very common in clinical practice, 3D CNNs are still in their infancy in medical applications. Furthermore, tuning the hyperparameters of thousands of filters on large data sets remains a significant challenge. To alleviate this problem, transferring a pretrained 3D CNN to a specific application scenario is a very efficient and simple solution (<xref ref-type="bibr" rid="B1">Aaron et&#x20;al., 2018</xref>).</p>
<p>We propose a two-channel network that accepts inputs of different sizes. The main structure of our multilevel 3D CNN framework is shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>. Each network has four convolutional layers, and both cnn-30 and cnn-40 contain a fully connected layer. A batch normalization layer is inserted after each hidden layer to permit a higher learning rate and reduce overfitting, and a dropout layer is added to reduce overfitting further.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>The main network structure of multiscale 3D CNN framework. C is the 3D convolutional layer; MP represents the 3D maximum pooling layer, whereas FC is the full connection&#x20;layer.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g003.tif"/>
</fig>
<p>The two architectures each output a two-class prediction of nodule versus nonnodule through a SoftMax layer on top, together with a 256-D feature vector from the last hidden layer. Their outputs are then combined into a single classification result for the given original 3D volume. These features are used for feature fusion and for predicting the classification of pulmonary nodules. We adopted a late-fusion strategy: the two feature vectors from the last hidden layers of the CNNs are concatenated into a complete feature vector and sent to the prediction module. <xref ref-type="table" rid="T1">Table&#x20;1</xref> details the network configuration.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Architecture of the multilevel contextual 3D CNNs.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th colspan="3" align="center">Archi-1</th>
<th colspan="3" align="center">Archi-2</th>
</tr>
<tr>
<th align="left">Layer</th>
<th align="center">Kernel</th>
<th align="center">Channel</th>
<th align="center">Layer</th>
<th align="center">Kernel</th>
<th align="center">Channel</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Input</td>
<td align="center">&#x2014;</td>
<td align="char" char=".">1</td>
<td align="center">Input</td>
<td align="center">&#x2014;</td>
<td align="char" char=".">1</td>
</tr>
<tr>
<td align="left">C1</td>
<td align="center">5 &#xd7; 5 &#xd7; 5</td>
<td align="char" char=".">64</td>
<td align="center">C1</td>
<td align="center">5 &#xd7; 5 &#xd7; 5</td>
<td align="char" char=".">64</td>
</tr>
<tr>
<td align="left">M1</td>
<td align="center">2 &#xd7; 2 &#xd7; 2</td>
<td align="char" char=".">64</td>
<td align="center">M1</td>
<td align="center">2 &#xd7; 2 &#xd7; 2</td>
<td align="char" char=".">64</td>
</tr>
<tr>
<td align="left">C2</td>
<td align="center">2 &#xd7; 2 &#xd7; 2</td>
<td align="char" char=".">128</td>
<td align="center">C2</td>
<td align="center">5 &#xd7; 5 &#xd7; 5</td>
<td align="char" char=".">128</td>
</tr>
<tr>
<td align="left">M2</td>
<td align="center">2 &#xd7; 2 &#xd7; 2</td>
<td align="char" char=".">128</td>
<td align="center">M2</td>
<td align="center">2 &#xd7; 2 &#xd7; 2</td>
<td align="char" char=".">128</td>
</tr>
<tr>
<td align="left">C3</td>
<td align="center">3 &#xd7; 3 &#xd7; 3</td>
<td align="char" char=".">256</td>
<td align="center">C3</td>
<td align="center">2 &#xd7; 2 &#xd7; 2</td>
<td align="char" char=".">256</td>
</tr>
<tr>
<td align="left">FC1</td>
<td align="left"/>
<td align="char" char=".">256</td>
<td align="center">FC1</td>
<td align="center">&#x2014;</td>
<td align="char" char=".">256</td>
</tr>
</tbody>
</table>
</table-wrap>
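The two-channel, late-fusion design in Table 1 can be sketched in PyTorch. Padding, pooling placement, the global pooling before FC1, and the dropout rate are assumptions not specified in the table; the per-branch SoftMax heads are folded into a single fused classifier here for brevity:

```python
import torch
import torch.nn as nn

def branch(c2_kernel, c3_kernel):
    """One 3D-CNN channel following Table 1 (C1 5x5x5/64, M1 2x2x2,
    C2/128, M2 2x2x2, C3/256, FC1 256); padding choices are assumptions."""
    return nn.Sequential(
        nn.Conv3d(1, 64, kernel_size=5, padding=2), nn.BatchNorm3d(64), nn.ReLU(),
        nn.MaxPool3d(2),
        nn.Conv3d(64, 128, kernel_size=c2_kernel, padding=c2_kernel // 2),
        nn.BatchNorm3d(128), nn.ReLU(),
        nn.MaxPool3d(2),
        nn.Conv3d(128, 256, kernel_size=c3_kernel, padding=c3_kernel // 2),
        nn.BatchNorm3d(256), nn.ReLU(),
        nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        nn.Linear(256, 256), nn.Dropout(0.5),  # FC1 -> 256-D feature vector
    )

class MultiScale3DCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.archi1 = branch(c2_kernel=2, c3_kernel=3)  # fed 30-scale blocks
        self.archi2 = branch(c2_kernel=5, c3_kernel=2)  # fed 40-scale blocks
        # Late fusion: concatenate the two 256-D vectors, then classify.
        self.classifier = nn.Linear(512, 2)

    def forward(self, x30, x40):
        fused = torch.cat([self.archi1(x30), self.archi2(x40)], dim=1)
        return self.classifier(fused), fused  # prediction and fused feature

model = MultiScale3DCNN()
logits, feat = model(torch.randn(2, 1, 11, 30, 30),
                     torch.randn(2, 1, 11, 40, 40))
print(logits.shape, feat.shape)  # torch.Size([2, 2]) torch.Size([2, 512])
```

The 512-D fused vector is the per-period feature that is later passed to the time-modulated LSTM.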
<p>A batch of 3D training samples is expressed as <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2026;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mtext>i</mml:mtext>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mtext>i</mml:mtext>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>m</italic> is the number of samples, <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the input sample, and <inline-formula id="inf3">
<mml:math id="m3">
<mml:mrow>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the real label corresponding to the sample. <inline-formula id="inf4">
<mml:math id="m4">
<mml:mrow>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mn>0,1</mml:mn>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where 0 represents a benign nodule and 1 represents a malignant nodule. <inline-formula id="inf5">
<mml:math id="m5">
<mml:mrow>
<mml:msup>
<mml:mtext>p</mml:mtext>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted probability, and &#x3b8; represents all trainable parameters in the model. In this article, the weighting factor <inline-formula id="inf6">
<mml:math id="m6">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mn>0,1</mml:mn>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, and the adjustable focus parameter, <inline-formula id="inf7">
<mml:math id="m7">
<mml:mrow>
<mml:mi>&#x3b3;</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, are used to address the class imbalance problem and to focus training on harder samples. The overall objective function is the average of the sample losses, as shown in <xref ref-type="disp-formula" rid="e1">Eq. 1</xref>, minimizing <inline-formula id="inf8">
<mml:math id="m8">
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> by optimizing network parameters.<disp-formula id="e1">
<mml:math id="m9">
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>m</mml:mi>
</mml:mfrac>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>&#x3b3;</mml:mi>
</mml:msup>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>log</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>&#x3b3;</mml:mi>
</mml:msup>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>log</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
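Eq. 1 is a focal-loss-style weighted cross-entropy; it can be sketched directly in NumPy (the function and variable names below are ours, not the paper's, and the default α and γ are illustrative):

```python
import numpy as np

def focal_loss(y, p, alpha=0.25, gamma=2.0, eps=1e-12):
    """Objective of Eq. 1: class-weighted, focused cross-entropy.

    y     : array of true labels in {0, 1} (0 = benign, 1 = malignant)
    p     : array of predicted malignancy probabilities p^i(theta)
    alpha : class-balance weight, alpha in [0, 1]
    gamma : focusing parameter, gamma >= 0 (gamma = 0 recovers
            alpha-weighted cross-entropy)
    """
    p = np.clip(p, eps, 1.0 - eps)  # guard against log(0)
    pos = alpha * y * (1.0 - p) ** gamma * np.log(p)
    neg = (1.0 - alpha) * (1.0 - y) * p ** gamma * np.log(1.0 - p)
    return -np.mean(pos + neg)
```

With γ = 0 and α = 0.5 this reduces to half the usual binary cross-entropy; larger γ shrinks the contribution of already well-classified samples, which is what focuses training on the hard cases.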
</sec>
<sec id="s3-3">
<title>3.3&#x20;Long-Term Lung Lesion Prediction Based on the T-LSTM Model</title>
<p>In this article, the long-term pulmonary nodule sequence image data set prepared in <xref ref-type="sec" rid="s3-1">Section 3.1</xref> was used to construct a model for predicting whether long-term pulmonary nodules are benign or malignant. The LSTM is a deep recurrent network architecture: the connections between hidden units form a directed cycle, and this feedback loop lets the network retain the previous hidden state as internal memory. RNNs are therefore preferred for problems in which the system needs to store and update context information (<xref ref-type="bibr" rid="B30">Li et&#x20;al., 2020</xref>). The hidden Markov model (HMM) and other methods are also used for similar purposes, but RNNs have characteristics that distinguish them from such traditional methods. For example, an RNN can handle variable-length sequences without assuming the Markov property, and, in principle, past inputs can be kept in memory without a fixed time horizon. In practice, however, optimizing long-term dependencies is not always possible: when gradients become too small or too large, they vanish or explode. To capture long-term dependencies without disrupting the optimization process, variants of the RNN have been proposed. One popular variant is the LSTM, which handles long-term dependencies using gated structures (<xref ref-type="bibr" rid="B24">Huimei et&#x20;al., 2020</xref>).</p>
<p>However, traditional LSTMs are not suitable for our task because the time between consecutive follow-ups of a patient is variable (<xref ref-type="bibr" rid="B43">Zhang et&#x20;al., 2020b</xref>), and they have no mechanism to explicitly model the arrival time of each observation (<xref ref-type="bibr" rid="B4">Baytas et&#x20;al., 2017</xref>). In fact, LSTMs and, more generally, RNNs have been shown to perform poorly on time series with irregularly sampled or missing values (<xref ref-type="bibr" rid="B10">Che et&#x20;al., 2018</xref>). Previous attempts to apply LSTMs to irregularly sampled data points mainly focused on accelerating the convergence of the algorithm or on reducing short-term memory in settings with high-resolution sampling&#x20;data.</p>
<p>For the first time, we propose a temporal-information-enhanced LSTM neural network (T-LSTM) that combines recurrent time labels with an RNN, making the best use of temporal features to improve the accuracy of short-term prediction. The long-term lung lesion prediction algorithm for the T-LSTM is shown as <xref ref-type="other" rid="algorithm_1">Algorithm 1</xref>.</p>
<p>To solve these problems, we introduce two simple modifications to the standard LSTM architecture, called T-LSTM, both of which explicitly use the input-related time index. In the architecture proposed in this article, all images of a given patient are first processed by a CNN architecture, which extracts a set of image features, denoted by <italic>X</italic>
<sup>&#x2c6;<italic>t</italic>
</sup>, at each time step. The LSTM takes as inputs <inline-formula id="inf9">
<mml:math id="m10">
<mml:mrow>
<mml:msubsup>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, that is, the radiological labels describing the images acquired at the previous time step, the current image features <inline-formula id="inf10">
<mml:math id="m11">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, and the time lapse between <inline-formula id="inf11">
<mml:math id="m12">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf12">
<mml:math id="m13">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, which we denote as <italic>&#x3b4;</italic><sup><italic>t</italic></sup>.</p>
<p>For the last image in the sequence, the LSTM predicts the image labels <inline-formula id="inf13">
<mml:math id="m14">
<mml:mrow>
<mml:msubsup>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, called <inline-formula id="inf14">
<mml:math id="m15">
<mml:mrow>
<mml:msubsup>
<mml:mtext>y</mml:mtext>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>. The cell structure of the T-LSTM is shown in <xref ref-type="fig" rid="F4">Figure&#x20;4</xref>. The equations below define the T-LSTM unit:<disp-formula id="e2">
<mml:math id="m16">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>&#x3b4;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>f</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
<disp-formula id="e3">
<mml:math id="m17">
<mml:mrow>
<mml:msub>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>&#x3b4;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
<disp-formula id="e4">
<mml:math id="m18">
<mml:mrow>
<mml:msub>
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>&#x3b4;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>o</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
<disp-formula id="e5">
<mml:math id="m19">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>tanh</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mi>&#x3b4;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>
<disp-formula id="e6">
<mml:math id="m20">
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>
<disp-formula id="e7">
<mml:math id="m21">
<mml:mrow>
<mml:msup>
<mml:mi>y</mml:mi>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:mi>tanh</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(7)</label>
</disp-formula>
</p>
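Eqs. 2&#x2013;7 can be sketched directly in NumPy. The array sizes, the weight-dictionary naming, and the random demonstration inputs below are our own illustrative assumptions, and we read the subscript of <italic>c</italic> in Eq. 6 as the candidate memory of Eq. 5:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def t_lstm_step(l_prev, x_t, delta_t, h_prev, W, b):
    """One T-LSTM step following Eqs. 2-7: each gate mixes the previous
    radiological label vector l_prev, the current CNN features x_t, and
    the scalar time lapse delta_t between consecutive scans."""
    def gate(g, act):
        return act(W[g + "l"] @ l_prev + W[g + "x"] @ x_t
                   + W[g + "j"] * delta_t + b[g])
    f_t = gate("f", sigmoid)        # forget gate,      Eq. 2
    i_t = gate("i", sigmoid)        # input gate,       Eq. 3
    o_t = gate("o", sigmoid)        # output gate,      Eq. 4
    c_t = gate("c", np.tanh)        # candidate memory, Eq. 5
    h_t = f_t * h_prev + i_t * c_t  # state update,     Eq. 6
    y_t = o_t * np.tanh(h_t)        # output,           Eq. 7
    return h_t, y_t

# usage: three follow-up scans with irregular time gaps; per Step 1 of
# Algorithm 1, the missing initial state is set to 0
rng = np.random.default_rng(0)
H, D, L = 8, 16, 4                  # hidden/feature/label sizes (ours)
W = {g + "l": rng.normal(0, 0.1, (H, L)) for g in "fioc"}
W |= {g + "x": rng.normal(0, 0.1, (H, D)) for g in "fioc"}
W |= {g + "j": rng.normal(0, 0.1, H) for g in "fioc"}
b = {g: np.zeros(H) for g in "fioc"}
h = np.zeros(H)
for delta in (0.0, 6.0, 14.0):      # e.g. months between follow-ups
    h, y = t_lstm_step(rng.normal(size=L), rng.normal(size=D), delta, h, W, b)
```

Because the time lapse &#x3b4;<sup><italic>t</italic></sup> enters every gate with its own weight vector, the cell can discount or emphasize the previous state depending on how long ago the previous scan was acquired.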
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>T-LSTM&#x20;cell.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g004.tif"/>
</fig>
<p>
<statement content-type="algorithm" id="algorithm_1">
<label>Algorithm 1</label>
<p>Long-term lung lesion prediction algorithm for T-LSTM.<list list-type="simple">
<list-item>
<p>Input: fusions of pulmonary nodules at different periods of the same patient <italic>X</italic>
<sup>&#x2c6;<italic>t</italic>
</sup>, <italic>t</italic>&#x20;&#x3d; 1, 2,&#x20;3;</p>
</list-item>
<list-item>
<p>Output: The results of classification&#x20;{<italic>0,1</italic>}.</p>
</list-item>
<list-item>
<label>
<bold>Step 1:</bold>
</label>
<p>To calculate the first hidden state C0, the previous state Ct&#x2212;1 is needed; since it does not exist, it is initialized to&#x20;0.</p>
</list-item>
<list-item>
<label>
<bold>Step 2:</bold>
</label>
<p>Calculate the input gate as in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref>, using the benign/malignant label of the lesion sequence image at time <italic>t</italic>&#x2212;1, the feature vector of the lesion sequence image at time <italic>t</italic>, and the time interval between <italic>t</italic>&#x2212;1 and <italic>t</italic>; the activation function is applied after summation.</p>
</list-item>
<list-item>
<label>
<bold>Step 3:</bold>
</label>
<p>Calculate the forget gate as in <xref ref-type="disp-formula" rid="e2">Eq. 2</xref>, using the benign/malignant label of the lesion sequence image at time <italic>t</italic>&#x2212;1, the feature vector of the lesion sequence image at time <italic>t</italic>, and the time interval between <italic>t</italic>&#x2212;1 and <italic>t</italic>; the activation function is applied after summation.</p>
</list-item>
<list-item>
<label>
<bold>Step 4:</bold>
</label>
<p>Calculate the output gate as in <xref ref-type="disp-formula" rid="e4">Eq. 4</xref>, using the benign/malignant label of the lesion sequence image at time <italic>t</italic>&#x2212;1, the feature vector of the lesion sequence image at time <italic>t</italic>, and the time interval between <italic>t</italic>&#x2212;1 and <italic>t</italic>; the activation function is then applied.</p>
</list-item>
<list-item>
<label>
<bold>Step 5:</bold>
</label>
<p>Calculate the candidate memory unit (not computed at the first layer), as shown in <xref ref-type="disp-formula" rid="e5">Eq. 5</xref>, using the benign/malignant label of the lesion sequence image at time <italic>t</italic>&#x2212;1, the feature vector of the lesion sequence image at time <italic>t</italic>, and the time interval between <italic>t</italic>&#x2212;1 and <italic>t</italic>; the activation function is applied after summation.</p>
</list-item>
<list-item>
<label>
<bold>Step 6:</bold>
</label>
<p>Calculate the hidden unit, as in <xref ref-type="disp-formula" rid="e6">Eq.&#x20;6</xref>.</p>
</list-item>
<list-item>
<label>
<bold>Step 7:</bold>
</label>
<p>Repeat <xref ref-type="disp-formula" rid="e2">Eqs 2</xref>&#x2013;<xref ref-type="disp-formula" rid="e6">6</xref> to calculate the inputs and outputs layer by&#x20;layer.</p>
</list-item>
</list>
</p>
</statement>
</p>
</sec>
</sec>
<sec id="s4">
<title>4 Experiments and Results</title>
<sec id="s4-1">
<title>4.1 Data Sets</title>
<p>In order to train the CNN classifier, we used two labeled lung data sets: the NLST data set and a data set provided by the cooperative hospital.</p>
<p>
<italic>NLST</italic> (National Lung Screening Trial) data set. The NLST is a randomized, multisite trial that examined lung cancer&#x2013;specific mortality among participants in an asymptomatic high-risk cohort. Subjects underwent screening with the use of low-dose CT or chest X-ray. More than 53,000 participants each underwent three annual screenings from 2002 to 2007 (approximately 25,500 in the LDCT study arm), with follow-up postscreening through 2009. Lung cancers identified as pulmonary nodules were confirmed by diagnostic procedures (e.g., biopsy, cytology); participants with confirmed lung cancer were subsequently removed from the trial for treatment through 2009. NLST contains 421 CT scans annotated voxel-wise by four radiological experts.</p>
<p>The cooperative hospital provided lung CT images of 267 patients, a total of 1,837 cases. The pulmonary CT images were acquired at the positron emission tomography (PET)/CT center of a hospital in Shanxi Province between January 2011 and January 2017. The equipment used was a GE Discovery ST16 PET/CT scanner, with acquisition parameters of 150&#xa0;mA, 140&#xa0;kV, a layer thickness of 3.75&#xa0;mm, and an image resolution of 512&#x20;&#xd7; 512. Two professional radiologists marked the nodule locations, and each case was labeled 1 (malignant) or 0 (benign).</p>
</sec>
<sec id="s4-2">
<title>4.2 Input Description</title>
<p>We determined the size of the receptive field used in our framework by analyzing the size distribution of pulmonary nodules. First, we observed that the diameter density of small nodules peaks at about 9 voxels in the X and Y dimensions and about 4 voxels in the Z dimension. We therefore set the first network, Archi-1, with a receptive field of 30&#x20;&#xd7; 30&#x20;&#xd7; 10 voxels. This receptive field can contain small pulmonary nodules with appropriate context and covers 85% of all nodules in the data set; its purpose is to provide rich background information for small nodules and adequate background information for medium-sized lesions. For some large nodules, it can usually include their main parts while excluding some marginal areas. Finally, we constructed an overall receptive field of 40&#x20;&#xd7; 40&#x20;&#xd7; 10 voxels which, according to our statistical analysis, bounds more than 99% of nodules, apart from a few outliers.</p>
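The coverage percentages above come from checking each nodule's extent against a candidate receptive field; a minimal sketch of that check, on hypothetical measured extents rather than the paper's data:

```python
import numpy as np

def coverage(extents_xyz, field=(30, 30, 10)):
    """Fraction of nodules whose (x, y, z) extents in voxels fit inside
    the given receptive field, e.g. Archi-1's 30 x 30 x 10."""
    fits = np.all(np.asarray(extents_xyz) <= np.asarray(field), axis=1)
    return fits.mean()

# hypothetical extents; small nodules peak near 9 x 9 x 4 voxels
demo = np.array([[9, 9, 4], [12, 11, 5], [35, 33, 8], [45, 40, 12]])
```

Running `coverage` on the measured size distribution at several candidate window sizes is enough to reproduce the kind of 85%-vs-99% trade-off the text describes.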
</sec>
<sec id="s4-3">
<title>4.3 Classification Accuracy Comparison of 3D CNN Feature Extraction Methods With Different Parameters</title>
<p>This article adopts uniform random sampling to divide the NLST data set into training, validation, and test sets: one-tenth of the NLST data set is held out as a test set, and the remaining data are divided into training and validation sets. Because, in clinical practice, the model must handle data that differ significantly from the training data, we use both the data set provided by the cooperative hospital and the NLST test set as test sets when selecting the training scheme.</p>
<p>During training, we experimented on the two validation sets with four aspects: the ratio of positive to negative samples, the dropout layer with maxnorm regularization, weight initialization, and data augmentation, exploring how different combinations affect the model&#x2019;s lung nodule predictions on the NLST and cooperative hospital test sets. The network parameters are shown in <xref ref-type="table" rid="T2">Tables 2</xref>, <xref ref-type="table" rid="T3">3</xref>, which report sensitivity, specificity, accuracy, and <italic>F</italic>1 score as the four metrics used to evaluate nodule classification. The positive-to-negative sample ratios tested are 1:20, 1:10, 1:5, 1:3, and&#x20;1:2.</p>
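The four evaluation metrics in Tables 2 and 3 follow the standard definitions from the confusion counts; a minimal sketch, treating malignant as the positive class as in the paper's labeling:

```python
def nodule_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and F1 score from confusion
    counts, with malignant nodules as the positive class."""
    sensitivity = tp / (tp + fn)                  # true-positive rate
    specificity = tn / (tn + fp)                  # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, accuracy, f1
```

Note that with heavily imbalanced test sets, accuracy alone can look high even when sensitivity is poor, which is why all four metrics are reported.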
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>The classification results and network parameters on NLST test set.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="center">Sensitivity</th>
<th align="center">Specificity</th>
<th align="center">Accuracy</th>
<th align="center">F1 score</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">1:20 &#x2b; Dropout</td>
<td align="char" char=".">0.801</td>
<td align="char" char=".">0.999</td>
<td align="char" char=".">0.905</td>
<td align="char" char=".">0.891</td>
</tr>
<tr>
<td align="left">1:20 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.752</td>
<td align="char" char=".">0.998</td>
<td align="char" char=".">0.883</td>
<td align="char" char=".">0.863</td>
</tr>
<tr>
<td align="left">1:10 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.861</td>
<td align="char" char=".">0.998</td>
<td align="char" char=".">0.921</td>
<td align="char" char=".">0.904</td>
</tr>
<tr>
<td align="left">1:5 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.908</td>
<td align="char" char=".">0.994</td>
<td align="char" char=".">0.949</td>
<td align="char" char=".">0.923</td>
</tr>
<tr>
<td align="left">1:3 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.917</td>
<td align="char" char=".">0.994</td>
<td align="char" char=".">0.953</td>
<td align="char" char=".">0.913</td>
</tr>
<tr>
<td align="left">1:2 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.924</td>
<td align="char" char=".">0.991</td>
<td align="char" char=".">0.957</td>
<td align="char" char=".">0.924</td>
</tr>
<tr>
<td align="left">1:2 &#x2b; Dropout &#x2b; Maxnorm &#x2b; Lecun</td>
<td align="char" char=".">0.932</td>
<td align="char" char=".">
<bold>0.989</bold>
</td>
<td align="char" char=".">0.954</td>
<td align="char" char=".">0.917</td>
</tr>
<tr>
<td align="left">1:2 &#x2b; Dropout &#x2b; Maxnorm &#x2b; Lecun &#x2b; Aug</td>
<td align="char" char=".">0.943</td>
<td align="char" char=".">
<bold>0.985</bold>
</td>
<td align="char" char=".">
<bold>0.965</bold>
</td>
<td align="char" char=".">0.929</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values indicate the best performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>The classification results and network parameters on cooperative hospital test&#x20;set.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="center">Sensitivity</th>
<th align="center">Specificity</th>
<th align="center">Accuracy</th>
<th align="center">F1 score</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">1:20 &#x2b; Dropout</td>
<td align="char" char=".">0.711</td>
<td align="char" char=".">0.908</td>
<td align="char" char=".">0.815</td>
<td align="char" char=".">0.864</td>
</tr>
<tr>
<td align="left">1:20 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.705</td>
<td align="char" char=".">0.900</td>
<td align="char" char=".">0.800</td>
<td align="char" char=".">0.848</td>
</tr>
<tr>
<td align="left">1:10 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.721</td>
<td align="char" char=".">0.908</td>
<td align="char" char=".">0.817</td>
<td align="char" char=".">0.864</td>
</tr>
<tr>
<td align="left">1:5 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.717</td>
<td align="char" char=".">0.907</td>
<td align="char" char=".">0.813</td>
<td align="char" char=".">0.871</td>
</tr>
<tr>
<td align="left">1:3 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.709</td>
<td align="char" char=".">0.886</td>
<td align="char" char=".">0.799</td>
<td align="char" char=".">0.862</td>
</tr>
<tr>
<td align="left">1:2 &#x2b; Dropout &#x2b; Maxnorm</td>
<td align="char" char=".">0.698</td>
<td align="char" char=".">0.870</td>
<td align="char" char=".">0.779</td>
<td align="char" char=".">0.952</td>
</tr>
<tr>
<td align="left">1:2 &#x2b; Dropout &#x2b; Maxnorm &#x2b; Lecun</td>
<td align="char" char=".">0.760</td>
<td align="char" char=".">0.943</td>
<td align="char" char=".">0.851</td>
<td align="char" char=".">0.878</td>
</tr>
<tr>
<td align="left">1:2 &#x2b; Dropout &#x2b; Maxnorm &#x2b; Lecun &#x2b; Aug</td>
<td align="char" char=".">0.814</td>
<td align="char" char=".">
<bold>0.946</bold>
</td>
<td align="char" char=".">0.880</td>
<td align="char" char=".">0.901</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values indicate the best performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>First, it can be seen from <xref ref-type="table" rid="T2">Tables 2</xref>, <xref ref-type="table" rid="T3">3</xref> that when positive samples are scarce relative to negative ones, overall accuracy can appear high even while sensitivity suffers; balancing positive and negative samples during training is therefore very important in this article. The main purpose of this model is to predict, from hundreds of thousands of chest CT image sequences, lesion areas suggestive of benignity or malignancy, so that doctors can prescreen&#x20;them.</p>
<p>It can be seen from <xref ref-type="table" rid="T4">Tables 4 </xref>and <xref ref-type="table" rid="T5">5</xref> that the accuracy of the basic tanh-RNN can reach 87.1%, which verifies that the RNN is able to learn discriminative features. The support vector machine (SVM) is a traditional feature extraction and classification method; as it cannot learn deep hidden features and the relationships among them, its accuracy is relatively low. The T-LSTM network proposed in this article outperforms the RNN, which shows that modeling the continuous change of related observations over time helps further improve prediction accuracy.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Comparison of prediction performance of different methods.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Algorithm</th>
<th align="center">ACC (%)</th>
<th align="center">Pre (%)</th>
<th align="center">Rec (%)</th>
<th align="center">
<italic>F</italic>1 score</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SVM</td>
<td align="char" char=".">0.812</td>
<td align="char" char=".">0.818</td>
<td align="char" char=".">0.813</td>
<td align="char" char=".">0.819</td>
</tr>
<tr>
<td align="left">tanh-RNN</td>
<td align="char" char=".">0.871</td>
<td align="char" char=".">0.936</td>
<td align="char" char=".">0.778</td>
<td align="char" char=".">0.874</td>
</tr>
<tr>
<td align="left">LSTM</td>
<td align="char" char=".">0.911</td>
<td align="char" char=".">0.943</td>
<td align="char" char=".">0.875</td>
<td align="char" char=".">0.903</td>
</tr>
<tr>
<td align="left">T-LSTM</td>
<td align="char" char=".">
<bold>0.928</bold>
</td>
<td align="char" char=".">
<bold>0.948</bold>
</td>
<td align="char" char=".">
<bold>0.927</bold>
</td>
<td align="char" char=".">
<bold>0.938</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values indicate the best performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Results for all models, AUROC, and specificity at sensitivity (SPC@SEN) of 0.87, with 95% confidence interval (CI) displayed in brackets.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">AUROC [CI]</th>
<th align="center">SPC@SEN 0.87 [CI]</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">LSTM</td>
<td align="center">0.82 [0.732&#x2013;0.821]</td>
<td align="center">0.62 [0.401&#x2013;0.705]</td>
</tr>
<tr>
<td align="left">xgb</td>
<td align="center">0.84 [0.789&#x2013;0.880]</td>
<td align="center">0.75 [0.534&#x2013;0.721]</td>
</tr>
<tr>
<td align="left">BI-LSTM</td>
<td align="center">0.90 [0.802&#x2013;0.908]</td>
<td align="center">0.72 [0.543&#x2013;0.813]</td>
</tr>
<tr>
<td align="left">T-LSTM</td>
<td align="center">
<bold>0.93[0.825&#x2013;0.921]</bold>
</td>
<td align="center">0.78 [0.561&#x2013;0.921]</td>
</tr>
<tr>
<td align="left">RNN</td>
<td align="center">0.88 [0.744&#x2013;0.851]</td>
<td align="center">0.59 [0.371&#x2013;0.752]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>&#x2a;<italic>p</italic>&#x20;&#x3c; 1e-6 compared with RNN. The bold values is the best performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s4-4">
<title>4.4 Discussion of the Number of LSTM Layers</title>
<p>The number of network layers directly affects the network&#x2019;s ability to extract features of lung nodules. Theoretically, more hidden layers make the network structure more complex, giving the network stronger feature extraction ability and higher accuracy. However, blindly increasing the number of layers makes training more difficult, greatly prolongs learning time, and degrades accuracy. In this article, network structures with different numbers of hidden layers are studied while the other network parameters remain unchanged, and each configuration&#x2019;s result is averaged over 10 runs. Generally speaking, the more LSTM layers, the stronger the ability to learn higher-level temporal representations. At the same time, a layer of an ordinary neural network is added to reduce the dimension of the output.</p>
<p>As can be seen from <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>, the prediction accuracy first increases and then decreases as the number of network layers grows. With four layers, the overall accuracy is higher than with any other setting. With six layers, the network is too deep to converge easily, and the highly abstract features weaken the discrimination between benign and malignant nodules, so the result falls into a local extremum and accuracy drops.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Layer number experimental result diagram.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g005.tif"/>
</fig>
</sec>
<sec id="s4-5">
<title>4.5 Comparison of Convergence Effect of T-LSTM</title>
<p>This section compares the training performance of the T-LSTM with the LSTM and the Bi-directional Long Short-Term Memory (BI-LSTM). In theory, the BI-LSTM takes about twice as much time as the LSTM because of its bidirectional structure, and the single-cycle time of the T-LSTM is approximately 1.4&#x20;times that of the LSTM because the T-LSTM receives more input data. As shown in <xref ref-type="fig" rid="F6">Figures 6</xref>&#x2013;<xref ref-type="fig" rid="F8">8</xref>, although the LSTM converged faster than the T-LSTM and BI-LSTM at the beginning of training, the measured time of the BI-LSTM was only 1.5&#x20;times that of the LSTM, and that of the T-LSTM only 1.2&#x20;times, owing to data-reading speed and other factors. After some training epochs, when the LSTM and BI-LSTM gradually plateau at a constant value, the T-LSTM continues to converge. In terms of recognition performance, the T-LSTM outperforms the other two. Considering both convergence and recognition performance, the validity of the time-modulated recurrent neural network structure is demonstrated.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Comparison of convergence LER results between T-LSTM and BI-LSTM and LSTM.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g006.tif"/>
</fig>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Comparison of convergence Rec results between T-LSTM and BI-LSTM and LSTM.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g007.tif"/>
</fig>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>ROC curve of each model. Blue is LSTM; orange is gradient boost (xgb); green is BiLSTM; red is T-LSTM; and purple is RNN.</p>
</caption>
<graphic xlink:href="fbioe-10-791424-g008.tif"/>
</fig>
</sec>
<sec id="s4-6">
<title>4.6 Comparison of Prediction Rates Among Different Classifiers</title>
<p>The RNN classifier does not incorporate <italic>a priori</italic> knowledge. The area under the receiver operating characteristic curve (AUROC) obtained on the evaluation set is 0.88, the sensitivity is 0.87, and the specificity is 0.59. LSTM did not improve the accuracy, which decreased slightly compared with RNN (<xref ref-type="table" rid="T4">Table&#x20;4</xref>). BI-LSTM increased the AUROC to 0.90 and the specificity to 0.64, although the gain was not statistically significant. Gradient boosting yielded a more pronounced improvement in specificity (AUROC 0.84, specificity 0.75, <italic>p</italic>&#x20;&#x3c; 1e&#x2212;6). The T-LSTM network further improved performance, with an AUROC of 0.93, a specificity of 0.78, and a sensitivity of 0.87. The ROC curves of the five classifiers are given in <xref ref-type="fig" rid="F8">Figure&#x20;8</xref>.</p>
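<p>The reported operating-point metrics can be reproduced from raw prediction scores in a few lines. The sketch below (plain NumPy on toy data, not the paper&#x2019;s evaluation set) computes sensitivity, specificity, and AUROC via the rank-sum formulation.</p>

```python
import numpy as np

def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_score) >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fn = np.sum(~y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    fp = np.sum(y_pred & (y_true == 0))
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outscores a
    random negative (Mann-Whitney U), counting ties as half."""
    y_true = np.asarray(y_true)
    pos = np.asarray(y_score)[y_true == 1]
    neg = np.asarray(y_score)[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

<p>For labels [0, 0, 1, 1] and scores [0.1, 0.4, 0.35, 0.8], the AUROC is 0.75, since three of the four positive&#x2013;negative score pairs are correctly ordered.</p>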
<p>Because the network&#x2019;s ability to learn from image sequences is limited by its depth, RNN does not perform as well as BI-LSTM. In future work, we intend to validate our results on a larger evaluation set. A further improvement would be to train the 3D CNN and T-LSTM networks simultaneously, realizing joint optimization of the whole classification architecture. In addition, we will consider the role of clinical information in guiding classification. Finally, we can also evaluate the effect of using multiple <italic>a priori</italic> or neighborhood knowledge in the training set. In conclusion, incorporating long time-series images into the deep learning analysis framework can improve classification performance and enhance radiologists&#x2019; confidence in the reliability of decision-support technology.</p>
</sec>
</sec>
<sec id="s5">
<title>5 Discussion</title>
<p>Deep learning algorithms have been reported to achieve high performance in medical image classification tasks (<xref ref-type="bibr" rid="B27">Kooi and Karssemeijer, 2017</xref>; <xref ref-type="bibr" rid="B37">Ribli et&#x20;al., 2018</xref>). However, current algorithms still perform below the average level of human radiologists on real-world data. One explanation for this gap is that radiologists bring additional information into their diagnostic analysis, such as non-image clinical information and patient-specific information. We address the latter by allowing our algorithm to analyze both current and previous studies. Most of the literature in this field does not take patients&#x2019; time-series characteristics into account, so performance is difficult to compare accurately. On different data sets, AUROC values for cancer classification ranged from 0.79 to 0.95. The AUROCs without and with temporal information are 0.82 and 0.93, respectively, which differs from the related work (<xref ref-type="bibr" rid="B27">Kooi and Karssemeijer, 2017</xref>) and reflects the significant benefit of using previous studies. The advantage of our method is that it only requires a comprehensive label for each pulmonary nodule, without expensive local lesion annotation. The experimental results show that classifying images separately is not sufficient; improvement comes only from training the classification algorithm on long time-series images.</p>
<p>As can be seen in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, RNN suffers from two serious problems, gradient explosion and gradient vanishing, so its subsequent training results are poor. LSTM improves the gradient update process: the gradient is mainly produced by accumulating the outputs of the gates, which avoids the explosion and vanishing caused by the repeated multiplication in RNN. Bidirectional LSTM integrates two LSTMs (forward and backward) so that it can extract information from the preceding and following context simultaneously; the main integration methods are direct concatenation and weighted summation, and adding nonlinearity can further improve the fit to the data. The training system providing the highest performance is T-LSTM, which is trained on features extracted by the 3D CNN. The T-LSTM solution is also scalable to analyzing multiple <italic>a priori</italic> sequences, and we will further study how to increase its scalability and robustness in the future. In <xref ref-type="table" rid="T2">Table&#x20;2</xref>, the specificity after data augmentation is lower than without augmentation, which may be related to the augmentation parameter settings: augmentation can reduce overfitting, but its effect depends on the methods used. Although the proposed method shows a certain reduction, it remains within a reasonable range. In <xref ref-type="fig" rid="F5">Figures 5</xref>, <xref ref-type="fig" rid="F6">6</xref>, the convergence of the proposed method is slower than that of LSTM at the beginning, but it achieves the best convergence after 35 iterations. This shows that our method can effectively process and analyze data carrying richer temporal information.</p>
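<p>The two BI-LSTM integration strategies mentioned above, direct concatenation and weighted summation, can be sketched directly on the per-step hidden states. The shapes, the blending weight, and the tanh nonlinearity below are illustrative assumptions, not the paper&#x2019;s implementation.</p>

```python
import numpy as np

def merge_bidirectional(h_fwd, h_bwd, mode="concat", alpha=0.5):
    """Combine forward/backward hidden states of shape (T, H).

    The backward pass is assumed already re-aligned so that row t of
    h_bwd corresponds to time step t.
    """
    if mode == "concat":
        # Direct concatenation: output dimension doubles to (T, 2H).
        return np.concatenate([h_fwd, h_bwd], axis=1)
    if mode == "sum":
        # Weighted summation keeps the dimension at (T, H); the tanh
        # supplies the nonlinearity mentioned in the text.
        return np.tanh(alpha * h_fwd + (1.0 - alpha) * h_bwd)
    raise ValueError(f"unknown mode: {mode}")
```

<p>Concatenation preserves both directions&#x2019; full representations at twice the width, whereas weighted summation keeps the downstream layer sizes unchanged at the cost of mixing the two directions.</p>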
<p>However, there are still some limitations. If the time span of the LSTM is very large and the network is very deep, the computation becomes very time-consuming; the network structure is therefore limited in efficiency and scalability. In addition, there is the issue of data size: an LSTM is a neural network and, like any neural network, requires a large amount of training data. Information in a time series must traverse all preceding cells before entering the current processing unit, which produces vanishing gradients; LSTM mitigates but does not completely solve this problem. The method proposed in this article tends to do better on unstable time series with more fixed components because of its inherent ability to adapt quickly to sharp changes in trends. However, it can only make short-term predictions, and long-range prediction may be invalid; this is another limitation of the proposed method. In future work, we will consider how to learn better on small medical data sets, and we will try to improve the robustness and generalization of the algorithm so that the model can be used in more scenarios and environments.</p>
</sec>
<sec id="s6">
<title>6 Conclusion</title>
<p>In this article, we have used and substantially extended LSTM in the 3D spatial&#x2013;temporal domain for the task of modeling 3D longitudinal pulmonary nodule data. The novel 3D CNN and T-LSTM network jointly learn the intraslice structures, the interslice 3D contexts, and the temporal dynamics. Quantitative results with notably higher accuracies than the original RNN are reported, using several metrics for predicting future tumor volumes. Compared with the most recent 2D &#x2b; time deep learning&#x2013;based tumor growth prediction models (<xref ref-type="bibr" rid="B31">Missrie et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B2">Audrey et&#x20;al., 2019</xref>), our new approach works directly in 3D imaging space and incorporates clinical factors in an end-to-end trainable manner. The method can also distinguish benign from malignant pulmonary nodules. Our experiments are conducted on the largest longitudinal lung data set to date (421 patients) and demonstrate the validity of the proposed method, which enables efficient and effective 3D medical image segmentation with only sparse manual annotations. The presented prediction model can potentially enable other medical sequence imaging applications. Vanishing gradients are mitigated by the LSTM module, which can be viewed as a set of multiplexed gates, somewhat like ResNet: because LSTM can bypass some cells and memorize over long steps, it solves the vanishing gradient problem to some extent. This method can provide technical support for processing medical image or bioinformatics data with temporal information in the future.</p>
</sec>
</body>
<back>
<sec id="s7">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>MW contributed to conception and design of the study. XL performed the statistical analysis and wrote the first draft of the manuscript. RA wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.</p>
</sec>
<sec id="s9">
<title>Funding</title>
<p>This work was supported by the National Natural Science Foundation of China (grant numbers 61872261, 61972274).</p>
</sec>
<sec sec-type="COI-statement" id="s10">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Aaron</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>J.&#x20;B.</given-names>
</name>
<name>
<surname>Christin</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Jeffrey</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2018</year>). <source>Neuroscience Learning from Longitudinal Cohort Studies of Alzheimer&#x2019;s Disease: Lessons for Disease-Modifying Drug Programs and an Introduction to the Center for Neurodegeneration and Translational Neuroscience</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Alzheimers &#x26; Dementia Translational Research &#x26; Clinical Interventions</publisher-name>. </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Audrey</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Aberle</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Hsu</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>External Validation and Recalibration of the Brock Model to Predict Probability of Cancer in Pulmonary Nodules Using Nlst Data</article-title>. <source>Thorax</source> <volume>74</volume>. <pub-id pub-id-type="doi">10.1136/thoraxjnl-2018-212413</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Baytas</surname>
<given-names>I. M.</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Patient Subtyping via Time-Aware Lstm Networks</article-title>,&#x201d; in <source>The 23rd ACM SIGKDD International Conference</source>. <pub-id pub-id-type="doi">10.1145/3097983.3097997</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Bodla</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J.&#x20;C.</given-names>
</name>
<name>
<surname>Chellappa</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Deep Heterogeneous Feature Fusion for Template-Based Face Recognition</article-title>,&#x201d; in <conf-name>2017 IEEE Winter Conference on Applications of Computer Vision (WACV)</conf-name>. <pub-id pub-id-type="doi">10.1109/WACV.2017.71</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boudjemaa</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Ouaar</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Oliva</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Fractional L&#xe9;vy Flight Bat Algorithm for Global Optimisation</article-title>. <source>Ijbic</source> <volume>15</volume> (<issue>2</issue>), <fpage>100</fpage>&#x2013;<lpage>112</lpage>. <pub-id pub-id-type="doi">10.1504/ijbic.2020.10028011</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Geng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>A Sharding Scheme Based Many-objective Optimization Algorithm for Enhancing Security in Blockchain-Enabled Industrial Internet of Things</article-title>. <source>IEEE Trans. Ind. Inform.</source> <volume>17</volume> (<issue>1</issue>), <fpage>7650</fpage>&#x2013;<lpage>7658</lpage>. <pub-id pub-id-type="doi">10.1109/tii.2021.3051607</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Multi-objective Evolutionary 3D Face Reconstruction Based on Improved Encoder-Decoder Network</article-title>. <source>Inf. Sci.</source> <volume>581</volume>, <fpage>233</fpage>&#x2013;<lpage>248</lpage>. <pub-id pub-id-type="doi">10.1016/j.ins.2021.09.024</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chandra</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Competition and Collaboration in Cooperative Coevolution of Elman Recurrent Neural Networks for Time-Series Prediction</article-title>. <source>IEEE Trans. Neural Networks Learn. Syst.</source> <volume>26</volume> (<issue>12</issue>), <fpage>1</fpage>. <pub-id pub-id-type="doi">10.1109/tnnls.2015.2404823</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Che</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Purushotham</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Cho</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Sontag</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Recurrent Neural Networks for Multivariate Time Series with Missing Values</article-title>. <source>Sci. Rep.</source> <volume>8</volume> (<issue>1</issue>), <fpage>6085</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-018-24271-9</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>L. F.</given-names>
</name>
<name>
<surname>Darnell</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Dumitrascu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Chivers</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Draugelis</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Sparse Multi-Output Gaussian Processes for Medical Time Series Prediction</article-title>. <source>BMC Med. Inform. Decis. Making</source> <volume>20</volume>. <pub-id pub-id-type="doi">10.1186/s12911-020-1069-4</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Christo</surname>
<given-names>V. R. E.</given-names>
</name>
<name>
<surname>Nehemiah</surname>
<given-names>H. K.</given-names>
</name>
<name>
<surname>Nahato</surname>
<given-names>K. B.</given-names>
</name>
<name>
<surname>Brighty</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kannan</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Computer Assisted Medical Decision-Making System Using Genetic Algorithm and Extreme Learning Machine for Diagnosing Allergic Rhinitis</article-title>. <source>Ijbic</source> <volume>16</volume> (<issue>3</issue>), <fpage>148</fpage>&#x2013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.1504/ijbic.2020.111279</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cui</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Cai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Hybrid many-objective Cuckoo Search Algorithm with L&#xe9;vy and Exponential Distributions</article-title>. <source>Memetic Comp.</source> <volume>12</volume> (<issue>3</issue>), <fpage>251</fpage>&#x2013;<lpage>265</lpage>. <pub-id pub-id-type="doi">10.1007/s12293-020-00308-3</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cui</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Cai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Malicious Code Detection under 5G HetNets Based on a Multi-Objective RBM Model</article-title>. <source>IEEE Netw.</source> <volume>35</volume> (<issue>2</issue>), <fpage>82</fpage>&#x2013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1109/mnet.011.2000331</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>An Effective Improved Co-evolution Ant colony Optimisation Algorithm with Multi-Strategies and its Application</article-title>. <source>Ijbic</source> <volume>16</volume> (<issue>3</issue>), <fpage>158</fpage>&#x2013;<lpage>170</lpage>. <pub-id pub-id-type="doi">10.1504/ijbic.2020.10033314</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Donahue</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Hendricks</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Guadarrama</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Rohrbach</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Venugopalan</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Saenko</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Long-term Recurrent Convolutional Networks for Visual Recognition and Description</article-title>. <source>Elsevier</source>. </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duffy</surname>
<given-names>S. W.</given-names>
</name>
<name>
<surname>Field</surname>
<given-names>J.&#x20;K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Mortality Reduction with Low-Dose Ct Screening for Lung Cancer</article-title>. <source>New Engl. J.&#x20;Med.</source> <volume>382</volume> (<issue>6</issue>). </citation>
</ref>
<ref id="B18">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Fragkiadaki</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Levine</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Felsen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Malik</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <source>Recurrent Network Models for Human Dynamics</source>. <publisher-loc>Santiago, Chile</publisher-loc>: <publisher-name>IEEE</publisher-name> <volume>2</volume>, <fpage>18</fpage>. </citation>
</ref>
<ref id="B19">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>Brain Disease Diagnosis Using Deep Learning Features from Longitudinal MR Images: Second International Joint Conference, Apweb-Waim 2018, macau, china, July 23&#x2013;25, 2018</article-title>,&#x201d; in <source>Web and Big Data</source>, <fpage>327</fpage>&#x2013;<lpage>339</lpage>. <comment>proceedings, part i</comment>. <pub-id pub-id-type="doi">10.1007/978-3-319-96890-2_27</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Grano</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Getting Aspectual-Guo under Control in Mandarin Chinese: An Experimental Investigation</article-title>,&#x201d; in <source>Proceedings of the 30th North American Conference on Chinese Linguistics (NACCL-30)</source>, <volume>Vol. 1</volume>, <fpage>208</fpage>&#x2013;<lpage>215</lpage>. </citation>
</ref>
<ref id="B21">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Graves</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2012</year>). <source>Long Short-Term Memory</source>. <publisher-loc>Berlin Heidelberg</publisher-loc>: <publisher-name>Springer Berlin Heidelberg</publisher-name>. </citation>
</ref>
<ref id="B22">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>Deep Residual Learning for Image Recognition</article-title>,&#x201d; in <conf-name>Proceeding of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>, <conf-loc>Las Vegas, NV, USA</conf-loc>, <conf-date>27-30 June 2016</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>770</fpage>&#x2013;<lpage>778</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2016.90</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Keogh</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Classification of Streaming Time Series under More Realistic Assumptions</article-title>. <source>Data Min Knowl Disc</source> <volume>30</volume> (<issue>2</issue>), <fpage>403</fpage>&#x2013;<lpage>437</lpage>. <pub-id pub-id-type="doi">10.1007/s10618-015-0415-0</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huimei</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Xingquan</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Ying</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Generalizing Long Short-Term Memory Network for Deep Learning from Generic Data</article-title>. <source>ACM Trans. Knowledge Discov. Data (Tkdd)</source> <volume>14</volume> (<issue>2</issue>), <fpage>1</fpage>&#x2013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1145/3366022</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kamnitsas</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Ledig</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Newcombe</surname>
<given-names>V. F. J.</given-names>
</name>
<name>
<surname>Simpson</surname>
<given-names>J.&#x20;P.</given-names>
</name>
<name>
<surname>Kane</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Menon</surname>
<given-names>D. K.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>Efficient Multi-Scale 3D CNN with Fully Connected CRF for Accurate Brain Lesion Segmentation</article-title>. <source>Med. Image Anal.</source> <volume>36</volume>, <fpage>61</fpage>&#x2013;<lpage>78</lpage>. <pub-id pub-id-type="doi">10.1016/j.media.2016.10.004</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khusnuliawati</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Fatichah</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Soelaiman</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Multi-feature Fusion Using Sift and Lebp for finger Vein Recognition</article-title>. <source>TELKOMNIKA (Telecommunication Computing Electronics and Control)</source> <volume>15</volume>, <fpage>478</fpage>. <pub-id pub-id-type="doi">10.12928/telkomnika.v15i1.4443</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kooi</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Karssemeijer</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Classifying Symmetrical Differences and Temporal Change in Mammography Using Deep Neural Networks</article-title>. <source>J.&#x20;Med. Imaging (Bellingham)</source> <volume>4</volume> (<issue>4</issue>), <fpage>044501</fpage>. <pub-id pub-id-type="doi">10.1117/1.JMI.4.4.044501</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koutn&#xed;k</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Greff</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Gomez</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Schmidhuber</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>A Clockwork Rnn</article-title>. <source>Computer Sci.</source>, <fpage>1863</fpage>&#x2013;<lpage>1871</lpage>. </citation>
</ref>
<ref id="B29">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Using Time Series Analysis to Forecast Emergency Patient Arrivals in Ct Department</article-title>,&#x201d; in <conf-name>Proceeding of the 2015&#x20;12th International Conference on Service Systems &#x26; Service Management</conf-name>, <conf-loc>Guangzhou, China</conf-loc>, <conf-date>22-24 June 2015</conf-date>. <publisher-name>IEEE</publisher-name>. <pub-id pub-id-type="doi">10.1109/icsssm.2015.7170134</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X. A.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Automatic Seizure Detection Using Fully Convolutional Nested Lstm</article-title>. <source>Int. J.&#x20;Neural Syst.</source> <volume>30</volume> (<issue>04</issue>), <fpage>1250034</fpage>&#x2013;<lpage>1253520</lpage>. <pub-id pub-id-type="doi">10.1142/S0129065720500197</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Missrie</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Hochhegger</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Zanon</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Capobianco</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>C&#xe9;sar</surname>
<given-names>d. M. N. A.</given-names>
</name>
<name>
<surname>Pereira</surname>
<given-names>M. R.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Small Low-Risk Pulmonary Nodules on Chest Digital Radiography: Can We Predict Whether the Nodule Is Benign?</article-title> <source>Clin. Radiol.</source> <volume>73</volume> (<issue>10</issue>), <fpage>902</fpage>&#x2013;<lpage>906</lpage>. <pub-id pub-id-type="doi">10.1016/j.crad.2018.06.002</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Nagaratnam</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Nagaratnam</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Cheuk</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2018</year>). <source>Lung Cancer in the Elderly</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>. </citation>
</ref>
<ref id="B33">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Oh</surname>
<given-names>D. Y.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>K. J.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Longitudinal Change Detection on Chest X-Rays Using Geometric Correlation Maps</article-title>,&#x201d; in <source>Medical Image Computing and Computer Assisted Intervention &#x2013; MICCAI 2019</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer Nature</publisher-name>, <fpage>748</fpage>&#x2013;<lpage>756</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-32226-7_83</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Onisko</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Druzdzel</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Austin</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>How to Interpret the Results of Medical Time Series Data Analysis: Classical Statistical Approaches versus Dynamic Bayesian Network Modeling</article-title>. <source>J.&#x20;Pathol. Inform.</source> <volume>7</volume> (<issue>1</issue>), <fpage>50</fpage>. <pub-id pub-id-type="doi">10.4103/2153-3539.197191</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qiang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ji</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Automated Lung Nodule Segmentation Using an Active Contour Model Based on Pet/ct Images</article-title>. <source>J.&#x20;Comput. Theor. Nanoscience</source> <volume>12</volume> (<issue>8</issue>), <fpage>1972</fpage>&#x2013;<lpage>1976</lpage>. <pub-id pub-id-type="doi">10.1166/jctn.2015.4216</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ren</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Girshick</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Faster R-Cnn: Towards Real-Time Object Detection with Region Proposal Networks</article-title>. <source>IEEE Trans. Pattern Anal. Mach Intell.</source> <volume>39</volume> (<issue>6</issue>), <fpage>1137</fpage>&#x2013;<lpage>1149</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2016.2577031</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ribli</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Horv&#xe1;th</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Unger</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Pollner</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Csabai</surname>
<given-names>I.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Detecting and Classifying Lesions in Mammograms with Deep Learning</article-title>. <source>Sci. Rep.</source> <volume>8</volume> (<issue>1</issue>), <fpage>4165</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-018-22437-z</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Santeramo</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Withey</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Montana</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>Longitudinal Detection of Radiological Abnormalities with Time-Modulated Lstm</article-title>,&#x201d; in <source>Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support</source>. <publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer</publisher-name>, <fpage>326</fpage>&#x2013;<lpage>333</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-00889-5_37</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shi</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>C. D.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Nonlinear Feature Transformation and Deep Fusion for Alzheimer&#x2019;s Disease Staging Analysis</article-title>. <source>Pattern Recognition</source> <volume>63</volume>, <fpage>487</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2016.09.032</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shinde</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Prasad</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Saboo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kaushick</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Saini</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pal</surname>
<given-names>P. K.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Predictive Markers for Parkinson&#x27;s Disease Using Deep Neural Nets on Neuromelanin Sensitive MRI</article-title>. <source>Neuroimage Clin.</source> <volume>22</volume>, <fpage>101748</fpage>. <pub-id pub-id-type="doi">10.1016/j.nicl.2019.101748</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Taillant</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Avila-Vilchis</surname>
<given-names>J.&#x20;C.</given-names>
</name>
<name>
<surname>Allegrini</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Bricault</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Cinquin</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2004</year>). &#x201c;<article-title>Ct and Mr Compatible Light Puncture Robot: Architectural Design and First Experiments</article-title>,&#x201d; in <source>Medical Image Computing &#x26; Computer-Assisted Intervention-Miccai, International Conference Saint-Malo</source> (<publisher-loc>France</publisher-loc>: <publisher-name>Springer</publisher-name>). <pub-id pub-id-type="doi">10.1007/978-3-540-30136-3_19</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xiao</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Chuntian</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Jun</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Huijie</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Object Classification via Feature Fusion Based Marginalized Kernels</article-title>. <source>IEEE Geosci. Remote Sensing Lett.</source> <volume>12</volume>, <fpage>8</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1109/LGRS.2014.2322953</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Ming</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wen</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Variable-grouping-based Exponential Crossover for Differential Evolution Algorithm</article-title>. <source>Ijbic</source> <volume>15</volume> (<issue>3</issue>), <fpage>147</fpage>&#x2013;<lpage>158</lpage>. <pub-id pub-id-type="doi">10.1504/ijbic.2020.107486</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vanasse</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Survival Neural Networks for Time-To-Event Prediction in Longitudinal Study</article-title>. <source>Knowledge Inf. Syst.</source> <volume>62</volume> (<issue>10</issue>). <pub-id pub-id-type="doi">10.1007/s10115-020-01472-1</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Pulmonary Nodule Detection in Medical Images: A Survey</article-title>. <source>Biomed. Signal Process. Control</source> <volume>43</volume>, <fpage>138</fpage>&#x2013;<lpage>147</lpage>. <pub-id pub-id-type="doi">10.1016/j.bspc.2018.01.011</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Onieva</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Perallos</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Osaba</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Genetic Optimised Serial Hierarchical Fuzzy Classifier for Breast Cancer Diagnosis</article-title>. <source>Ijbic</source> <volume>15</volume> (<issue>3</issue>), <fpage>194</fpage>&#x2013;<lpage>205</lpage>. <pub-id pub-id-type="doi">10.1504/ijbic.2020.107490</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Nominal Property Concepts and Substance Possession in Mandarin Chinese</article-title>. <source>J.&#x20;East. Asian Linguist</source> <volume>29</volume>, <fpage>393</fpage>&#x2013;<lpage>434</lpage>. <pub-id pub-id-type="doi">10.1007/s10831-020-09214-8</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2021</year>). <source>Subjectivity and Nominal Property Concepts in Mandarin Chinese</source>. <publisher-loc>New Jersey, USA</publisher-loc>: <publisher-name>ProQuest Dissertations Publishing</publisher-name>. <comment>[Doctoral dissertation, Indiana University]</comment>. </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A Many-objective Optimization Based Intelligent Intrusion Detection Algorithm for Enhancing Security of Vehicular Networks in 6G</article-title>. <source>IEEE Trans. Veh. Technol.</source> <volume>70</volume> (<issue>6</issue>), <fpage>5234</fpage>&#x2013;<lpage>5243</lpage>. <pub-id pub-id-type="doi">10.1109/tvt.2021.3057074</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Du</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Spectral-Spatial Feature Extraction for Hyperspectral Image Classification: A Dimension Reduction and Deep Learning Approach</article-title>. <source>IEEE Trans. Geosci. Remote Sensing</source> <volume>54</volume> (<issue>8</issue>), <fpage>4544</fpage>&#x2013;<lpage>4554</lpage>. <pub-id pub-id-type="doi">10.1109/tgrs.2016.2543748</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>