<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neuroinform.</journal-id>
<journal-title>Frontiers in Neuroinformatics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neuroinform.</abbrev-journal-title>
<issn pub-type="epub">1662-5196</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fninf.2021.778552</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>DR-IIXRN : Detection Algorithm of Diabetic Retinopathy Based on Deep Ensemble Learning and Attention Mechanism</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Ai</surname> <given-names>Zhuang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1480173/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Huang</surname> <given-names>Xuan</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/801309/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Fan</surname> <given-names>Yuan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/489260/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Feng</surname> <given-names>Jing</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1249397/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Zeng</surname> <given-names>Fanxin</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1313452/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Lu</surname> <given-names>Yaping</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c002"><sup>&#x0002A;</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Research and Development, Sinopharm Genomics Technology Co., Ltd.</institution>, <addr-line>Jiangsu</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Ophthalmology, Beijing Chao-Yang Hospital, Capital Medical University</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Medical Research Center, Beijing Chao-Yang Hospital, Capital Medical University</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<aff id="aff4"><sup>4</sup><institution>Department of Clinical Research Center, Dazhou Central Hospital</institution>, <addr-line>Sichuan</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Yu-Dong Zhang, University of Leicester, United Kingdom</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Gauri Borkhade, Padmashree Dr. D.Y. Patil University, India; Siyuan Lu, University of Leicester, United Kingdom</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Fanxin Zeng <email>fanxinly&#x00040;163.com</email></corresp>
<corresp id="c002">Yaping Lu <email>luyaping&#x00040;sinopharm.com</email></corresp>
<fn fn-type="equal" id="fn001"><p>&#x02020;These authors have contributed equally to this work and share first authorship</p></fn></author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>12</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>15</volume>
<elocation-id>778552</elocation-id>
<history>
<date date-type="received">
<day>17</day>
<month>09</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>11</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Ai, Huang, Fan, Feng, Zeng and Lu.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Ai, Huang, Fan, Feng, Zeng and Lu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Diabetic retinopathy (DR) is one of the common chronic complications of diabetes and the most common blinding eye disease. If not treated in time, it can lead to visual impairment and, in severe cases, blindness. This article therefore proposes an algorithm for detecting DR based on deep ensemble learning and an attention mechanism. First, image samples were preprocessed and enhanced to obtain high-quality image data. Second, to improve the adaptability and accuracy of the detection algorithm, we constructed a holistic detection model, DR-IIXRN, which consists of Inception V3, InceptionResNet V2, Xception, ResNeXt101, and NASNetLarge. For each base classifier, we modified the network model using transfer learning, fine-tuning, and attention mechanisms to improve its ability to detect DR. Finally, a weighted voting algorithm was used to determine which category (normal, mild, moderate, severe, or proliferative DR) each image belonged to. We also tuned the trained network model on hospital data, and real test samples from the hospital confirmed the advantages of the algorithm in detecting DR. Experiments show that, compared with traditional single-network detection algorithms, the AUC, accuracy, and recall of the proposed method improve to 95, 92, and 92%, respectively, which demonstrates the adaptability and correctness of the proposed method.</p></abstract>
<kwd-group>
<kwd>diabetic retinopathy</kwd>
<kwd>image processing</kwd>
<kwd>ensemble learning</kwd>
<kwd>deep learning</kwd>
<kwd>attention mechanism</kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="12"/>
<equation-count count="11"/>
<ref-count count="67"/>
<page-count count="16"/>
<word-count count="10429"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>China now has the largest number of people with diabetes in the world. Studies have found (Li Y. et al., <xref ref-type="bibr" rid="B33">2020</xref>) that the prevalence of diabetes in China increased gradually from 2007 to 2017, reaching 12.8% in 2017, while the estimated prevalence of prediabetes was as high as 35.2%, which indicates that diabetes is an important health problem in China. Diabetic retinopathy (DR) is one of the common chronic complications of diabetes and is also the most common blinding eye disease. In the early stages of the disease, patients may not notice anything unusual, but as the disease progresses, DR can lead to vision impairment and possibly blindness. DR can be broadly classified into five stages: normal, mild, moderate, severe, and proliferative DR. Typically, an experienced physician reviews fundus images to determine the patient&#x00027;s current stage of DR. However, the expertise of doctors varies greatly between regions, so diagnostic accuracy is difficult to guarantee. In remote areas, there may be no qualified doctors at all, making it impossible to detect such cases. Therefore, automated medical image recognition is needed to help diagnose this disease. In the real world, DR datasets are often extremely unbalanced. How to train a medical image recognition model on an unbalanced dataset and still identify true patients has long been a research hotspot (Lian et al., <xref ref-type="bibr" rid="B34">2018</xref>; Maistry et al., <xref ref-type="bibr" rid="B40">2020</xref>).</p>
<p>Transfer learning transfers the parameters of a trained model to a new model to help train the new model (Torrey and Shavlik, <xref ref-type="bibr" rid="B57">2010</xref>; Guo et al., <xref ref-type="bibr" rid="B22">2021</xref>). Transfer learning can significantly improve model performance (Kermany et al., <xref ref-type="bibr" rid="B29">2018</xref>). Lu S. et al. (<xref ref-type="bibr" rid="B38">2021</xref>) proposed a new end-to-end classification system for the novel coronavirus, the neighboring aware graph neural network (NAGNN). To obtain a good image-level representation, it uses transfer learning in the backbone neural network to acquire features, thus accelerating network training. Lu S. Y. et al. (<xref ref-type="bibr" rid="B39">2021</xref>) proposed a new computer-aided diagnostic method for cerebral microbleed detection. First, a 15-layer FeatureNet is trained to extract features from input samples. Second, the structures after the first fully connected layer in FeatureNet are replaced by three randomized neural networks for classification: the Schmidt neural network, the random vector functional-link net, and the extreme learning machine. While these three classifiers are trained, the weights and biases of FeatureNet&#x00027;s early layers are frozen. Finally, the outputs of the three classifiers are integrated through a majority voting mechanism to obtain better classification performance. Narin et al. (<xref ref-type="bibr" rid="B42">2021</xref>) proposed using five pre-trained convolutional neural network (CNN) models (ResNet50, ResNet101, ResNet152, InceptionV3, and Inception-ResNetV2) to detect patients infected with coronavirus pneumonia from chest X-rays. The pre-trained ResNet50 model obtained the highest classification performance. Ardakani et al. (<xref ref-type="bibr" rid="B4">2020</xref>) compared the performance of 10 well-known CNNs (AlexNet, VGG-16, VGG-19, SqueezeNet, GoogleNet, MobileNet-V2, ResNet-18, ResNet-50, ResNet-101, and Xception) in distinguishing COVID-19 from non-COVID-19 infections. Of all the networks, ResNet-101 and Xception performed the best.</p>
<p>Compared with a single algorithm, ensemble learning methods combine the outputs of multiple classifiers and often achieve better performance (Chen et al., <xref ref-type="bibr" rid="B12">2019</xref>; Minetto et al., <xref ref-type="bibr" rid="B41">2019</xref>; Zheng et al., <xref ref-type="bibr" rid="B66">2019</xref>), and CNN-based ensemble learning has been extensively studied in the medical field (Lin et al., <xref ref-type="bibr" rid="B35">2021</xref>). Fu et al. (<xref ref-type="bibr" rid="B19">2018</xref>) proposed a novel disc-aware ensemble network for automatic glaucoma screening, which integrates the deep hierarchical context of the global fundus image and the local optic disc region. The network has four branches: a global image stream, a segmentation guidance network, a local disc region stream, and a disc polar transformation stream. The output probabilities of the branches are fused to give the final screening result. Xiao et al. (<xref ref-type="bibr" rid="B61">2019</xref>) selected six models, DenseNet121, ResNet101, SENet154, VGG16, DeepTEN, and InceptionV4, as base classifiers and averaged the predictions of all models to make the final prediction. Das et al. (<xref ref-type="bibr" rid="B14">2021</xref>) proposed a deep CNN method for detecting COVID-19 patients from chest X-ray images, using multiple state-of-the-art CNN models that make independent predictions after individual training. The models are then combined to predict class values using a new weighted-average ensemble technique. The most common fusion method in ensemble learning averages the output probabilities of the base classifiers to obtain the predicted probability of the final model. Because the base classifiers differ in classification ability, simple averaging cannot make full use of the stronger classifiers. Therefore, constructing a DR detection model based on deep ensemble learning with an attention mechanism raises the following problems.</p>
<list list-type="order">
<list-item><p>How to handle the unbalanced DR dataset; and</p></list-item>
<list-item><p>How to set the weight of each base classifier in ensemble learning.</p></list-item>
</list>
</sec>
<sec id="s2">
<title>2. Related Work</title>
<p>Generally, patients with DR can be classified by either dichotomy (binary classification) or multiclassification (Pires et al., <xref ref-type="bibr" rid="B45">2017</xref>; Araujo et al., <xref ref-type="bibr" rid="B3">2020</xref>; Porwal et al., <xref ref-type="bibr" rid="B46">2020</xref>; Quellec et al., <xref ref-type="bibr" rid="B49">2021</xref>). Dichotomy can only determine whether a patient has DR, while multiclassification can also grade the severity of the disease. Here, we introduce several studies of DR detection algorithms that apply these two classification methods.</p>
<p>Binary classification algorithms for DR are widely applied. Qomariah et al. (<xref ref-type="bibr" rid="B48">2019</xref>) used transfer learning in a CNN to extract features, which a support vector machine (SVM) then used to classify DR. Chakrabarty (<xref ref-type="bibr" rid="B10">2018</xref>) published a deep learning method for the detection of DR. It first converts the images to grayscale, resizes them to 1000 &#x000D7; 1000, scales the pixel values to between 0 and 1, and finally feeds the preprocessed images into a CNN to predict their category. Chakrabarty and Chatterjee reported an offbeat technique for diabetic retinopathy detection using computer vision (Chakrabarty and Chatterjee, <xref ref-type="bibr" rid="B11">2019</xref>). It first preprocesses the images through grayscale conversion, thresholding, resizing, and pixel scaling. The processed images are then input into a CNN to extract features, and finally the features are input into an SVM for classification. Herliana et al. (<xref ref-type="bibr" rid="B24">2019</xref>) applied particle swarm optimization (PSO) to select the best DR features, which were then classified using a neural network. Gautam et al. (<xref ref-type="bibr" rid="B20">2019</xref>) used MATLAB-based image processing to diagnose DR. First, the image is resized to a specified size and converted to grayscale. Then, adaptive histogram equalization is applied and the processed image is binarized with a threshold: pixels whose values exceed the threshold are set to white, and all others are set to black. Finally, the number of white pixels is counted. After several images have been processed, a threshold on the number of white pixels is determined, which is then used to predict the category of an image. For classifying retinopathy vs. non-retinopathy, Roychowdhury et al. (<xref ref-type="bibr" rid="B51">2014</xref>) proposed a computer-aided screening system (DREAM) that combines four machine learning algorithms.</p>
<p>The second approach is to classify DR into multiple categories according to severity. Kanth et al. (<xref ref-type="bibr" rid="B28">2013</xref>) applied &#x0201C;gray scale conversion,&#x0201D; &#x0201C;histogram equalization,&#x0201D; &#x0201C;application of digital filters,&#x0201D; &#x0201C;gradient magnitude segmentation,&#x0201D; and &#x0201C;finally fuzzy c clustering&#x0201D; to extract three features from the white pixels (value &#x0201C;1&#x0201D;) of the binary image, namely their sum, their average, and the sum of exudates, and then used a multi-layer perceptron to classify the images. Zhang et al. (<xref ref-type="bibr" rid="B64">2021</xref>) developed an early fundus abnormality screening system (DeepUWF). First, six image preprocessing techniques were used to enhance the image, and then the image was input into a CNN for classification. Carrera et al. (<xref ref-type="bibr" rid="B9">2017</xref>) first extracted information on blood vessels, microaneurysms, and hard exudates based on morphology, and then transformed this information into 8 features. Finally, the 8 features were input into a support vector machine for classification. Wu and Hu (<xref ref-type="bibr" rid="B60">2019</xref>) used flipping, folding, and contrast adjustment for upsampling; the images were then input to VGG19, ResNet50, and Inception V3 networks pre-trained on the ImageNet dataset for transfer learning, which finally produced the predicted category of each image. Jayakumari et al. (<xref ref-type="bibr" rid="B26">2020</xref>) first normalized the images and then divided the dataset into two categories (a normal class and a disease class comprising mild, moderate, severe, and proliferative DR) for model training. Finally, the category of each image is determined from the model output probability and a set threshold.</p>
<p>The advantages and disadvantages of the two methods are summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Difference between dichotomy and multiclassification in diabetic retinopathy (DR).</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="left"><bold>Advantages</bold></th>
<th valign="top" align="left"><bold>Disadvantages</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Dichotomy</td>
<td valign="top" align="left">High accuracy</td>
<td valign="top" align="left">Prone to under or overtreatment</td>
</tr>
<tr>
<td valign="top" align="left">Multiclassification</td>
<td valign="top" align="left">Gives doctors more accurate staging of the disease to support an optimal treatment plan</td>
<td valign="top" align="left">Images differ only slightly between disease levels, so accuracy is moderate</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In this study, we performed multiclassification for the DR dataset. The main contributions of this article are the following two aspects.</p>
<list list-type="order">
<list-item><p>To deal with the extremely unbalanced DR dataset, the number of images to be added to each minority class is divided equally among the images of that class, and each image is upsampled by three transformations (left-right symmetric transformation, up-down symmetric transformation, and random-angle rotation) to balance the dataset. This reduces the large classification error caused by class imbalance.</p></list-item>
<list-item><p>The voting weight of each network model in the ensemble is set according to the F1 score that the base classifier achieves on the validation set, which improves the detection performance.</p></list-item>
</list>
</sec>
<sec id="s3">
<title>3. Detection Algorithm of DR Based on Deep Ensemble Learning and Attention Mechanism</title>
<sec>
<title>3.1. System Architecture</title>
<p>In this study, we constructed a deep ensemble learning algorithm called DR-IIXRN, whose structure is illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref> and <xref ref-type="table" rid="T9">Algorithm 1</xref>. DR-IIXRN can be divided into three modules: image loading and preprocessing, image enhancement, and model building and prediction.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Flow chart of system architecture.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0001.tif"/>
</fig>
<table-wrap position="float" id="T9">
<label>Algorithm 1</label>
<caption><p>Experimental structure</p></caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td align="left" valign="top">Input: &#x000A0;<italic>Data</italic> (the dataset), <italic>Base</italic>_<italic>classifier</italic>_<italic>list</italic>=[L1, L2, L3, L4, L5].</td>
</tr>
<tr>
<td align="left" valign="top">Output: &#x000A0;Predicted category for each test set sample, <italic>T</italic> = <italic>T</italic><sub>1</sub>, <italic>T</italic><sub>2</sub>, &#x02026;, <italic>T</italic><sub><italic>N</italic></sub></td>
</tr>
<tr>
<td align="left" valign="top">
<list list-type="simple">
<list-item><p>1: &#x000A0;Use <xref ref-type="table" rid="T10">Algorithm 2</xref> to preprocess the images of dataset <italic>Data</italic> and obtain <italic>Data</italic>_<italic>process</italic>.</p></list-item>
<list-item><p>2: &#x000A0;Divide <italic>Data</italic>_<italic>process</italic> into the training set <italic>data</italic>_<italic>train</italic>, the validation set <italic>data</italic>_<italic>valid</italic>, and the test set <italic>data</italic>_<italic>test</italic> in a 3:1:1 ratio.</p></list-item>
<list-item><p>3: &#x000A0;Use <xref ref-type="table" rid="T11">Algorithm 3</xref> to perform image enhancement on the images of the training set <italic>data</italic>_<italic>train</italic> and obtain <italic>data</italic>_<italic>train</italic>_<italic>process</italic>.</p></list-item>
<list-item><p>4: &#x000A0;Use <xref ref-type="table" rid="T12">Algorithm 4</xref> to calculate the balanced F score (F1) of each base classifier and obtain <italic>F</italic>1_<italic>list</italic> and <italic>Test</italic>_<italic>pro</italic>.</p></list-item>
<list-item><p>5: &#x000A0;Calculate the weight value of each base classifier.
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>&#x003BB;</mml:mi><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle displaystyle="true"><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:mstyle><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle="true"><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:mstyle><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>*</mml:mo><mml:mi>n</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<list list-type="simple">
<list-item><p>In this formula, q is the number of base classifiers and n is a parameter that scales the gap between better and worse classifiers.</p></list-item></list></list-item>
<list-item><p>6: &#x000A0;Calculate the final probability values for each category in the test set sample.
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle displaystyle="true"><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>5</mml:mn></mml:mrow></mml:msubsup></mml:mstyle><mml:mi>&#x003BB;</mml:mi><mml:mi>i</mml:mi><mml:mo>*</mml:mo><mml:mi>T</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mtext>_</mml:mtext><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>o</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
<list list-type="simple">
<list-item><p>Here, i indexes the base classifiers.</p></list-item></list></list-item>
<list-item><p>7: &#x000A0;Test Set Selection Process</p>
<list list-type="simple">
<list-item><p>For each test sample, the category with the largest probability value is selected as the final category <italic>T</italic><sub><italic>m</italic></sub>, where m is the index of the sample in the test set.</p></list-item></list></list-item>
</list>
</td>
</tr>
</tbody>
</table>
</table-wrap>
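<p>Steps 5 to 7 of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration of Equations (1) and (2) rather than the authors&#x00027; implementation: the function names, the nested-list layout of the probabilities, and the example F1 values are assumptions made for the sketch.</p>
<preformat><![CDATA[
```python
from typing import List, Sequence


def classifier_weights(f1_list: Sequence[float], n: float) -> List[float]:
    """Equation (1): lambda_i = F1_i / sum(F1) + (F1_i - mean(F1)) * n."""
    q = len(f1_list)                 # number of base classifiers
    total = sum(f1_list)
    mean_f1 = total / q
    return [f1 / total + (f1 - mean_f1) * n for f1 in f1_list]


def weighted_vote(weights: Sequence[float],
                  test_pro: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Equation (2) followed by step 7.

    test_pro[i][m][c] is the probability that base classifier i assigns
    to category c for test sample m (this layout is an assumption).
    """
    num_samples = len(test_pro[0])
    num_classes = len(test_pro[0][0])
    predictions = []
    for m in range(num_samples):
        # Equation (2): fuse the per-classifier probabilities with lambda_i.
        scores = [
            sum(w * probs[m][c] for w, probs in zip(weights, test_pro))
            for c in range(num_classes)
        ]
        # Step 7: pick the category with the largest fused score.
        predictions.append(max(range(num_classes), key=lambda c: scores[c]))
    return predictions


# Hypothetical validation F1 scores for three base classifiers, with n = 2.
weights = classifier_weights([0.8, 0.7, 0.9], n=2.0)
```
]]></preformat>
<p>Because the deviations from the mean F1 sum to zero, the weights produced by Equation (1) always sum to 1. For large n, however, a classifier whose F1 lies well below the mean can receive a negative weight, so n should be chosen with that in mind.</p>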
<table-wrap position="float" id="T10">
<label>Algorithm 2</label>
<caption><p>Image preprocessing</p></caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td align="left" valign="top"><bold>Input</bold>: &#x000A0;sample dataset : <italic>Data</italic>.</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Output</bold>: &#x000A0;processed dataset : <italic>Data</italic>_<italic>process</italic>.</td>
</tr>
<tr>
<td align="left" valign="top">
<list list-type="simple">
<list-item><p>1: &#x000A0;Define the list of stored images after preprocessing:<italic>Data</italic>_<italic>process</italic>=[].</p></list-item>
<list-item><p>2: &#x000A0;<bold>for</bold> <italic>image</italic>&#x02192;<italic>Data</italic> <bold>do</bold></p></list-item>
<list-item><p>3: &#x000A0; Crop every row or column of the image whose pixel values are all below 7.</p></list-item>
<list-item><p>4: &#x000A0; Get the smaller of the image height and the image width.</p></list-item>
<list-item><p>5: &#x000A0; Take the center point of the image as the center of a circle and the smaller of height and width as its diameter. Replace all pixel values outside the circle with 0.</p></list-item>
<list-item><p>6: &#x000A0; Crop every row or column of the image whose pixel values are all below 7.</p></list-item>
<list-item><p>7: &#x000A0; Scale the image to 299 &#x000D7; 299.</p></list-item>
<list-item><p>8: &#x000A0; Add the image to <italic>Data</italic>_<italic>process</italic>.</p></list-item>
<list-item><p>9: &#x000A0;<bold>end for</bold></p></list-item>
<list-item><p>10: &#x000A0;<bold>return</bold> <italic>Data</italic>_<italic>process</italic>.</p></list-item>
</list></td>
</tr>
</tbody>
</table>
</table-wrap>
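<p>The per-image steps of Algorithm 2 can be sketched with NumPy as follows. The sketch rests on several assumptions that are not stated in the algorithm: images are arrays of shape (height, width, channels), a row or column counts as dark when every channel is below the threshold, and the final scaling uses a simple nearest-neighbour resize to 299 by 299 as a stand-in for whatever library resize the authors used. All function names are hypothetical.</p>
<preformat><![CDATA[
```python
import numpy as np


def crop_dark_borders(image: np.ndarray, threshold: int = 7) -> np.ndarray:
    """Steps 3 and 6: drop rows and columns whose pixels are all below threshold."""
    # Brightest channel per pixel, so a pixel is dark only if all channels are dark.
    gray = image.max(axis=2) if image.ndim == 3 else image
    keep_rows = (gray >= threshold).any(axis=1)
    keep_cols = (gray >= threshold).any(axis=0)
    return image[keep_rows][:, keep_cols]


def circular_mask(image: np.ndarray) -> np.ndarray:
    """Steps 4 and 5: zero every pixel outside the centred circle
    whose diameter is min(height, width)."""
    height, width = image.shape[:2]
    radius = min(height, width) / 2.0
    rows, cols = np.ogrid[:height, :width]
    center_r, center_c = (height - 1) / 2.0, (width - 1) / 2.0
    inside = (rows - center_r) ** 2 + (cols - center_c) ** 2 <= radius ** 2
    masked = image.copy()
    masked[~inside] = 0
    return masked


def resize_nearest(image: np.ndarray, size: int = 299) -> np.ndarray:
    """Step 7: nearest-neighbour resize to size x size (assumed stand-in)."""
    height, width = image.shape[:2]
    row_idx = np.arange(size) * height // size
    col_idx = np.arange(size) * width // size
    return image[row_idx][:, col_idx]


def preprocess(image: np.ndarray, size: int = 299) -> np.ndarray:
    """Apply steps 3 to 7 of Algorithm 2 to one image."""
    image = crop_dark_borders(image)
    image = circular_mask(image)
    image = crop_dark_borders(image)
    return resize_nearest(image, size)
```
]]></preformat>
<p>The second crop matters because the circular mask introduces new all-zero rows or columns whenever the first crop leaves a non-square image.</p>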
<table-wrap position="float" id="T11">
<label>Algorithm 3</label>
<caption><p>Image enhancement</p></caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td align="left" valign="top"><bold>Input</bold>: &#x000A0;Training dataset: <italic>data</italic>_<italic>train</italic>.</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Output</bold>: &#x000A0;processed training dataset: <italic>data</italic>_<italic>train</italic>_<italic>process</italic>.</td>
</tr>
<tr>
<td align="left" valign="top">
<list list-type="simple">
<list-item><p>1: &#x000A0;Define the enhanced storage list of images:<italic>data</italic>_<italic>train</italic>_<italic>process</italic>=[].</p></list-item>
<list-item><p>2: &#x000A0;Get a set of images for each category in <italic>data</italic>_<italic>train</italic>:data_train0, data_train1, data_train2, data_train3, data_train4.</p></list-item>
<list-item><p>3: &#x000A0;<italic>data</italic>_<italic>train</italic>_<italic>list</italic>=(data_train1, data_train2, data_train3, data_train4).</p></list-item>
<list-item><p>4: &#x000A0;<bold>for</bold> <italic>data</italic>_<italic>train</italic>_<italic>i</italic>&#x02192;<italic>data</italic>_<italic>train</italic>_<italic>list</italic> <bold>do</bold></p></list-item>
<list-item><p>5: &#x000A0; Calculate the difference between the sample sizes of category <italic>data</italic>_<italic>train</italic>_<italic>i</italic> and category <italic>data</italic>_<italic>train</italic>0 (normal sample): <italic>numSub</italic>.</p></list-item>
<list-item><p>6: &#x000A0; According to <italic>data</italic>_<italic>train</italic>_<italic>i</italic> and <italic>numSub</italic>, calculate the number of images to be upsampled for each image:<italic>numAdd</italic>.</p></list-item>
<list-item><p>7: &#x000A0; <bold>for</bold> <italic>image</italic>&#x02192;<italic>data</italic>_<italic>train</italic>_<italic>i</italic> <bold>do</bold></p></list-item>
<list-item><p>8: &#x000A0; Apply left-right symmetric transformation, up-down symmetric transformation, and random-angle rotation to the image to generate a stack of <italic>numAdd</italic> images.</p></list-item>
<list-item><p>9: &#x000A0; Add the stacked images to <italic>data</italic>_<italic>train</italic>_<italic>process</italic>.</p></list-item>
<list-item><p>10: &#x000A0; <bold>end for</bold></p></list-item>
<list-item><p>11: &#x000A0;<bold>end for</bold></p></list-item>
<list-item><p>12: &#x000A0;<bold>return</bold> <italic>data</italic>_<italic>train</italic>_<italic>process</italic>.</p></list-item>
</list></td>
</tr>
</tbody>
</table>
</table-wrap>
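<p>The upsampling loop of Algorithm 3 can be illustrated with the NumPy sketch below. Several details are assumptions, since the algorithm does not fix them: the per-image count <italic>numAdd</italic> is rounded up, the original image is kept alongside its transformed copies, the three transformations are applied in rotation, and a random multiple of 90 degrees stands in for the random-angle rotation. The function names are hypothetical.</p>
<preformat><![CDATA[
```python
import math
from typing import List

import numpy as np


def images_to_add_per_image(minority_count: int, normal_count: int) -> int:
    """Steps 5 and 6: split the gap to the normal class evenly over the images.

    Rounding up is an assumption; the algorithm does not specify it.
    """
    num_sub = max(normal_count - minority_count, 0)
    return math.ceil(num_sub / minority_count)


def augment(image: np.ndarray, num_add: int,
            rng: np.random.Generator) -> List[np.ndarray]:
    """Step 8: generate num_add transformed copies of one image."""
    transforms = (
        np.fliplr,                                            # left-right flip
        np.flipud,                                            # up-down flip
        # Stand-in for random-angle rotation: 90, 180 or 270 degrees.
        lambda img: np.rot90(img, k=int(rng.integers(1, 4))),
    )
    return [transforms[k % 3](image) for k in range(num_add)]


def upsample_class(images: List[np.ndarray], normal_count: int,
                   rng: np.random.Generator) -> List[np.ndarray]:
    """Steps 4 to 11 for a single minority category."""
    num_add = images_to_add_per_image(len(images), normal_count)
    stacked = []
    for image in images:
        stacked.append(image)          # keeping the original is an assumption
        stacked.extend(augment(image, num_add, rng))
    return stacked
```
]]></preformat>
<p>With 4 images in a minority category and 12 normal images, for example, each image receives 2 transformed copies, which brings the category to 12 images.</p>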
<table-wrap position="float" id="T12">
<label>Algorithm 4</label>
<caption><p>Calculate the base classifier F1 value</p></caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td align="left" valign="top"><bold>Input</bold>: &#x000A0;Base classifier list:<italic>Base</italic>_<italic>classifier</italic>_<italic>list</italic>, training set: <italic>data</italic>_<italic>train</italic>_<italic>process</italic>, validation set:<italic>data</italic>_<italic>valid</italic>, test set: <italic>data</italic>_<italic>test</italic>.</td>
</tr>
<tr>
<td align="left" valign="top"><bold>Output</bold>: &#x000A0;Base classifier F1 value: <italic>F</italic>1_<italic>list</italic>,the initial prediction probability value of the test set sample:<italic>Test</italic>_<italic>pro</italic>.</td>
</tr>
<tr>
<td align="left" valign="top">
<list list-type="simple">
<list-item><p>1: &#x000A0;Define a list of base classifier F1 values: <italic>F</italic>1_<italic>list</italic>=[].</p></list-item>
<list-item><p>2: &#x000A0;Define a list of base classifier models: <italic>Model</italic>_<italic>list</italic>=[].</p></list-item>
<list-item><p>3: &#x000A0;Define a list of initial probability values for the test set samples: <italic>Test</italic>_<italic>pro</italic>=[].</p></list-item>
<list-item><p>4: &#x000A0;<bold>for</bold> <italic>base</italic>_<italic>classifier</italic>&#x02192;<italic>Base</italic>_<italic>classifier</italic>_<italic>list</italic> <bold>do</bold></p></list-item>
<list-item><p>5: &#x000A0; Remove the top layer of base_classifier and load the corresponding weights pretrained on ImageNet to obtain the model: base_model.</p></list-item>
<list-item><p>6: &#x000A0; Add the CBAM attention mechanism module after base_model.</p></list-item>
<list-item><p>7: &#x000A0; Add the classification output layers to obtain the model: model.</p></list-item>
<list-item><p>8: &#x000A0; Train the model on the training set <italic>data</italic>_<italic>train</italic>_<italic>process</italic>, using the validation set <italic>data</italic>_<italic>valid</italic> for validation, while the parameters of base_model are kept frozen.</p></list-item>
<list-item><p>9: &#x000A0; Unfreeze the parameters of base_model and train the model again. When training ends, add the model to <italic>Model</italic>_<italic>list</italic>.</p></list-item>
<list-item><p>10: &#x000A0;<bold>end for</bold></p></list-item>
<list-item><p>11: &#x000A0;<bold>for</bold> <italic>model</italic>&#x02192;<italic>Model</italic>_<italic>list</italic> <bold>do</bold></p></list-item>
<list-item><p>12: &#x000A0; Evaluate the model on the validation set <italic>data</italic>_<italic>valid</italic> to obtain its F1 score, and append the result to <italic>F</italic>1_<italic>list</italic>.</p></list-item>
<list-item><p>13: &#x000A0; Run the model on the test set <italic>data</italic>_<italic>test</italic> to obtain the class prediction probabilities, and append them to <italic>Test</italic>_<italic>pro</italic>.</p></list-item>
<list-item><p>14: &#x000A0;<bold>end for</bold></p></list-item>
<list-item><p>15: &#x000A0;<bold>return</bold> <italic>F</italic>1_<italic>list</italic>, <italic>Test</italic>_<italic>pro</italic>.</p></list-item>
</list></td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>3.2. Dataset</title>
<p>The dataset used in this article comes from the Diabetic Retinopathy Detection competition hosted on the data modeling and data analysis competition platform Kaggle (<ext-link ext-link-type="uri" xlink:href="https://www.kaggle.com/c/diabetic-retinopathy-detection/data">https://www.kaggle.com/c/diabetic-retinopathy-detection/data</ext-link>). It contains 35,126 image samples classified into five categories: normal, mild, moderate, severe, and proliferative DR, with 25,810, 2,443, 5,292, 873, and 708 samples, respectively. The sample distribution is shown in <xref ref-type="fig" rid="F2">Figure 2A</xref>, and a typical example of each category is shown in <xref ref-type="fig" rid="F2">Figure 2C</xref>. The dataset is divided into a training set, a validation set, and a test set at a ratio of 3:1:1, giving 21,074, 7,026, and 7,026 samples, respectively. The sample distribution of the three sets is shown in <xref ref-type="fig" rid="F2">Figure 2B</xref>.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Dataset. <bold>(A)</bold> Sample distribution of the dataset, <bold>(B)</bold> sample distribution of the training, validation, and test sets, and <bold>(C)</bold> an example of each category.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0002.tif"/>
</fig>
</sec>
<sec>
<title>3.3. Image Preprocessing</title>
<p>The image preprocessing steps are illustrated in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<list list-type="order">
<list-item><p>Crop the rows and columns of the image in which all pixel values are below 7. Pixels with values below 7 are essentially black and do not help the subsequent data analysis. Removing them discards as much of the non-eye region as possible, which reduces the time complexity of the network model and increases the robustness of the algorithm. As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, we delete the pixel rows and columns whose values are all below 7 from the image in <xref ref-type="fig" rid="F3">Figure 3A</xref> to obtain the image in <xref ref-type="fig" rid="F3">Figure 3B</xref>.</p></list-item>
<list-item><p>Determine a circle centered at the image center whose diameter is the smaller of the image height and width, and set all pixels outside the circle to 0 to obtain an image in which only the circular eye area is retained. Because fundus images are roughly circular, keeping only the circular part removes redundant information that does not help diabetic retinopathy classification, which can both improve the accuracy of the classification model (network model or base classifier) and reduce the time complexity. Referring to <xref ref-type="fig" rid="F3">Figure 3</xref>, we take the center point of <xref ref-type="fig" rid="F3">Figure 3B</xref> as the center of the circle and the smaller of its height and width as the diameter, and then set the area outside the circle to 0 to obtain the image in <xref ref-type="fig" rid="F3">Figure 3C</xref>.</p></list-item>
<list-item><p>For the image with the circular eye area, again delete the rows and columns in which all pixel values are below 7, and scale the image to 299 &#x000D7; 299 px. Because the diameter of the circle in step (2) is the smaller of the image width and height, some &#x0201C;eyeball&#x0201D; pixels may fall outside the circle and be set to 0, which can leave new rows or columns that are entirely dark. It is therefore necessary to crop such rows and columns again. As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, we delete the pixel rows and columns whose values are all below 7 from the image in <xref ref-type="fig" rid="F3">Figure 3C</xref> to obtain the image in <xref ref-type="fig" rid="F3">Figure 3D</xref>. Both the width and height of the image in <xref ref-type="fig" rid="F3">Figure 3D</xref> are then scaled to 299 px to obtain the image in <xref ref-type="fig" rid="F3">Figure 3E</xref>.</p></list-item>
</list>
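<p>The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this article: the function names are hypothetical, the image is assumed to be an array of shape (height, width) or (height, width, channels), and nearest-neighbour resampling is assumed for the scaling step because the interpolation method is not specified here.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

DARK_THRESHOLD = 7  # rows/columns whose pixels are all below this value are cropped


def crop_dark_borders(img, threshold=DARK_THRESHOLD):
    """Delete every row and column in which all pixel values are below threshold."""
    # For colour images a pixel counts as dark when all of its channels are dark.
    gray = img.max(axis=2) if img.ndim == 3 else img
    keep = gray >= threshold
    rows = keep.any(axis=1)
    cols = keep.any(axis=0)
    return img[rows][:, cols]


def mask_outside_circle(img):
    """Set to 0 all pixels outside the centred circle of diameter min(height, width)."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = min(h, w) / 2.0
    yy, xx = np.ogrid[:h, :w]
    inside = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    out = img.copy()
    out[~inside] = 0
    return out


def resize_nearest(img, size=299):
    """Scale to size x size with nearest-neighbour sampling (assumed method)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def preprocess(img, size=299):
    step_b = crop_dark_borders(img)          # Figure 3A -> 3B
    step_c = mask_outside_circle(step_b)     # Figure 3B -> 3C
    step_d = crop_dark_borders(step_c)       # Figure 3C -> 3D
    return resize_nearest(step_d, size)      # Figure 3D -> 3E
```
]]></preformat>
<p>The second call to <italic>crop_dark_borders</italic> matters when the cropped image is not square: masking with a circle whose diameter equals the shorter side blanks the ends of the longer side, and the second crop removes them before scaling.</p>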
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Schematic diagram of each image processing stage. <bold>(A)</bold> Original fundus image. <bold>(B)</bold> Image obtained by deleting the rows and columns whose pixel values are all below 7. <bold>(C)</bold> Image obtained by taking the center point of B as the center of a circle and the smaller of the height and width as its diameter, and setting the area outside the circle to 0. <bold>(D)</bold> Image obtained from C by again deleting the rows and columns whose pixel values are all below 7. <bold>(E)</bold> Image scaled to the specified size. <bold>(F)</bold> Images obtained by applying the three image transformations to E. <bold>(G)</bold> Images obtained by taking the center point of F as the center of a circle and the smaller of the height and width as its diameter, and setting the area outside the circle to 0.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0003.tif"/>
</fig>
</sec>
<sec>
<title>3.4. Image Enhancement</title>
<sec>
<title>3.4.1. Picture Sampling</title>
<p>First, the number of images to be generated from each image in each diseased class (classes 1, 2, 3, and 4) was calculated according to the following formula.</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>A</mml:mi><mml:mi>d</mml:mi><mml:msub><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mn>3</mml:mn><mml:mo>,</mml:mo><mml:mn>4</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>Add</italic><sub><italic>i</italic></sub> is the number of images to be added per image in class <italic>i</italic>, <italic>N</italic><sub>0</sub> is the number of images of class 0 in the training set, and <italic>N</italic><sub><italic>i</italic></sub> is the number of images of class <italic>i</italic> in the training set.</p>
<p>Second, random upsampling was performed to increase the number of images in each diseased class and obtain an approximately balanced dataset. The upsampling numbers for each category are listed in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Upsampling image number information.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Category</bold></th>
<th valign="top" align="center"><bold>Number of</bold></th>
<th valign="top" align="center"><bold>Number of images</bold></th>
<th valign="top" align="center"><bold>Number of images</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="center"><bold>original images</bold></th>
<th valign="top" align="center"><bold>added per picture</bold></th>
<th valign="top" align="center"><bold>after upsampling</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">0</td>
<td valign="top" align="center">15,472</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">15,472</td>
</tr>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">1,445</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">14,450</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">3,183</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">12,732</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">553</td>
<td valign="top" align="center">26</td>
<td valign="top" align="center">14,931</td>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td valign="top" align="center">421</td>
<td valign="top" align="center">35</td>
<td valign="top" align="center">15,156</td>
</tr>
</tbody>
</table>
</table-wrap>
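<p>As a worked check of Equation (3), the short sketch below recomputes the last two columns of Table 2 from the training-set class sizes. The function names are hypothetical, and rounding the quotient down to a whole number is an assumption; it is the choice that reproduces the tabulated values.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Training-set class sizes, taken from the "original images" column of Table 2.
train_counts = {0: 15472, 1: 1445, 2: 3183, 3: 553, 4: 421}


def images_added_per_image(n_normal, n_class):
    """Equation (3), rounded down to a whole number of images (assumed rounding)."""
    return (n_normal - n_class) // n_class


def upsampling_plan(counts, normal_class=0):
    """Return, per class, the images added per image and the size after upsampling."""
    plan = {}
    for label, n in counts.items():
        if label == normal_class:
            add = 0  # the normal class is left unchanged
        else:
            add = images_added_per_image(counts[normal_class], n)
        # Each original image is kept and contributes `add` new images.
        plan[label] = {"added_per_image": add, "after_upsampling": n * (1 + add)}
    return plan


plan = upsampling_plan(train_counts)
for label, row in plan.items():
    print(label, row["added_per_image"], row["after_upsampling"])
```
]]></preformat>
<p>For class 3, for example, (15,472 &#x02212; 553)/553 &#x02248; 26.98, which is rounded down to 26, so the class grows to 553 &#x000D7; 27 = 14,931 images.</p>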
<p>The image transformation methods used in upsampling are left-right symmetric transformation, up-down symmetric transformation, and random-angle rotation transformation. The left-right symmetric transformation mirrors the image about its vertical centerline. The up-down symmetric transformation mirrors the image about its horizontal centerline. The random-angle rotation transformation rotates the image by a random angle about its center. As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, three sampled images (<xref ref-type="fig" rid="F3">Figure 3F</xref>) can be obtained from the image in <xref ref-type="fig" rid="F3">Figure 3E</xref> by the three transformation methods.</p>
</sec>
<sec>
<title>3.4.2. Deletion of Unnecessary Areas</title>
<p>During the random-angle rotation transformation, some pixels outside the &#x0201C;eyeball&#x0201D; area may be filled with pixel values higher than 7. As shown in the third image in <xref ref-type="fig" rid="F3">Figure 3F</xref>, bright bars appear near the four corners. This step removes these bright pixels. For each upsampled image, a circle is determined with the image center as its center and half of the side length as its radius, and the area outside the circle is set to 0. As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, the unnecessary areas of the three sampled images in <xref ref-type="fig" rid="F3">Figure 3F</xref> were removed, and three upsampled images (<xref ref-type="fig" rid="F3">Figure 3G</xref>) with circular eye areas were obtained.</p>
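<p>The transformations of Section 3.4.1 and the masking step of this section can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the code used in this article: all function names are hypothetical, each new image is assumed to come from one transformation chosen at random, and the rotation is assumed to replicate border pixels into the uncovered corners, which is one way the bright corner pixels described above can arise.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def flip_left_right(img):
    """Mirror the image about its vertical centerline."""
    return img[:, ::-1]


def flip_up_down(img):
    """Mirror the image about its horizontal centerline."""
    return img[::-1, :]


def rotate_about_center(img, angle_deg):
    """Rotate about the image center with nearest-neighbour sampling.

    Source coordinates outside the image are clamped to the border, so edge
    pixels are replicated into the corners (assumed fill behaviour).
    """
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    yy, xx = np.mgrid[:h, :w]
    # Inverse mapping: for each output pixel, locate the source pixel.
    src_x = np.cos(theta) * (xx - cx) + np.sin(theta) * (yy - cy) + cx
    src_y = -np.sin(theta) * (xx - cx) + np.cos(theta) * (yy - cy) + cy
    src_x = np.clip(np.rint(src_x), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(src_y), 0, h - 1).astype(int)
    return img[src_y, src_x]


def mask_outside_circle(img):
    """Set to 0 all pixels outside the centred circle of radius half the side length."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = min(h, w) / 2.0
    yy, xx = np.ogrid[:h, :w]
    out = img.copy()
    out[(yy - cy) ** 2 + (xx - cx) ** 2 > radius ** 2] = 0
    return out


def upsample_image(img, num_add, rng):
    """Generate num_add masked variants of img, one random transformation each."""
    transforms = (
        flip_left_right,
        flip_up_down,
        lambda im: rotate_about_center(im, rng.uniform(0.0, 360.0)),
    )
    stack = []
    for _ in range(num_add):
        transform = transforms[rng.integers(len(transforms))]
        stack.append(mask_outside_circle(transform(img)))
    return stack
```
]]></preformat>
<p>A call such as <italic>upsample_image(image, 9, np.random.default_rng(0))</italic> would produce the nine additional images required for one class 1 image in Table 2; the seed is arbitrary.</p>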
</sec>
</sec>
<sec>
<title>3.5. Introduction to Network Models</title>
<p>We used five top-ranked and widely used architectures pretrained on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset: Inception V3, InceptionResNet V2, Xception, ResNeXt101, and NASNetLarge.</p>
<sec>
<title>3.5.1. Inception V3 Network Model</title>
<p>Since Yann LeCun, often called the father of the CNN, built LeNet5 (LeCun et al., <xref ref-type="bibr" rid="B31">1998</xref>), which started the CNN research boom, many scholars have worked in this field. In the 2014 ImageNet competition, the GoogLeNet (Inception V1) network proposed by Szegedy et al. (<xref ref-type="bibr" rid="B53">2015</xref>) and the VGGNet network proposed by Simonyan and Zisserman (<xref ref-type="bibr" rid="B52">2015</xref>) won first and second place, respectively. GoogLeNet, with a depth of 22 layers, is much smaller than VGGNet in both the number of parameters and the network size. Inception V2 (Ioffe and Szegedy, <xref ref-type="bibr" rid="B37">2015</xref>) is optimized on the basis of GoogLeNet by adding batch normalization (BN), which accelerates model training. Inception V3 (Szegedy et al., <xref ref-type="bibr" rid="B55">2016</xref>) is similar to Inception V2 but additionally introduces &#x0201C;factorization into smaller convolutions,&#x0201D; in which an n &#x000D7; n convolution is replaced by a 1 &#x000D7; n convolution followed by an n &#x000D7; 1 convolution; this greatly reduces the number of parameters. Since its introduction, Inception V3 has been used by a large number of researchers (Dongmei et al., <xref ref-type="bibr" rid="B16">2020</xref>; Li W. et al., <xref ref-type="bibr" rid="B32">2020</xref>). Dong et al. (<xref ref-type="bibr" rid="B15">2020</xref>) proposed a network framework that addresses the complexity and individual differences of cervical cell texture; by combining Inception V3 features with an artificial cell classification algorithm, it improves the accuracy of cervical cell recognition. Liu et al. (<xref ref-type="bibr" rid="B36">2020</xref>) proposed a classification algorithm based on an improved Inception V3 together with a center-loss CNN to improve accuracy on obscured targets.</p>
</sec>
<sec>
<title>3.5.2. InceptionResNet V2 Network Model</title>
<p>InceptionResNet V2 (Szegedy et al., <xref ref-type="bibr" rid="B54">2017</xref>) is a deep network model proposed by Szegedy et al. It combines Inception with the residual connections of ResNet (He et al., <xref ref-type="bibr" rid="B23">2016</xref>), which add shallow features to higher-level features through a separate branch; this enables feature reuse and mitigates the vanishing-gradient problem in deep networks. Since its introduction, InceptionResNet V2 has been used by a large number of researchers (Kamble et al., <xref ref-type="bibr" rid="B27">2019</xref>; Peng et al., <xref ref-type="bibr" rid="B44">2020</xref>). Ferreira et al. (<xref ref-type="bibr" rid="B18">2018</xref>) proposed a deep neural network method for classifying breast cancer histology images using transfer learning with InceptionResNet V2: the added top layer is trained first, and some of the previously frozen feature extraction layers are then fine-tuned. Thomas et al. (<xref ref-type="bibr" rid="B56">2020</xref>) proposed using InceptionResNet V2 as a feature extractor and feeding the extracted features into two different classifiers, a support vector machine and a random forest, to classify vehicle types.</p>
</sec>
<sec>
<title>3.5.3. Xception Network Model</title>
<p>Xception (Chollet, <xref ref-type="bibr" rid="B13">2017</xref>), a deep network model proposed by Chollet in 2017, is an improved version of Inception V3. It replaces the convolution operations of Inception V3 with depthwise separable convolutions, which can improve the performance of the network model to some extent. Since its introduction, Xception has been used by a large number of researchers (Rismiyati et al., <xref ref-type="bibr" rid="B50">2020</xref>; Wu et al., <xref ref-type="bibr" rid="B59">2020</xref>). Farag et al. (<xref ref-type="bibr" rid="B17">2021</xref>) proposed using a residual network and the Xception network for COVID-19 diagnosis, and their results show that both networks, optimized by random search, can achieve good classification results. Yao et al. (<xref ref-type="bibr" rid="B63">2021</xref>) proposed an improved Xception network in which L2 norm and mean regularization are added to the original network, and the classification indicators of the tuned network are greatly improved.</p>
</sec>
<sec>
<title>3.5.4. ResNeXt101 Network Model</title>
<p>ResNeXt101 is a deep network model put forward by Xie et al. (<xref ref-type="bibr" rid="B62">2017</xref>). The study suggests that increasing cardinality is more effective than increasing depth or width: it improves the accuracy of the model without noticeably increasing the number of parameters, and it leaves fewer hyperparameters to set, which facilitates model portability. Since its introduction, ResNeXt has been used by a large number of researchers (Kon&#x000E9; and Boulmane, <xref ref-type="bibr" rid="B30">2018</xref>; Pant et al., <xref ref-type="bibr" rid="B43">2020</xref>). Go et al. (<xref ref-type="bibr" rid="B21">2020</xref>) proposed a visualization-based malware analysis method in which the binary executable files of the original malware are first converted into grayscale images, and a ResNeXt network then classifies the grayscale images. Cao et al. (<xref ref-type="bibr" rid="B8">2021</xref>) used ResNeXt as a backbone network with few parameters and high accuracy, and then applied the k-means&#x0002B;&#x0002B; clustering algorithm to obtain anchor boxes closer to the real boxes, which helped the model carry out regression detection. The accuracy of the improved model increased significantly.</p>
</sec>
<sec>
<title>3.5.5. NASNetLarge Network Model</title>
<p>NASNetLarge is a deep network model proposed by Zoph et al. (<xref ref-type="bibr" rid="B67">2018</xref>). Its network structure is generated automatically by neural architecture search, without the need to design the network model manually. This model can greatly reduce the number of parameters while ensuring accuracy. Since its introduction, NASNet has been used by a large number of researchers (Ahmed et al., <xref ref-type="bibr" rid="B1">2020a</xref>; Bharati et al., <xref ref-type="bibr" rid="B6">2021</xref>). Bakkali et al. (<xref ref-type="bibr" rid="B5">2020</xref>) proposed a cross-modal deep network that captures both the text content and the visual information contained in document images. Image features are extracted with NASNetLarge and text features with BERT, and the cross-modal network greatly improves the classification indicators. Ahmed et al. (<xref ref-type="bibr" rid="B2">2020b</xref>) proposed a framework combining three pretrained networks (Xception, InceptionResNet V2, and NASNetLarge) with Error Correcting Output Codes (ECOC) to solve the classification of skin damage.</p>
</sec>
</sec>
<sec>
<title>3.6. Attention Mechanism</title>
<p>The Convolutional Block Attention Module (CBAM) (Woo et al., <xref ref-type="bibr" rid="B58">2018</xref>) is an attention module for convolutional blocks that combines the spatial and channel dimensions. It can achieve better results than Squeeze-and-Excitation Networks (SENet) (Hu et al., <xref ref-type="bibr" rid="B25">2020</xref>), whose attention mechanism focuses only on the channel dimension. The network structures of SENet and CBAM added into the block are shown in <xref ref-type="fig" rid="F4">Figure 4A</xref>.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Model modification process. <bold>(A)</bold> Diagram of attention mechanism network structure, <bold>(B)</bold> Model training process.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0004.tif"/>
</fig>
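<p>To make the two attention steps concrete, the following NumPy sketch runs a forward pass of CBAM as described by Woo et al. (2018): channel attention from a shared two-layer perceptron applied to the average-pooled and max-pooled channel descriptors, followed by spatial attention from a convolution over the channel-wise average and maximum maps. It is an illustration only. The weights are random placeholders, biases are omitted, and the reduction ratio of 4 and the 7 &#x000D7; 7 kernel are assumed values, not settings reported in this article.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: (H, W, C). Returns one weight in (0, 1) per channel."""
    avg = feat.mean(axis=(0, 1))  # average-pooled descriptor, shape (C,)
    mx = feat.max(axis=(0, 1))    # max-pooled descriptor, shape (C,)

    def shared_mlp(v):
        # Bottleneck MLP with a ReLU hidden layer, shared by both descriptors.
        return np.maximum(v @ w1, 0.0) @ w2

    return sigmoid(shared_mlp(avg) + shared_mlp(mx))


def spatial_attention(feat, kernel):
    """kernel: (k, k, 2). Returns one weight in (0, 1) per spatial position."""
    pooled = np.stack([feat.mean(axis=2), feat.max(axis=2)], axis=2)  # (H, W, 2)
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(pooled, ((pad, pad), (pad, pad), (0, 0)))  # zero padding
    h, w = feat.shape[:2]
    out = np.zeros((h, w))
    # Direct "same" convolution (cross-correlation) over the two pooled maps.
    for i in range(k):
        for j in range(k):
            out += (padded[i:i + h, j:j + w, :] * kernel[i, j, :]).sum(axis=2)
    return sigmoid(out)


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to a feature map."""
    refined = feat * channel_attention(feat, w1, w2)  # broadcast over H and W
    return refined * spatial_attention(refined, kernel)[:, :, None]


# Placeholder feature map and weights (all values are illustrative assumptions).
rng = np.random.default_rng(0)
channels, reduction = 16, 4
feat = rng.standard_normal((8, 8, channels))
w1 = rng.standard_normal((channels, channels // reduction))
w2 = rng.standard_normal((channels // reduction, channels))
kernel = rng.standard_normal((7, 7, 2))
out = cbam(feat, w1, w2, kernel)
```
]]></preformat>
<p>The output has the same shape as the input feature map, which is why such a module can be inserted between a backbone and the output layers without changing either. Dropping the spatial step leaves a channel-only reweighting, the dimension on which SENet focuses.</p>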
</sec>
<sec>
<title>3.7. Model Building</title>
<p>The modeling stages are described in <xref ref-type="fig" rid="F4">Figure 4B</xref>. The base models are Inception V3, InceptionResNet V2, Xception, ResNeXt101, and NASNetLarge. For each base model, the top layer is removed, and the CBAM attention mechanism and the model output layers are added. After the model is constructed, transfer learning is performed in two stages. In the first stage, the pretrained parameters of the base network are kept frozen, and only the parameters of the attention mechanism and the output layers are updated. In the second stage, the parameters of the base network are unfrozen and the model is trained again to obtain a network model suitable for DR.</p>
</sec>
</sec>
<sec id="s4">
<title>4. Experiment</title>
<sec>
<title>4.1. Experimental Conditions</title>
<p>The experiments were run on Linux x86_64 with an NVIDIA Tesla V100 and 16 GB of memory, using Python 3.7.9, TensorFlow 2.3.0, and Keras 2.4.3.</p>
</sec>
<sec>
<title>4.2. Evaluation Criteria</title>
<p>To evaluate the performance of the model, the accuracy, precision, recall, F1-score, and specificity were calculated.</p>
<p>The combinations of predicted outcomes of the classifier and true categories of samples were classified as True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). TP means that the classification model predicts positive samples as positive samples. TN means that the classification model predicts negative samples as negative samples. FP means that the classification model predicts negative samples as positive samples. FN means that the classification model predicts positive samples as negative samples.</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M4"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mi>u</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E5"><label>(5)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E6"><label>(6)</label><mml:math id="M6"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E7"><label>(7)</label><mml:math id="M7"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>S</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>F</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>*</mml:mo><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>*</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E8"><label>(8)</label><mml:math id="M8"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>S</mml:mi><mml:mi>p</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>&#x0201C;Accuracy&#x0201D; is the proportion of correct predictions among all observations. &#x0201C;Precision&#x0201D; is the proportion of correctly predicted positive samples among all positive predictions. &#x0201C;Recall&#x0201D; is the proportion of correctly predicted positive samples among all positive instances in the dataset. &#x0201C;Specificity&#x0201D; is the proportion of correctly identified negative instances among all negative instances. &#x0201C;F1&#x0201D; is the harmonic mean of precision and recall; it ranges from 0 (worst) to 1 (best). AUC, the area under the receiver operating characteristic curve, measures classification performance across threshold settings.</p>
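<p>For a five-class problem, Equations (4)&#x02013;(8) are applied to each class in a one-vs-rest manner. The sketch below, in plain Python, derives TP, TN, FP, and FN for a class from a confusion matrix and evaluates the five metrics. The function names are hypothetical, and returning 0 when a denominator is empty is an assumed convention. The example matrix is also hypothetical: it describes a classifier that assigns every test image to class 0, using the test-set class sizes of 5,175, 493, 1,049, 160, and 149.</p>
<preformat preformat-type="code"><![CDATA[
```python
def one_vs_rest_counts(confusion, k):
    """confusion[t][p] = number of samples of true class t predicted as class p."""
    n = len(confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(confusion[t][k] for t in range(n)) - tp
    total = sum(sum(row) for row in confusion)
    tn = total - tp - fn - fp
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return 0.0 when the denominator is empty (assumed convention)."""
    return num / den if den else 0.0


def class_metrics(confusion, k):
    """Equations (4)-(8) for class k treated as the positive class."""
    tp, tn, fp, fn = one_vs_rest_counts(confusion, k)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "specificity": safe_div(tn, fp + tn),
    }


# Hypothetical classifier that predicts class 0 (No DR) for every test image.
supports = [5175, 493, 1049, 160, 149]
confusion = [[s, 0, 0, 0, 0] for s in supports]
no_dr = class_metrics(confusion, 0)
print(round(no_dr["precision"], 2), round(no_dr["recall"], 2), round(no_dr["f1"], 2))
# prints: 0.74 1.0 0.85
```
]]></preformat>
<p>The example shows why accuracy alone is misleading on an imbalanced dataset: this degenerate classifier is right on 5,175 of 7,026 images, yet its recall for every diseased class is 0 and its specificity for class 0 is 0.</p>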
</sec>
<sec>
<title>4.3. Selection of Experimental Settings</title>
<p>To investigate the effect of different experimental settings in the image processing module and the model building module of the proposed framework, we conducted experiments for the following cases and compared the performance of the model in each case. All experiments used the Inception V3 network and the CBAM attention mechanism to construct the classification model. The results for the different cases are shown in <xref ref-type="fig" rid="F6">Figure 6A</xref> and <xref ref-type="table" rid="T3">Table 3</xref>. According to the test results, Case F achieved the best classification metrics, and it was therefore adopted as the data preprocessing scheme in this article.</p>
<list list-type="order">
<list-item><p>Case A: No pre-processing steps (using raw data).</p></list-item>
<list-item><p>Case B: On the basis of A, weight parameters are added during model classification.</p></list-item>
<list-item><p>Case C: On the basis of A, the images in the training set are upsampled using horizontal translation, vertical translation, horizontal flipping, vertical flipping, and random-angle rotation.</p></list-item>
<list-item><p>Case D: On the basis of A, the images in the training set are upsampled using horizontal flipping, vertical flipping, and random-angle rotation.</p></list-item>
<list-item><p>Case E: On the basis of D, Inception V3 initialized with parameters pretrained on &#x0201C;ImageNet&#x0201D; is used as the network model. Here, the parameters of Inception V3 are frozen, in contrast to the randomly initialized network parameters used in Cases A, B, C, and D.</p></list-item>
<list-item><p>Case F: On the basis of E, the fine-tuning technique is applied.</p></list-item>
</list>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>The influence of different cases on evaluation indexes.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Case</bold></th>
<th valign="top" align="left"><bold>Label</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold>F1-score</bold></th>
<th valign="top" align="center"><bold>Support</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left" rowspan="5">A</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.74</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.85</td>
<td valign="top" align="center">5,175</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">493</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1,049</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">160</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">149</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">B</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.79</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">0.36</td>
<td valign="top" align="center">5,175</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.08</td>
<td valign="top" align="center">0.46</td>
<td valign="top" align="center">0.14</td>
<td valign="top" align="center">493</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.14</td>
<td valign="top" align="center">0.01</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">1,049</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.03</td>
<td valign="top" align="center">0.19</td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">160</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">0.62</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">149</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">C</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.75</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.85</td>
<td valign="top" align="center">5,175</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.11</td>
<td valign="top" align="center">0.01</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">493</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.33</td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">0.11</td>
<td valign="top" align="center">1,049</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.26</td>
<td valign="top" align="center">0.12</td>
<td valign="top" align="center">0.17</td>
<td valign="top" align="center">160</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.26</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.14</td>
<td valign="top" align="center">149</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">D</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">0.94</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">5,175</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.11</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.03</td>
<td valign="top" align="center">493</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.43</td>
<td valign="top" align="center">0.28</td>
<td valign="top" align="center">0.34</td>
<td valign="top" align="center">1,049</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">0.29</td>
<td valign="top" align="center">160</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.21</td>
<td valign="top" align="center">0.3</td>
<td valign="top" align="center">149</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">E</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.83</td>
<td valign="top" align="center">0.92</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">5,175</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">493</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.45</td>
<td valign="top" align="center">0.48</td>
<td valign="top" align="center">1,049</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.46</td>
<td valign="top" align="center">0.32</td>
<td valign="top" align="center">0.37</td>
<td valign="top" align="center">160</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.71</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">0.59</td>
<td valign="top" align="center">149</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">F</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.83</td>
<td valign="top" align="center">0.92</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">5,175</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">493</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.44</td>
<td valign="top" align="center">0.48</td>
<td valign="top" align="center">1,049</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.46</td>
<td valign="top" align="center">0.33</td>
<td valign="top" align="center">0.38</td>
<td valign="top" align="center">160</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.69</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">0.58</td>
<td valign="top" align="center">149</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In Case A, the original dataset is imbalanced: normal samples far outnumber all other types of samples. As a result, the model predicts every sample as normal (a recall of 1 for No DR and 0 for all other categories in <xref ref-type="table" rid="T3">Table 3</xref>) and plays no role in sample classification. To deal with the imbalanced data, we tested the following methods in Cases B, C, and D.</p>
<p>In Case B, a penalty for prediction errors in categories with small sample sizes is added to the model, and the weight parameters for each category are calculated as follows:</p>
<disp-formula id="E9"><label>(9)</label><mml:math id="M9"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>W</mml:mi><mml:mi>e</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>h</mml:mi><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>n</mml:mi><mml:mtext>_</mml:mtext><mml:mi>s</mml:mi><mml:mi>a</mml:mi><mml:mi>m</mml:mi><mml:mi>p</mml:mi><mml:mi>l</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mtext>_</mml:mtext><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mo>*</mml:mo><mml:mi>b</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>u</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>n</italic>_<italic>samples</italic> represents the total number of image samples, <italic>n</italic>_<italic>classes</italic> represents the number of categories, and <italic>bincount</italic>(<italic>y</italic>) represents the sample size of each category in the training set. Weight is the weight corresponding to each category: the smaller the sample size of a category, the higher its weight. Compared with Case A, the detection ability for the categories with smaller sample sizes improved slightly, although the recall for No DR dropped from 1 to 0.23.</p>
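<p>As a minimal illustration of Equation 9, the following Python sketch computes the per-category weights from a list of training labels. The function name <italic>balanced_class_weights</italic> and the toy label counts are hypothetical and are not taken from the experiments; the formula has the same form as the &#x0201C;balanced&#x0201D; class-weight heuristic in scikit-learn.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def balanced_class_weights(labels):
    """Equation 9: weight_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)      # plays the role of bincount(y)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in sorted(counts.items())}


# Hypothetical toy training labels (0 = No DR ... 4 = PDR), not the paper's data.
toy_labels = [0] * 70 + [1] * 8 + [2] * 15 + [3] * 4 + [4] * 3
weights = balanced_class_weights(toy_labels)

for label, weight in weights.items():
    print(label, round(weight, 3))
```
]]></preformat>
<p>With these weights, every category contributes the same total weight (<italic>n</italic>_<italic>samples</italic> divided by <italic>n</italic>_<italic>classes</italic>) to the loss, so the rare categories are no longer dominated by the majority category.</p>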
<p>In Case C, we up-sampled the images by horizontal translation, vertical translation, horizontal rotation, vertical rotation, and random angle rotation, so that the sample size was approximately the same in each category. Compared with Case B, Case C improved the detection of late-stage DR but performed worse for category 1 (Mild DR). Because the improvement for the other categories was significant, Case C showed an overall improvement over Case B.</p>
<p>The two translation transformations, horizontal and vertical, produce images that differ too much from the actual samples, as shown in <xref ref-type="fig" rid="F5">Figure 5</xref>, and may impair classification. We therefore removed these two types of up-sampling in Case D. The classification of the categories with small sample sizes improved significantly in Case D.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Horizontal translations.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0005.tif"/>
</fig>
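<p>The balancing by up-sampling used in Case D can be sketched as follows. This is an illustrative simplification under stated assumptions, not the pipeline used in the experiments: images are represented as nested lists, the &#x0201C;horizontal&#x0201D; and &#x0201C;vertical&#x0201D; rotations are read as mirror flips, a 90-degree rotation stands in for random angle rotation, and all function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random


def flip_horizontal(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the row order (top-bottom flip)."""
    return img[::-1]


def rotate_90(img):
    """Rotate clockwise by 90 degrees (stand-in for random angle rotation)."""
    return [list(row) for row in zip(*img[::-1])]


def upsample_to_balance(samples_by_class, seed=0):
    """Add transformed copies until every class matches the largest class."""
    rng = random.Random(seed)
    transforms = [flip_horizontal, flip_vertical, rotate_90]
    target = max(len(imgs) for imgs in samples_by_class.values())
    balanced = {}
    for label, imgs in samples_by_class.items():
        out = list(imgs)
        while len(out) < target:
            source = rng.choice(imgs)               # pick an original image
            out.append(rng.choice(transforms)(source))
        balanced[label] = out
    return balanced


# Hypothetical toy data: four "No DR" images and one "PDR" image.
toy = {0: [[[1, 2], [3, 4]]] * 4, 4: [[[5, 6], [7, 8]]]}
balanced = upsample_to_balance(toy)
```
]]></preformat>
<p>Translations are deliberately absent from the list of transformations, mirroring the change from Case C to Case D.</p>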
<p>However, in Case D the weight parameters of the model were trained from scratch. Because the sample size is small, the complex network model could not be trained sufficiently. Case E uses the network parameters from the &#x0201C;ImageNet&#x0201D; competition, which were learned from a large amount of data and are well suited to the model. Therefore, the results in Case E improved to some extent compared with those in Case D.</p>
<p>In Case F, unlike Case E, the weight parameters in Inception V3 were set to be modifiable during training. We fine-tuned the parameters so that the model is better suited to the detection of DR. Case F achieved a slight improvement over Case E for Severe DR (F1-score of 0.38 vs. 0.37), while the results for the other categories were nearly unchanged (<xref ref-type="table" rid="T3">Table 3</xref>).</p>
</sec>
<sec>
<title>4.4. Weighted Voting Model</title>
<p>Ensemble learning combines diverse models (henceforth classifiers) to obtain better predictive performance.</p>
<p>This article uses the weighted voting method mainly through the following steps.</p>
<list list-type="order">
<list-item><p>First, obtain the F1_score of each base classifier on the validation set. Then run each base classifier on the test set to obtain Test_pro, the predicted probability of each test sample in each category.</p></list-item>
<list-item><p>Calculate the weight value of each base classifier using the following formula, where F1_list contains the F1 score of each base classifier on the validation set and q is the number of base classifiers. <inline-formula><mml:math id="M10"><mml:mfrac><mml:mrow><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mstyle class="text"><mml:mtext>_</mml:mtext></mml:mstyle><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mstyle class="text"><mml:mtext>_</mml:mtext></mml:mstyle><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></inline-formula> represents the share of each base classifier's F1 score in the total score, and <inline-formula><mml:math id="M11"><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mstyle class="text"><mml:mtext>_</mml:mtext></mml:mstyle><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> represents the mean F1 score of the base classifiers.
<inline-formula><mml:math id="M12"><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mstyle class="text"><mml:mtext>_</mml:mtext></mml:mstyle><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mstyle class="text"><mml:mtext>_</mml:mtext></mml:mstyle><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> represents the difference between each base classifier's F1 score and the mean, and n is a factor that amplifies this difference between the base classifiers.
<disp-formula id="E10"><label>(10)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>&#x003BB;</mml:mi><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:mstyle><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle="true"><mml:msubsup><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:msubsup></mml:mstyle><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>_</mml:mtext><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>q</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>*</mml:mo><mml:mi>n</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p></list-item>
<list-item><p>Calculate the weighted probability of each test sample in each category.
<disp-formula id="E11"><label>(11)</label><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>T</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>&#x003BB;</mml:mi><mml:mi>i</mml:mi><mml:mo>*</mml:mo><mml:mi>T</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mtext>_</mml:mtext><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>o</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p></list-item>
<list-item><p>After obtaining the predicted probability value of each sample in each category, the category with the largest probability value is taken as the predicted category.</p></list-item>
</list>
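<p>The four steps above can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in the experiments: the function names, the validation F1 scores, and the probability values are hypothetical, and summing the weighted probability vectors of Equation 11 over the base classifiers before taking the maximum is an assumption made here.</p>
<preformat preformat-type="code"><![CDATA[
```python
def classifier_weights(f1_list, n):
    """Equation 10: share of the total F1 plus n times the deviation from the mean F1."""
    total = sum(f1_list)
    mean = total / len(f1_list)
    return [f1 / total + (f1 - mean) * n for f1 in f1_list]


def weighted_vote(test_pro, weights):
    """test_pro[i][s][c] is the probability from classifier i for sample s, class c."""
    n_samples = len(test_pro[0])
    n_classes = len(test_pro[0][0])
    predictions = []
    for s in range(n_samples):
        # Equation 11 scales each classifier's probabilities by its weight.
        # Summing the scaled values over classifiers is an assumption of this sketch.
        scores = [
            sum(w * probs[s][c] for w, probs in zip(weights, test_pro))
            for c in range(n_classes)
        ]
        # Step 4: the category with the largest combined value is the prediction.
        predictions.append(max(range(n_classes), key=scores.__getitem__))
    return predictions


# Hypothetical validation F1 scores for three base classifiers.
f1_list = [0.74, 0.76, 0.75]
weights = classifier_weights(f1_list, n=3)

# Hypothetical probabilities: 3 classifiers, 2 test samples, 2 categories.
test_pro = [
    [[0.6, 0.4], [0.3, 0.7]],
    [[0.4, 0.6], [0.2, 0.8]],
    [[0.7, 0.3], [0.6, 0.4]],
]
predictions = weighted_vote(test_pro, weights)
```
]]></preformat>
<p>In this toy setting the classifier with the highest validation F1 score receives the largest weight, and the weights still sum to 1.</p>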
<p>In step 2, the values of n that usefully increase the difference between the base classifiers are generally small. If n is too large, the predicted probability values become exactly the same. We therefore searched for the most appropriate n in two steps. In the first step of the experiment, n ranged from 10 to 100, with the specific values 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100. The experimental results are shown in <xref ref-type="fig" rid="F6">Figure 6E</xref> and <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Comparison of experimental results. <bold>(A)</bold> Influence of different image preprocessing schemes on F1 value. <bold>(B)</bold> Comparison of the performance of DR-IIXRN with the algorithms proposed by Zhao and Bravo. <bold>(C)</bold> Comparison of the performance of DR-IIXRN with the algorithms proposed by Wu. <bold>(D)</bold> Comparison of the performance of DR-IIXRN with the five base classifiers. <bold>(E)</bold> Influence of n values between 10 and 100 in Formula 10 on the evaluation indexes. <bold>(F)</bold> Comparison of the performance of DR-IIXRN with the algorithm proposed by Pratt. <bold>(G)</bold> Influence of n values between 1 and 10 in Formula 10 on the evaluation indexes.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0006.tif"/>
</fig>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>The influence of different values of <italic>n</italic> on the evaluation indexes.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Step</bold></th>
<th valign="top" align="center"><bold><italic>n</italic> value</bold></th>
<th valign="top" align="center"><bold>Accuracy</bold></th>
<th valign="top" align="center"><bold>F1_score</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left" rowspan="10">First step</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0.7904</td>
<td valign="top" align="center">0.7587</td>
</tr>
<tr>
<td valign="top" align="center">20</td>
<td valign="top" align="center">0.7882</td>
<td valign="top" align="center">0.7573</td>
</tr>
<tr>
<td valign="top" align="center">30</td>
<td valign="top" align="center">0.7865</td>
<td valign="top" align="center">0.7564</td>
</tr>
<tr>
<td valign="top" align="center">40</td>
<td valign="top" align="center">0.7846</td>
<td valign="top" align="center">0.7552</td>
</tr>
<tr>
<td valign="top" align="center">50</td>
<td valign="top" align="center">0.7838</td>
<td valign="top" align="center">0.7549</td>
</tr>
<tr>
<td valign="top" align="center">60</td>
<td valign="top" align="center">0.7842</td>
<td valign="top" align="center">0.7552</td>
</tr>
<tr>
<td valign="top" align="center">70</td>
<td valign="top" align="center">0.7838</td>
<td valign="top" align="center">0.755</td>
</tr>
<tr>
<td valign="top" align="center">80</td>
<td valign="top" align="center">0.7832</td>
<td valign="top" align="center">0.7546</td>
</tr>
<tr>
<td valign="top" align="center">90</td>
<td valign="top" align="center">0.7825</td>
<td valign="top" align="center">0.7542</td>
</tr>
<tr>
<td valign="top" align="center">100</td>
<td valign="top" align="center">0.7816</td>
<td valign="top" align="center">0.7536</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="10">Second step</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.7931</td>
<td valign="top" align="center">0.7595</td>
</tr>
<tr>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0.7933</td>
<td valign="top" align="center">0.7599</td>
</tr>
<tr>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.7933</td>
<td valign="top" align="center">0.7602</td>
</tr>
<tr>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.7929</td>
<td valign="top" align="center">0.7601</td>
</tr>
<tr>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0.7921</td>
<td valign="top" align="center">0.7596</td>
</tr>
<tr>
<td valign="top" align="center">6</td>
<td valign="top" align="center">0.7916</td>
<td valign="top" align="center">0.7593</td>
</tr>
<tr>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0.7913</td>
<td valign="top" align="center">0.7591</td>
</tr>
<tr>
<td valign="top" align="center">8</td>
<td valign="top" align="center">0.7913</td>
<td valign="top" align="center">0.7591</td>
</tr>
<tr>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.7913</td>
<td valign="top" align="center">0.7593</td>
</tr>
<tr>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0.7905</td>
<td valign="top" align="center">0.7587</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="fig" rid="F6">Figure 6E</xref> shows that the model evaluation indexes follow a downward trend for n values between 10 and 100, mainly because a large n value makes the predictions for most samples identical. The optimal n therefore lies between 1 and 10. In the second step of the experiment, n ranged from 1 to 10, with the specific values 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. The experimental results are shown in <xref ref-type="fig" rid="F6">Figure 6G</xref> and <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<p><xref ref-type="fig" rid="F6">Figure 6G</xref> shows that when the n value is between 1 and 3, the evaluation index of the model shows an overall upward trend. Increasing the n value within a certain range increases the difference between the base classifiers, thus improving the classification ability of the model. Between 3 and 10, the overall model evaluation index shows a downward trend. As can be seen from Formula 10, when the n value is too large, the model keeps the predicted values of most samples consistent. Therefore, the optimal n value is 3, which widens the gap between the base classifiers and brings deep ensemble learning to its best performance.</p>
<p>We compared the performance of the DR-IIXRN algorithm with the five base classifiers, as shown in <xref ref-type="fig" rid="F6">Figure 6D</xref> and <xref ref-type="table" rid="T5">Table 5</xref>. In <xref ref-type="fig" rid="F6">Figure 6D</xref>, the horizontal coordinates &#x0201C;0,&#x0201D; &#x0201C;1,&#x0201D; &#x0201C;2,&#x0201D; &#x0201C;3,&#x0201D; and &#x0201C;4&#x0201D; represent &#x0201C;No DR,&#x0201D; &#x0201C;Mild DR,&#x0201D; &#x0201C;Moderate DR,&#x0201D; &#x0201C;Severe DR,&#x0201D; and &#x0201C;PDR,&#x0201D; respectively; &#x0201C;A,&#x0201D; &#x0201C;B,&#x0201D; &#x0201C;C,&#x0201D; &#x0201C;D,&#x0201D; &#x0201C;E,&#x0201D; and &#x0201C;F&#x0201D; represent &#x0201C;Inception V3,&#x0201D; &#x0201C;InceptionResNet V2,&#x0201D; &#x0201C;Xception,&#x0201D; &#x0201C;ResNeXt101,&#x0201D; &#x0201C;NASNetLarge,&#x0201D; and &#x0201C;DR-IIXRN,&#x0201D; respectively. The results show that, compared with the individual base classifiers, deep ensemble learning can effectively compensate for the errors of each base classifier and improve the overall accuracy and F1_score.</p>
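<p>A short derivation from Formula 10 may help to show why only small n values are useful. It is offered as a complementary observation under the notation introduced here (S for the sum and m for the mean of F1_list), not as a claim made by the experiments: the weights always sum to 1, so n only redistributes weight between the base classifiers, and a classifier with a below-average F1 score receives a negative weight once n exceeds a threshold.</p>
<preformat preformat-type="code"><![CDATA[
```latex
\begin{align}
S &= \sum_{p=1}^{q} F1\_list[p], \qquad m = \frac{S}{q}, \\
\sum_{i=1}^{q} \lambda_i
  &= \frac{1}{S}\sum_{i=1}^{q} F1\_list[i]
   + n \sum_{i=1}^{q} \bigl(F1\_list[i] - m\bigr)
   = 1 + n\,(S - q\,m) = 1, \\
\lambda_i < 0
  &\iff n > \frac{F1\_list[i]}{S\,\bigl(m - F1\_list[i]\bigr)}
  \quad \text{for } F1\_list[i] < m .
\end{align}
```
]]></preformat>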
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Deep ensemble learning vs. base classifiers.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Case</bold></th>
<th valign="top" align="center"><bold>Accuracy</bold></th>
<th valign="top" align="left"><bold>Label</bold></th>
<th valign="top" align="center"><bold>F1-score</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left" rowspan="5">Inception V3</td>
<td valign="middle" align="left" rowspan="5">0.77</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.87</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.05</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.48</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.38</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.58</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">InceptionResNet V2</td>
<td valign="middle" align="left" rowspan="5">0.78</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.88</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.51</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.42</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.59</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">Xception</td>
<td valign="middle" align="left" rowspan="5">0.78</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.88</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.05</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.5</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.39</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.61</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">ResNeXt101</td>
<td valign="middle" align="left" rowspan="5">0.76</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.87</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.06</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.48</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.41</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.55</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">NASNetLarge</td>
<td valign="middle" align="left" rowspan="5">0.76</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.87</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.07</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.5</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.38</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.51</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="middle" align="left" rowspan="5">DR-IIXRN</td>
<td valign="middle" align="left" rowspan="5">0.79</td>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.89</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.06</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.52</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.43</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.6</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>4.5. Comparison of the DR-IIXRN Algorithm and Other Classification Algorithms</title>
<p>To verify the performance of the DR-IIXRN algorithm, we collected several DR detectors from the literature published in the last 5 years for comparison.</p>
<p>First, the DR-IIXRN algorithm was compared with the algorithm proposed by Harry Pratt (Pratt et al., <xref ref-type="bibr" rid="B47">2016</xref>). Because the data set is seriously imbalanced, both the algorithm in this paper and that of Harry Pratt perform poorly in terms of F1 value for the two categories Mild DR and Severe DR. However, compared with the algorithm proposed by Harry Pratt, the image up-sampling module in this article greatly reduces the effect of data imbalance on the Mild DR and Severe DR categories. Comparing recall, precision, specificity, and F1 value in <xref ref-type="table" rid="T6">Table 6</xref> and <xref ref-type="fig" rid="F6">Figure 6F</xref> shows that the DR-IIXRN algorithm has the better detection capability on most metrics. In <xref ref-type="fig" rid="F6">Figure 6F</xref>, the horizontal coordinates &#x0201C;0,&#x0201D; &#x0201C;1,&#x0201D; &#x0201C;2,&#x0201D; &#x0201C;3,&#x0201D; and &#x0201C;4&#x0201D; stand for &#x0201C;No DR,&#x0201D; &#x0201C;Mild DR,&#x0201D; &#x0201C;Moderate DR,&#x0201D; &#x0201C;Severe DR,&#x0201D; and &#x0201C;PDR,&#x0201D; respectively.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>DR-IIXRN vs. Harry Pratt proposed model.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="middle" align="left" rowspan="3"><bold>DR type</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Recall</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Precision</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Specificity</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>F1_score</bold></th>
</tr>
<tr>
<th valign="top" align="center"><bold>DR-</bold></th>
<th valign="top" align="center"><bold>Harry</bold></th>
<th valign="top" align="center"><bold>DR-</bold></th>
<th valign="top" align="center"><bold>Harry</bold></th>
<th valign="top" align="center"><bold>DR-</bold></th>
<th valign="top" align="center"><bold>Harry</bold></th>
<th valign="top" align="center"><bold>DR-</bold></th>
<th valign="top" align="center"><bold>Harry</bold></th>
</tr>
<tr>
<th valign="top" align="center"><bold>IIXRN</bold></th>
<th valign="top" align="center"><bold>Pratt</bold></th>
<th valign="top" align="center"><bold>IIXRN</bold></th>
<th valign="top" align="center"><bold>Pratt</bold></th>
<th valign="top" align="center"><bold>IIXRN</bold></th>
<th valign="top" align="center"><bold>Pratt</bold></th>
<th valign="top" align="center"><bold>IIXRN</bold></th>
<th valign="top" align="center"><bold>Pratt</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">No DR</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.83</td>
<td valign="top" align="center">0.78</td>
<td valign="top" align="center">0.47</td>
<td valign="top" align="center">0.19</td>
<td valign="top" align="center">0.89</td>
<td valign="top" align="center">0.85</td>
</tr>
<tr>
<td valign="top" align="left">Mild DR</td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Moderate DR</td>
<td valign="top" align="center">0.46</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">0.61</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.93</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.29</td>
</tr>
<tr>
<td valign="top" align="left">Severe DR</td>
<td valign="top" align="center">0.34</td>
<td valign="top" align="center">0.78</td>
<td valign="top" align="center">0.59</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">0.43</td>
<td valign="top" align="center">0.1</td>
</tr>
<tr>
<td valign="top" align="left">PDR</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.44</td>
<td valign="top" align="center">0.72</td>
<td valign="top" align="center">0.32</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.6</td>
<td valign="top" align="center">0.37</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Next, we conducted experiments with the algorithms proposed by Bravo (Bravo and Arbelaez, <xref ref-type="bibr" rid="B7">2017</xref>) and Ziyuan Zhao (Zhao et al., <xref ref-type="bibr" rid="B65">2019</xref>) on the same test set as DR-IIXRN. We evaluated and compared these algorithms using three metrics: average classification accuracy (ACA), macro-averaged F1 (Macro-F1), and micro-averaged F1 (Micro-F1). To obtain ACA, the classification confusion matrix is normalized and the mean of its diagonal entries is taken. Macro-F1 and Micro-F1 summarize performance across multiple classes: Macro-F1 is the arithmetic mean of the class-wise F1-scores, whereas Micro-F1 is the harmonic mean of the precision and recall computed over all samples pooled together.</p>
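<p>The three metrics can be illustrated with a minimal sketch. It assumes that rows of the confusion matrix are true classes and columns are predicted classes, and that the normalization used for ACA is by row; neither detail is stated in the text. The function name and the toy matrix are hypothetical.</p>
<preformat>
```python
import numpy as np


def summarize(cm):
    """Return (ACA, Macro-F1, Micro-F1) for a square confusion matrix.

    Assumption: cm[i, j] counts samples of true class i predicted as
    class j, and every class has at least one true sample.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    true_totals = cm.sum(axis=1)   # samples per true class
    pred_totals = cm.sum(axis=0)   # samples per predicted class

    # ACA: row-normalize, then average the diagonal (mean per-class recall).
    recall = tp / true_totals
    aca = recall.mean()

    # Macro-F1: unweighted mean of the per-class F1 scores.
    precision = np.divide(tp, pred_totals,
                          out=np.zeros_like(tp), where=pred_totals > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    macro_f1 = f1.mean()

    # Micro-F1: pool the counts over all classes, then take the
    # harmonic mean of the pooled precision and recall.
    micro_p = tp.sum() / pred_totals.sum()
    micro_r = tp.sum() / true_totals.sum()
    micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
    return aca, macro_f1, micro_f1


# Toy three-class matrix, for illustration only (not data from this study).
toy_cm = [[8, 2, 0],
          [1, 3, 1],
          [0, 1, 4]]
aca, macro_f1, micro_f1 = summarize(toy_cm)
print(round(aca, 4), round(macro_f1, 4), round(micro_f1, 4))
```
</preformat>
<p>For single-label multi-class problems the pooled precision and recall are equal, so Micro-F1 coincides with overall accuracy, whereas ACA and Macro-F1 weight every class equally regardless of its size.</p>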
<p>In contrast to Bi-ResNet and the network architecture proposed by Bravo, DR-IIXRN builds on network structures that achieved excellent results in the &#x0201C;ImageNet&#x0201D; competition (Inception V3, InceptionResNet V2, Xception, ResNext101, and NASNetLarge) and fine-tunes each of them. Fine-tuning can greatly reduce model training time while preserving accuracy (Kermany et al., <xref ref-type="bibr" rid="B29">2018</xref>). The experimental results are shown in <xref ref-type="table" rid="T7">Table 7</xref> and <xref ref-type="fig" rid="F6">Figure 6B</xref>. Among the compared algorithms, DR-IIXRN achieved the highest ACA (0.6347) and Micro-F1 (0.791), whereas its Macro-F1 (0.51) was below the best competing value (0.5725, BiRA-Net).</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Comparison of different algorithms on the evaluation metrics.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Algorithm</bold></th>
<th valign="top" align="center"><bold>ACA</bold></th>
<th valign="top" align="center"><bold>Macro-F1</bold></th>
<th valign="top" align="center"><bold>Micro-F1</bold></th>
<th valign="top" align="center"><bold>Accuracy</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Bravo</td>
<td valign="top" align="center">0.5051</td>
<td valign="top" align="center">0.5081</td>
<td valign="top" align="center">0.5052</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">Bi-ResNet [Ziyuan Zhao]</td>
<td valign="top" align="center">0.4889</td>
<td valign="top" align="center">0.5503</td>
<td valign="top" align="center">0.4897</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">RA-Net [Ziyuan Zhao]</td>
<td valign="top" align="center">0.4717</td>
<td valign="top" align="center">0.5268</td>
<td valign="top" align="center">0.4724</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">BiRA-Net [Ziyuan Zhao]</td>
<td valign="top" align="center">0.5431</td>
<td valign="top" align="center">0.5725</td>
<td valign="top" align="center">0.5436</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">VGG19 [Yuchen Wu]</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.51</td>
</tr>
<tr>
<td valign="top" align="left">ResNet50 [Yuchen Wu]</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.49</td>
</tr>
<tr>
<td valign="top" align="left">InceptionV3 [Yuchen Wu]</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.61</td>
</tr>
<tr>
<td valign="top" align="left">DR-IIXRN</td>
<td valign="top" align="center">0.6347</td>
<td valign="top" align="center">0.51</td>
<td valign="top" align="center">0.791</td>
<td valign="top" align="center">0.79</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Finally, we compared the accuracy of DR-IIXRN with that of the models proposed by Yuchen Wu (Wu and Hu, <xref ref-type="bibr" rid="B60">2019</xref>). DR-IIXRN uses five commonly used network models at the model-building stage and integrates their outputs, which can effectively compensate for the errors of the individual base classifiers. As <xref ref-type="fig" rid="F6">Figure 6C</xref> and <xref ref-type="table" rid="T7">Table 7</xref> show, DR-IIXRN reaches an accuracy of 0.79, compared with 0.49 to 0.61 for the models proposed by Yuchen Wu.</p>
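<p>How integrating the outputs of several base classifiers can offset individual errors may be sketched with weighted soft voting. This is an illustrative assumption rather than the exact fusion rule of DR-IIXRN: the function name, the weights, and the probabilities below are all hypothetical.</p>
<preformat>
```python
import numpy as np


def weighted_soft_vote(probabilities, weights):
    """Fuse per-model class probabilities into one prediction per sample.

    probabilities: array of shape (n_models, n_samples, n_classes).
    weights: one non-negative weight per model (hypothetical values,
    not the weights computed in the article).
    """
    probs = np.asarray(probabilities, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # make the weights sum to 1
    fused = np.tensordot(w, probs, axes=1)   # shape (n_samples, n_classes)
    return fused.argmax(axis=1), fused


# Three hypothetical base classifiers, two samples, three classes.
toy_probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]],
    [[0.5, 0.4, 0.1], [0.3, 0.2, 0.5]],
]
toy_weights = [0.5, 0.2, 0.3]
labels, fused = weighted_soft_vote(toy_probs, toy_weights)
print(labels.tolist())
```
</preformat>
<p>In this toy case the first model alone would assign the second sample to class 1, but the fused probabilities favor class 2, which shows how a single base classifier can be overruled by the weighted combination.</p>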
</sec>
<sec>
<title>4.6. Experimental Expansion</title>
<p>In this study, 945 fundus images were collected from Beijing Chaoyang Hospital, Capital Medical University. The category distribution of the images is shown in <xref ref-type="table" rid="T8">Table 8</xref>. We randomly selected 133 images as the final test set and used the remaining 812 images to optimize the network parameters. All images were preprocessed in the same way as described earlier in this article. The evaluation metrics for each category of the test set are also listed in <xref ref-type="table" rid="T8">Table 8</xref>, and the accuracy of the different base classifiers and of DR-IIXRN is compared in <xref ref-type="fig" rid="F7">Figure 7</xref>. The table and figure show that the proposed deep ensemble learning algorithm largely integrates the advantages of the individual base classifiers: the AUC, accuracy, and recall of the proposed method reach 95, 92, and 92%, respectively. They also show that a network model trained on public data sets can achieve better results after being optimized on actual clinical data.</p>
<table-wrap position="float" id="T8">
<label>Table 8</label>
<caption><p>Category distribution of the data set and evaluation metrics for each category of the test set.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="center" colspan="2"><bold>Subset and metric</bold></th>
<th valign="top" align="center"><bold>0</bold></th>
<th valign="top" align="center"><bold>1</bold></th>
<th valign="top" align="center"><bold>2</bold></th>
<th valign="top" align="center"><bold>3</bold></th>
<th valign="top" align="center"><bold>4</bold></th>
<th valign="top" align="center"><bold>Overall</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left" rowspan="7">Test set</td>
<td valign="top" align="left">Precision</td>
<td valign="top" align="center">0.9</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.81</td>
<td valign="top" align="center">0.96</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.93</td>
</tr>
<tr>
<td valign="top" align="left">Recall</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.69</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.89</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.92</td>
</tr>
<tr>
<td valign="top" align="left">F1</td>
<td valign="top" align="center">0.93</td>
<td valign="top" align="center">0.81</td>
<td valign="top" align="center">0.89</td>
<td valign="top" align="center">0.92</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">0.92</td>
</tr>
<tr>
<td valign="top" align="left">Specificity</td>
<td valign="top" align="center">0.98</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.91</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.97</td>
</tr>
<tr>
<td valign="top" align="left">Accuracy</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.69</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.89</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.92</td>
</tr>
<tr>
<td valign="top" align="left">AUC</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.84</td>
<td valign="top" align="center">0.94</td>
<td valign="top" align="center">0.94</td>
<td valign="top" align="center">0.99</td>
<td valign="top" align="center">0.95</td>
</tr>
<tr>
<td valign="top" align="left">Support</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">16</td>
<td valign="top" align="center">36</td>
<td valign="top" align="center">27</td>
<td valign="top" align="center">34</td>
<td valign="top" align="center">133</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Data set</td>
<td valign="top" align="left">Support</td>
<td valign="top" align="center">159</td>
<td valign="top" align="center">99</td>
<td valign="top" align="center">289</td>
<td valign="top" align="center">166</td>
<td valign="top" align="center">232</td>
<td valign="top" align="center">945</td>
</tr>
</tbody>
</table>
</table-wrap>
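<p>The Overall column of Table 8 appears consistent with a support-weighted average of the per-category values. This is an inference from the reported numbers, not a statement of how the column was produced. A minimal check using the precision, recall, and support rows of the test set (the helper name is hypothetical):</p>
<preformat>
```python
import numpy as np

# Per-category values copied from the test-set rows of Table 8.
support = np.array([20, 16, 36, 27, 34])
precision = np.array([0.90, 1.00, 0.81, 0.96, 1.00])
recall = np.array([0.95, 0.69, 0.97, 0.89, 0.97])


def weighted_mean(values, weights):
    """Average the per-category values, weighting each by its support."""
    return float(np.average(values, weights=weights))


overall_precision = weighted_mean(precision, support)
overall_recall = weighted_mean(recall, support)
print(round(overall_precision, 2), round(overall_recall, 2))
```
</preformat>
<p>Rounded to two decimals, the weighted means are 0.93 for precision and 0.92 for recall, matching the Overall entries in the table.</p>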
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Accuracy of the different base classifiers and DR-IIXRN.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-778552-g0007.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="conclusions" id="s5">
<title>5. Conclusion</title>
<p>In this article, we propose DR-IIXRN, a DR detection algorithm based on deep ensemble learning and an attention mechanism. The experimental results show that the image preprocessing and image enhancement modules can address the low classification accuracy caused by the uneven distribution of the original data. Through weight calculation, the DR-IIXRN deep ensemble learning algorithm lets each base classifier contribute more effectively to DR detection on actual hospital data. The comparison with other detectors confirms the accuracy and performance of the algorithm. In the future, we plan to test the robustness of the algorithm on more actual hospital samples.</p>
</sec>
<sec sec-type="data-availability" id="s6">
<title>Data Availability Statement</title>
<p>The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request. Correspondence and requests for data materials should be addressed to Yaping Lu (<email>luyaping&#x00040;sinopharm.com</email>).</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>ZA, YF, and YL: conceptualization and writing (original draft preparation). ZA and XH: methodology. ZA and FZ: writing (review and editing). YL and FZ: project administration. XH and JF: data collection. YL, XH, and FZ: funding acquisition. All authors read and agreed to the published version of the manuscript.</p>
</sec>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>This work was supported in part by the National Natural Science Foundation of China to FZ (81902861) and XH (32000485) and in part by the Sinopharm Genomics Technology Co., Ltd. The funder Sinopharm Genomics Technology Co., Ltd. had the following involvement with the study: design, collection, analysis, interpretation of data, the writing of this article and the decision to submit it for publication.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>ZA, YF, and YL are employees of Sinopharm Genomics Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack><p>The numerical calculations in this study were performed on the supercomputing system of the Supercomputing Center of Wuhan University.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ahmed</surname> <given-names>S. A. A.</given-names></name> <name><surname>Yanikoglu</surname> <given-names>B.</given-names></name> <name><surname>Goksu</surname> <given-names>O.</given-names></name> <name><surname>Aptoula</surname> <given-names>E</given-names></name></person-group> (<year>2020a</year>). <article-title>&#x0201C;Skin lesion classification with deep CNN ensembles,&#x0201D;</article-title> in <source>2020 28th Signal Processing and Communications Applications Conference (SIU)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1109/SIU49456.2020.9302125</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ahmed</surname> <given-names>S. A. A.</given-names></name> <name><surname>Yanikoglu</surname> <given-names>B.</given-names></name> <name><surname>Zor</surname> <given-names>C.</given-names></name> <name><surname>Awais</surname> <given-names>M.</given-names></name> <name><surname>Kittler</surname> <given-names>J</given-names></name></person-group> (<year>2020b</year>). <article-title>&#x0201C;Skin Lesion Diagnosis with Imbalanced ECOC Ensembles,&#x0201D;</article-title> in <source>International Conference on Machine Learning, Optimization, and Data Science</source> (<publisher-loc>Springer</publisher-loc>), <fpage>292</fpage>&#x02013;<lpage>303</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-64580-9_25</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Araujo</surname> <given-names>T.</given-names></name> <name><surname>Aresta</surname> <given-names>G.</given-names></name> <name><surname>Mendonsa</surname> <given-names>L.</given-names></name> <name><surname>Penas</surname> <given-names>S.</given-names></name> <name><surname>Maia</surname> <given-names>C.</given-names></name> <name><surname>Carneiro</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>DR|GRADUATE: uncertainty-aware deep learning-based diabetic retinopathy grading in eye fundus images</article-title>. <source>Med. Image Anal.</source> <volume>63</volume>:<fpage>101715</fpage>. <pub-id pub-id-type="doi">10.1016/j.media.2020.101715</pub-id><pub-id pub-id-type="pmid">32434128</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ardakani</surname> <given-names>A. A.</given-names></name> <name><surname>Kanafi</surname> <given-names>A. R.</given-names></name> <name><surname>Acharya</surname> <given-names>U. R.</given-names></name> <name><surname>Khadem</surname> <given-names>N.</given-names></name> <name><surname>Mohammadi</surname> <given-names>A</given-names></name></person-group> (<year>2020</year>). <article-title>Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: results of 10 convolutional neural networks</article-title>. <source>Comput. Biol. Med.</source> <volume>121</volume>:<fpage>103795</fpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2020.103795</pub-id><pub-id pub-id-type="pmid">32568676</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bakkali</surname> <given-names>S.</given-names></name> <name><surname>Ming</surname> <given-names>Z.</given-names></name> <name><surname>Coustaty</surname> <given-names>M.</given-names></name> <name><surname>Rusinol</surname> <given-names>M</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Cross-modal deep networks for document image classification,&#x0201D;</article-title> in <source>2020 IEEE International Conference on Image Processing (ICIP)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>2556</fpage>&#x02013;<lpage>2560</lpage>. <pub-id pub-id-type="doi">10.1109/ICIP40778.2020.9191268</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bharati</surname> <given-names>S.</given-names></name> <name><surname>Podder</surname> <given-names>P.</given-names></name> <name><surname>Mondal</surname> <given-names>M. R. H.</given-names></name> <name><surname>Gandhi</surname> <given-names>N</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Optimized NASNet for Diagnosis of COVID-19 from Lung CT Images,&#x0201D;</article-title> in <source>International Conference on Intelligent Systems Design and Applications</source> (<publisher-loc>Springer</publisher-loc>), <fpage>647</fpage>&#x02013;<lpage>656</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-71187-0_59</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bravo</surname> <given-names>M. A.</given-names></name> <name><surname>Arbelaez</surname> <given-names>P. A</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Automatic diabetic retinopathy classification,&#x0201D;</article-title> in <source>13th International Conference on Medical Information Processing and Analysis</source> (<publisher-loc>International Society for Optics and Photonics</publisher-loc>), p. 105721. <pub-id pub-id-type="doi">10.1117/12.2285939</pub-id><pub-id pub-id-type="pmid">17645476</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cao</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Ren</surname> <given-names>W</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Improved YOLOv3 model based on ResNeXt for target detection,&#x0201D;</article-title> in <source>2021 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>709</fpage>&#x02013;<lpage>713</lpage>. <pub-id pub-id-type="doi">10.1109/ICPICS52425.2021.9524125</pub-id><pub-id pub-id-type="pmid">33669229</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carrera</surname> <given-names>E. V.</given-names></name> <name><surname>Gonzalez</surname> <given-names>A.</given-names></name> <name><surname>Carrera</surname> <given-names>R</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Automated detection of diabetic retinopathy using SVM,&#x0201D;</article-title> in <source>2017 IEEE XXIV International Conference on Electronics, Electrical Engineering and Computing (INTERCON)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1109/INTERCON.2017.8079692</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chakrabarty</surname> <given-names>N.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;A deep learning method for the detection of diabetic retinopathy,&#x0201D;</article-title> in <source>2018 5th IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1109/UPCON.2018.8596839</pub-id><pub-id pub-id-type="pmid">34591173</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chakrabarty</surname> <given-names>N.</given-names></name> <name><surname>Chatterjee</surname> <given-names>S</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;An Offbeat Technique for Diabetic Retinopathy Detection using Computer Vision,&#x0201D;</article-title> in <source>2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1109/ICCCNT45670.2019.8944633</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Gu</surname> <given-names>Y.</given-names></name> <name><surname>He</surname> <given-names>X.</given-names></name> <name><surname>Ghamisi</surname> <given-names>P.</given-names></name> <name><surname>Jia</surname> <given-names>X</given-names></name></person-group> (<year>2019</year>). <article-title>Deep learning ensemble for hyperspectral image classification</article-title>. <source>IEEE J. Select. Top. Appl. Earth Observat. Remote Sens.</source> <volume>12</volume>, <fpage>1882</fpage>&#x02013;<lpage>1897</lpage>. <pub-id pub-id-type="doi">10.1109/JSTARS.2019.2915259</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chollet</surname> <given-names>F.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Xception: Deep learning with depthwise separable convolutions,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>1251</fpage>&#x02013;<lpage>1258</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2017.195</pub-id><pub-id pub-id-type="pmid">33338542</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Das</surname> <given-names>A. K.</given-names></name> <name><surname>Ghosh</surname> <given-names>S.</given-names></name> <name><surname>Thunder</surname> <given-names>S.</given-names></name> <name><surname>Dutta</surname> <given-names>R.</given-names></name> <name><surname>Agarwal</surname> <given-names>S.</given-names></name> <name><surname>Chakrabarti</surname> <given-names>A</given-names></name></person-group> (<year>2021</year>). <article-title>Automatic COVID-19 detection from X-ray images using ensemble learning with convolutional neural network</article-title>. <source>Pattern Anal. Appl.</source> <volume>24</volume>, <fpage>1</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1007/s10044-021-00970-4</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dong</surname> <given-names>N.</given-names></name> <name><surname>Zhao</surname> <given-names>L.</given-names></name> <name><surname>Wu</surname> <given-names>C.</given-names></name> <name><surname>Chang</surname> <given-names>J</given-names></name></person-group> (<year>2020</year>). <article-title>Inception v3 based cervical cell classification combined with artificially extracted features</article-title>. <source>Appl. Soft Comput.</source> <volume>93</volume>:<fpage>106311</fpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2020.106311</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Dongmei</surname> <given-names>Z.</given-names></name> <name><surname>Ke</surname> <given-names>W.</given-names></name> <name><surname>Hongbo</surname> <given-names>G.</given-names></name> <name><surname>Peng</surname> <given-names>W.</given-names></name> <name><surname>Chao</surname> <given-names>W.</given-names></name> <name><surname>Shaofeng</surname> <given-names>P</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Classification and identification of citrus pests based on inceptionv3 convolutional neural network and migration learning,&#x0201D;</article-title> in <source>2020 International Conference on Internet of Things and Intelligent Applications (ITIA)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1109/ITIA50152.2020.9312359</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Farag</surname> <given-names>H. H.</given-names></name> <name><surname>Said</surname> <given-names>L. A.</given-names></name> <name><surname>Rizk</surname> <given-names>M. R.</given-names></name> <name><surname>Ahmed</surname> <given-names>M. A. E</given-names></name></person-group> (<year>2021</year>). <article-title>Hyperparameters optimization for resnet and xception in the purpose of diagnosing COVID-19</article-title>. <source>J. Intell. Fuzzy Syst.</source> <volume>41</volume>, <fpage>1</fpage>&#x02013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.3233/JIFS-210925</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferreira</surname> <given-names>C. A.</given-names></name> <name><surname>Melo</surname> <given-names>T.</given-names></name> <name><surname>Sousa</surname> <given-names>P.</given-names></name> <name><surname>Meyer</surname> <given-names>M. I.</given-names></name> <name><surname>Shakibapour</surname> <given-names>E.</given-names></name> <name><surname>Costa</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Classification of breast cancer histology images through transfer learning using a pre-trained inception resnet v2</article-title>. <source>Lecture Notes Comput. Sci.</source> <volume>10882</volume>, <fpage>763</fpage>&#x02013;<lpage>770</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-93000-8_86</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>H.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>C.</given-names></name> <name><surname>Wong</surname> <given-names>D. W. K.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Disc-aware ensemble network for glaucoma screening from fundus image</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>37</volume>, <fpage>2493</fpage>&#x02013;<lpage>2501</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2018.2837012</pub-id><pub-id pub-id-type="pmid">29994764</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gautam</surname> <given-names>A. S.</given-names></name> <name><surname>Jana</surname> <given-names>S. K.</given-names></name> <name><surname>Dutta</surname> <given-names>M. P</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Automated diagnosis of diabetic retinopathy using image processing for non-invasive biomedical application,&#x0201D;</article-title> in <source>2019 International Conference on Intelligent Computing and Control Systems (ICCS)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>809</fpage>&#x02013;<lpage>812</lpage>. <pub-id pub-id-type="doi">10.1109/ICCS45141.2019.9065446</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Go</surname> <given-names>J. H.</given-names></name> <name><surname>Jan</surname> <given-names>T.</given-names></name> <name><surname>Mohanty</surname> <given-names>M.</given-names></name> <name><surname>Patel</surname> <given-names>O. P.</given-names></name> <name><surname>Puthal</surname> <given-names>D.</given-names></name> <name><surname>Prasad</surname> <given-names>M</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Visualization approach for malware classification with ResNeXt,&#x0201D;</article-title> in <source>2020 IEEE Congress on Evolutionary Computation (CEC)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1109/CEC48606.2020.9185490</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.-D.</given-names></name> <name><surname>Lu</surname> <given-names>S.</given-names></name> <name><surname>Lu</surname> <given-names>Z</given-names></name></person-group> (<year>2021</year>). <article-title>A survey on machine learning in COVID-19 diagnosis</article-title>. <source>Comput. Model. Eng. Sci.</source> <volume>129</volume>, <fpage>23</fpage>&#x02013;<lpage>71</lpage>. <pub-id pub-id-type="doi">10.32604/cmes.2021.017679</pub-id><pub-id pub-id-type="pmid">33769936</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>He</surname> <given-names>K.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Ren</surname> <given-names>S.</given-names></name> <name><surname>Sun</surname> <given-names>J</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Deep residual learning for image recognition,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>770</fpage>&#x02013;<lpage>778</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2016.90</pub-id><pub-id pub-id-type="pmid">32166560</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Herliana</surname> <given-names>A.</given-names></name> <name><surname>Arifin</surname> <given-names>T.</given-names></name> <name><surname>Susanti</surname> <given-names>S.</given-names></name> <name><surname>Hitmah</surname> <given-names>A. B</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Feature selection of diabetic retinopathy disease using particle swarm optimization and neural network,&#x0201D;</article-title> in <source>2018 6th International Conference on Cyber and IT Service Management (CITSM)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1109/CITSM.2018.8674295</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>J.</given-names></name> <name><surname>Shen</surname> <given-names>L.</given-names></name> <name><surname>Albanie</surname> <given-names>S.</given-names></name> <name><surname>Sun</surname> <given-names>G.</given-names></name> <name><surname>Wu</surname> <given-names>E</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Squeeze-and-excitation networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vol. 8</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>7132</fpage>&#x02013;<lpage>7141</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2019.2913372</pub-id><pub-id pub-id-type="pmid">31034408</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Jayakumari</surname> <given-names>C.</given-names></name> <name><surname>Lavanya</surname> <given-names>V.</given-names></name> <name><surname>Sumesh</surname> <given-names>E. P</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Automated diabetic retinopathy detection and classification using imagenet convolution neural network using fundus images,&#x0201D;</article-title> in <source>2020 International Conference on Smart Electronics and Communication (ICOSEC)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>577</fpage>&#x02013;<lpage>582</lpage>. <pub-id pub-id-type="doi">10.1109/ICOSEC49089.2020.9215270</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kamble</surname> <given-names>R. M.</given-names></name> <name><surname>Kokare</surname> <given-names>M.</given-names></name> <name><surname>Chan</surname> <given-names>G. C.</given-names></name> <name><surname>Perdomo</surname> <given-names>O.</given-names></name> <name><surname>Gonzalez</surname> <given-names>F. A.</given-names></name> <name><surname>Muller</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>&#x0201C;Automated diabetic macular edema (DME) analysis using fine tuning with inception-resnet-v2 on OCT images,&#x0201D;</article-title> in <source>2018 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>442</fpage>&#x02013;<lpage>446</lpage>. <pub-id pub-id-type="doi">10.1109/IECBES.2018.8626616</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kanth</surname> <given-names>S.</given-names></name> <name><surname>Jaiswal</surname> <given-names>A.</given-names></name> <name><surname>Kakkar</surname> <given-names>M</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Identification of different stages of Diabetic Retinopathy using artificial neural network,&#x0201D;</article-title> in <source>2013 Sixth International Conference on Contemporary Computing (IC3)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>479</fpage>&#x02013;<lpage>484</lpage>. <pub-id pub-id-type="doi">10.1109/IC3.2013.6612243</pub-id><pub-id pub-id-type="pmid">19205991</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kermany</surname> <given-names>D. S.</given-names></name> <name><surname>Goldbaum</surname> <given-names>M.</given-names></name> <name><surname>Cai</surname> <given-names>W.</given-names></name> <name><surname>Valentim</surname> <given-names>C. C. S.</given-names></name> <name><surname>Liang</surname> <given-names>H.</given-names></name> <name><surname>Baxter</surname> <given-names>S. L.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Identifying medical diagnoses and treatable diseases by image-based deep learning</article-title>. <source>Cell</source> <volume>172</volume>, <fpage>1122</fpage>&#x02013;<lpage>1131</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2018.02.010</pub-id><pub-id pub-id-type="pmid">29474911</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kon&#x000E9;</surname> <given-names>I.</given-names></name> <name><surname>Boulmane</surname> <given-names>L</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Hierarchical ResNeXt models for breast cancer histology image classification,&#x0201D;</article-title> in <source>International Conference Image Analysis and Recognition</source> (<publisher-loc>Springer</publisher-loc>), <fpage>796</fpage>&#x02013;<lpage>803</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-93000-8_90</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>LeCun</surname> <given-names>Y.</given-names></name> <name><surname>Bottou</surname> <given-names>L.</given-names></name> <name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Haffner</surname> <given-names>P</given-names></name></person-group> (<year>1998</year>). <article-title>Gradient-based learning applied to document recognition</article-title>. <source>Proc. IEEE</source> <volume>86</volume>, <fpage>2278</fpage>&#x02013;<lpage>2324</lpage>. <pub-id pub-id-type="doi">10.1109/5.726791</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>W.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Wu</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Jia</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Classification of high-spatial-resolution remote sensing scenes method using transfer learning and deep convolutional neural network</article-title>. <source>IEEE J. Select. Top. Appl. Earth Observ. Remote Sens.</source> <volume>13</volume>, <fpage>1986</fpage>&#x02013;<lpage>1995</lpage>. <pub-id pub-id-type="doi">10.36227/techrxiv.11694966.v1</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Teng</surname> <given-names>D.</given-names></name> <name><surname>Shi</surname> <given-names>X.</given-names></name> <name><surname>Qin</surname> <given-names>G.</given-names></name> <name><surname>Qin</surname> <given-names>Y.</given-names></name> <name><surname>Quan</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Prevalence of diabetes recorded in mainland China using 2018 diagnostic criteria from the American Diabetes Association: national cross sectional study</article-title>. <source>BMJ</source> <volume>369</volume>:<fpage>m997</fpage>. <pub-id pub-id-type="doi">10.1136/bmj.m997</pub-id><pub-id pub-id-type="pmid">32345662</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lian</surname> <given-names>C.</given-names></name> <name><surname>Liang</surname> <given-names>Y.</given-names></name> <name><surname>Kang</surname> <given-names>R.</given-names></name> <name><surname>Xiang</surname> <given-names>Y</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Deep convolutional neural networks for diabetic retinopathy classification,&#x0201D;</article-title> in <source>Proceedings of the 2nd International Conference on Advances in Image Processing</source>, <fpage>68</fpage>&#x02013;<lpage>72</lpage>. <pub-id pub-id-type="doi">10.1145/3239576.3239589</pub-id><pub-id pub-id-type="pmid">34033015</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>T. H.</given-names></name> <name><surname>Jhang</surname> <given-names>J. Y.</given-names></name> <name><surname>Huang</surname> <given-names>C. R.</given-names></name> <name><surname>Tsai</surname> <given-names>Y. C.</given-names></name> <name><surname>Cheng</surname> <given-names>H. C.</given-names></name> <name><surname>Sheu</surname> <given-names>B. S</given-names></name></person-group> (<year>2021</year>). <article-title>Deep ensemble feature network for gastric section classification</article-title>. <source>IEEE J. Biomed. Health Inform.</source> <volume>25</volume>, <fpage>77</fpage>&#x02013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2020.2999731</pub-id><pub-id pub-id-type="pmid">32750926</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>K.</given-names></name> <name><surname>Yu</surname> <given-names>S.</given-names></name> <name><surname>Liu</surname> <given-names>S</given-names></name></person-group> (<year>2020</year>). <article-title>An improved inceptionv3 network for obscured ship classification in remote sensing images</article-title>. <source>IEEE J. Select. Top. Appl. Earth Observ. Remote Sens.</source> <volume>13</volume>, <fpage>4738</fpage>&#x02013;<lpage>4747</lpage>. <pub-id pub-id-type="doi">10.1109/JSTARS.2020.3017676</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ioffe</surname> <given-names>S.</given-names></name> <name><surname>Szegedy</surname> <given-names>C</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Batch normalization: accelerating deep network training by reducing internal covariate shift,&#x0201D;</article-title> in <source>International Conference On Machine Learning</source> (<publisher-loc>PMLR</publisher-loc>), <fpage>448</fpage>&#x02013;<lpage>456</lpage>.</citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lu</surname> <given-names>S.</given-names></name> <name><surname>Zhu</surname> <given-names>Z.</given-names></name> <name><surname>Gorriz</surname> <given-names>J. M.</given-names></name> <name><surname>Wang</surname> <given-names>S. H.</given-names></name> <name><surname>Zhang</surname> <given-names>Y. D</given-names></name></person-group> (<year>2021</year>). <article-title>NAGNN: classification of COVID-19 based on neighboring aware representation from deep graph neural network</article-title>. <source>Int. J. Intell. Syst.</source> <fpage>1</fpage>&#x02013;<lpage>27</lpage>. <pub-id pub-id-type="doi">10.1002/int.22686</pub-id><pub-id pub-id-type="pmid">25855820</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lu</surname> <given-names>S. Y.</given-names></name> <name><surname>Nayak</surname> <given-names>D. R.</given-names></name> <name><surname>Wang</surname> <given-names>S. H.</given-names></name> <name><surname>Zhang</surname> <given-names>Y. D</given-names></name></person-group> (<year>2021</year>). <article-title>A cerebral microbleed diagnosis method via featurenet and ensembled randomized neural networks</article-title>. <source>Appl. Soft Comput.</source> <volume>109</volume>:<fpage>107567</fpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2021.107567</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maistry</surname> <given-names>A.</given-names></name> <name><surname>Pillay</surname> <given-names>A.</given-names></name> <name><surname>Jembere</surname> <given-names>E</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Improving the accuracy of diabetes retinopathy image classification using augmentation,&#x0201D;</article-title> in <source>Conference of the South African Institute of Computer Scientists and Information Technologists 2020</source>, <fpage>134</fpage>&#x02013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1145/3410886.3410914</pub-id></citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Minetto</surname> <given-names>R.</given-names></name> <name><surname>Segundo</surname> <given-names>M. P.</given-names></name> <name><surname>Sarkar</surname> <given-names>S</given-names></name></person-group> (<year>2019</year>). <article-title>Hydra: an ensemble of convolutional neural networks for geospatial land classification</article-title>. <source>IEEE Trans. Geosci. Remote Sens.</source> <volume>57</volume>, <fpage>6530</fpage>&#x02013;<lpage>6541</lpage>. <pub-id pub-id-type="doi">10.1109/TGRS.2019.2906883</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Narin</surname> <given-names>A.</given-names></name> <name><surname>Kaya</surname> <given-names>C.</given-names></name> <name><surname>Pamuk</surname> <given-names>Z</given-names></name></person-group> (<year>2021</year>). <article-title>Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks</article-title>. <source>Pattern Anal. Appl.</source> <volume>24</volume>, <fpage>1</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1007/s10044-021-00984-y</pub-id><pub-id pub-id-type="pmid">33994847</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pant</surname> <given-names>G.</given-names></name> <name><surname>Yadav</surname> <given-names>D. P.</given-names></name> <name><surname>Gaur</surname> <given-names>A</given-names></name></person-group> (<year>2020</year>). <article-title>ResNeXt convolution neural network topology-based deep learning model for identification and classification of Pediastrum</article-title>. <source>Algal Res.</source> <volume>48</volume>:<fpage>101932</fpage>. <pub-id pub-id-type="doi">10.1016/j.algal.2020.101932</pub-id></citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peng</surname> <given-names>S.</given-names></name> <name><surname>Huang</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>W.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Fang</surname> <given-names>W</given-names></name></person-group> (<year>2020</year>). <article-title>More trainable inception-resnet for face recognition</article-title>. <source>Neurocomputing</source> <volume>411</volume>, <fpage>9</fpage>&#x02013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2020.05.022</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pires</surname> <given-names>R.</given-names></name> <name><surname>Avila</surname> <given-names>S.</given-names></name> <name><surname>Jelinek</surname> <given-names>H. F.</given-names></name> <name><surname>Wainer</surname> <given-names>J.</given-names></name> <name><surname>Valle</surname> <given-names>E.</given-names></name> <name><surname>Rocha</surname> <given-names>A</given-names></name></person-group> (<year>2017</year>). <article-title>Beyond lesion-based diabetic retinopathy: a direct approach for referral</article-title>. <source>IEEE J. Biomed. Health Inform.</source> <volume>21</volume>, <fpage>193</fpage>&#x02013;<lpage>200</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2015.2498104</pub-id><pub-id pub-id-type="pmid">26561488</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Porwal</surname> <given-names>P.</given-names></name> <name><surname>Pachade</surname> <given-names>S.</given-names></name> <name><surname>Kokare</surname> <given-names>M.</given-names></name> <name><surname>Deshmukh</surname> <given-names>G.</given-names></name> <name><surname>Son</surname> <given-names>J.</given-names></name> <name><surname>Bae</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>IDRiD: diabetic retinopathy-segmentation and grading challenge</article-title>. <source>Med. Image Anal</source>. <volume>59</volume>:<fpage>101561</fpage>. <pub-id pub-id-type="doi">10.1016/j.media.2019.101561</pub-id><pub-id pub-id-type="pmid">31671320</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pratt</surname> <given-names>H.</given-names></name> <name><surname>Coenen</surname> <given-names>F.</given-names></name> <name><surname>Broadbent</surname> <given-names>D. M.</given-names></name> <name><surname>Harding</surname> <given-names>S. P.</given-names></name> <name><surname>Zheng</surname> <given-names>Y</given-names></name></person-group> (<year>2016</year>). <article-title>Convolutional neural networks for diabetic retinopathy</article-title>. <source>Proc. Comput. Sci</source>. <volume>90</volume>, <fpage>200</fpage>&#x02013;<lpage>205</lpage>. <pub-id pub-id-type="doi">10.1016/j.procs.2016.07.014</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Qomariah</surname> <given-names>D. U. N.</given-names></name> <name><surname>Tjandrasa</surname> <given-names>H.</given-names></name> <name><surname>Fatichah</surname> <given-names>C</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Classification of diabetic retinopathy and normal retinal images using CNN and SVM,&#x0201D;</article-title> in <source>2019 12th International Conference on Information &#x00026; Communication Technology and System (ICTS)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>152</fpage>&#x02013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.1109/ICTS.2019.8850940</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Quellec</surname> <given-names>G.</given-names></name> <name><surname>Hajj</surname> <given-names>H. A.</given-names></name> <name><surname>Lamard</surname> <given-names>M.</given-names></name> <name><surname>Conze</surname> <given-names>P. H.</given-names></name> <name><surname>Massin</surname> <given-names>P.</given-names></name> <name><surname>Cochener</surname> <given-names>B</given-names></name></person-group> (<year>2021</year>). <article-title>Explanatory artificial intelligence for diabetic retinopathy diagnosis</article-title>. <source>Med. Image Anal.</source> <volume>72</volume>:<fpage>102118</fpage>. <pub-id pub-id-type="doi">10.1016/j.media.2021.102118</pub-id><pub-id pub-id-type="pmid">34126549</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rismiyati</surname> <given-names>E.</given-names></name> <name><surname>Khadijah</surname> <given-names>S. N.</given-names></name> <name><surname>Shiddiq</surname> <given-names>I. N</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Xception architecture transfer learning for garbage classification,&#x0201D;</article-title> in <source>2020 4th International Conference on Informatics and Computational Sciences (ICICoS)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1109/ICICoS51170.2020.9299017</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roychowdhury</surname> <given-names>S.</given-names></name> <name><surname>Koozehanani</surname> <given-names>D. D.</given-names></name> <name><surname>Parhi</surname> <given-names>K</given-names></name></person-group> (<year>2014</year>). <article-title>DREAM: Diabetic retinopathy analysis using machine learning</article-title>. <source>IEEE J. Biomed. Health Inform</source>. <volume>18</volume>, <fpage>1717</fpage>&#x02013;<lpage>1728</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2013.2294635</pub-id><pub-id pub-id-type="pmid">25192577</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Simonyan</surname> <given-names>K.</given-names></name> <name><surname>Zisserman</surname> <given-names>A</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Very deep convolutional networks for large-scale image recognition,&#x0201D;</article-title> in <source>3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings</source> (<publisher-loc>ICLR</publisher-loc>), arXiv: 1409.1556.</citation>
</ref>
<ref id="B53">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Szegedy</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Jia</surname> <given-names>Y.</given-names></name> <name><surname>Sermanet</surname> <given-names>P.</given-names></name> <name><surname>Reed</surname> <given-names>S.</given-names></name> <name><surname>Anguelov</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>&#x0201C;Going deeper with convolutions,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>9</lpage>.</citation>
</ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Szegedy</surname> <given-names>C.</given-names></name> <name><surname>Ioffe</surname> <given-names>S.</given-names></name> <name><surname>Vanhoucke</surname> <given-names>V.</given-names></name> <name><surname>Alemi</surname> <given-names>A. A</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Inception-v4, inception-ResNet and the impact of residual connections on learning,&#x0201D;</article-title> in <source>Thirty-first AAAI Conference on Artificial Intelligence</source> (<publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>AAAI</publisher-name>), <fpage>4278</fpage>&#x02013;<lpage>4284</lpage>.</citation>
</ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Szegedy</surname> <given-names>C.</given-names></name> <name><surname>Vanhoucke</surname> <given-names>V.</given-names></name> <name><surname>Ioffe</surname> <given-names>S.</given-names></name> <name><surname>Shlens</surname> <given-names>J.</given-names></name> <name><surname>Wojna</surname> <given-names>Z</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Rethinking the inception architecture for computer vision,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>2818</fpage>&#x02013;<lpage>2826</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2016.308</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Thomas</surname> <given-names>A.</given-names></name> <name><surname>Harikrishnan</surname> <given-names>P. M.</given-names></name> <name><surname>Palanisamy</surname> <given-names>P.</given-names></name> <name><surname>Gopi</surname> <given-names>V. P</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Moving vehicle candidate recognition and classification using inception-resnet-v2,&#x0201D;</article-title> in <source>2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>467</fpage>&#x02013;<lpage>472</lpage>. <pub-id pub-id-type="doi">10.1109/COMPSAC48688.2020.0-207</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Torrey</surname> <given-names>L.</given-names></name> <name><surname>Shavlik</surname> <given-names>J</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;Transfer learning,&#x0201D;</article-title> in <source>Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques</source>, eds E. S. Olivas, J. D. M. Guerrero, M. Martinez-Sober, J. R. Magdalena-Benedito, and A. J. S. Lopez (<publisher-loc>Hershey, PA</publisher-loc>: <publisher-name>IGI Global</publisher-name>), <fpage>242</fpage>&#x02013;<lpage>264</lpage>. <pub-id pub-id-type="doi">10.4018/978-1-60566-766-9.ch011</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Woo</surname> <given-names>S.</given-names></name> <name><surname>Park</surname> <given-names>J.</given-names></name> <name><surname>Lee</surname> <given-names>J. Y.</given-names></name> <name><surname>Kweon</surname> <given-names>I. S</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;CBAM: Convolutional block attention module,&#x0201D;</article-title> in <source>Proceedings of the European Conference on Computer Vision (ECCV)</source>, <fpage>3</fpage>&#x02013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-01234-2_1</pub-id></citation>
</ref>
<ref id="B59">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>X.</given-names></name> <name><surname>Liu</surname> <given-names>R.</given-names></name> <name><surname>Yang</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>Z</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;An xception based convolutional neural network for scene image classification with transfer learning,&#x0201D;</article-title> in <source>2020 2nd International Conference on Information Technology and Computer Application (ITCA)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>262</fpage>&#x02013;<lpage>267</lpage>. <pub-id pub-id-type="doi">10.1109/ITCA52113.2020.00063</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Hu</surname> <given-names>Z</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Recognition of diabetic retinopathy based on transfer learning,&#x0201D;</article-title> in <source>2019 IEEE 4th International Conference on Cloud Computing and Big Data Analysis (ICCCBDA)</source> (<publisher-loc>IEEE</publisher-loc>), <fpage>398</fpage>&#x02013;<lpage>401</lpage>. <pub-id pub-id-type="doi">10.1109/ICCCBDA.2019.8725801</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Xiao</surname> <given-names>F.</given-names></name> <name><surname>Kuang</surname> <given-names>R.</given-names></name> <name><surname>Ou</surname> <given-names>Z.</given-names></name> <name><surname>Xiong</surname> <given-names>B</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;DeepMen: multi-model ensemble network for b-lymphoblast cell classification,&#x0201D;</article-title> in <source>ISBI 2019 C-NMC Challenge: Classification in Cancer Cell Imaging</source> (<publisher-loc>Springer</publisher-loc>), <fpage>83</fpage>&#x02013;<lpage>93</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-15-0798-4_9</pub-id></citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xie</surname> <given-names>S.</given-names></name> <name><surname>Girshick</surname> <given-names>R.</given-names></name> <name><surname>Dollar</surname> <given-names>P.</given-names></name> <name><surname>Tu</surname> <given-names>Z.</given-names></name> <name><surname>He</surname> <given-names>K</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Aggregated residual transformations for deep neural networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>1497</fpage>&#x02013;<lpage>1500</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2017.634</pub-id><pub-id pub-id-type="pmid">31141794</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>N.</given-names></name> <name><surname>Ni</surname> <given-names>F.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Luo</surname> <given-names>J.</given-names></name> <name><surname>Sung</surname> <given-names>W. K.</given-names></name> <name><surname>Luo</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>L2mxception: an improved xception network for classification of peach diseases</article-title>. <source>Plant Methods</source> <volume>17</volume>, <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1186/s13007-021-00736-3</pub-id><pub-id pub-id-type="pmid">33794942</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Zhao</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Zhong</surname> <given-names>J.</given-names></name> <name><surname>Yi</surname> <given-names>Z</given-names></name></person-group> (<year>2021</year>). <article-title>DeepUWF: an automated ultra-wide-field fundus screening system via deep learning</article-title>. <source>IEEE J. Biomed. Health Inform</source>. <volume>25</volume>, <fpage>2988</fpage>&#x02013;<lpage>2996</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2020.3046771</pub-id><pub-id pub-id-type="pmid">33361011</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>K.</given-names></name> <name><surname>Hao</surname> <given-names>X.</given-names></name> <name><surname>Tian</surname> <given-names>J.</given-names></name> <name><surname>Chua</surname> <given-names>M. C. H.</given-names></name> <name><surname>Chen</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>&#x0201C;BiRA-net: bilinear attention net for diabetic retinopathy grading,&#x0201D;</article-title> in <source>2019 IEEE International Conference on Image Processing (ICIP)</source> (<publisher-name>IEEE</publisher-name>), <fpage>1385</fpage>&#x02013;<lpage>1389</lpage>. <pub-id pub-id-type="doi">10.1109/ICIP.2019.8803074</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zheng</surname> <given-names>J.</given-names></name> <name><surname>Cao</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>B.</given-names></name> <name><surname>Zhen</surname> <given-names>X.</given-names></name> <name><surname>Su</surname> <given-names>X.</given-names></name></person-group> (<year>2019</year>). <article-title>Deep ensemble machine for video classification</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst.</source> <volume>30</volume>, <fpage>553</fpage>&#x02013;<lpage>565</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2018.2844464</pub-id><pub-id pub-id-type="pmid">29994406</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zoph</surname> <given-names>B.</given-names></name> <name><surname>Vasudevan</surname> <given-names>V.</given-names></name> <name><surname>Shlens</surname> <given-names>J.</given-names></name> <name><surname>Le</surname> <given-names>Q. V.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Learning transferable architectures for scalable image recognition,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>, <fpage>8697</fpage>&#x02013;<lpage>8710</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2018.00907</pub-id><pub-id pub-id-type="pmid">27295638</pub-id></citation></ref>
</ref-list>
</back>
</article>