<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Med.</journal-id>
<journal-title>Frontiers in Medicine</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Med.</abbrev-journal-title>
<issn pub-type="epub">2296-858X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmed.2021.741407</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Medicine</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Red Blood Cell Classification Based on Attention Residual Feature Pyramid Network</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Song</surname> <given-names>Weiqing</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1407163/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Huang</surname> <given-names>Pu</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Jing</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Shen</surname> <given-names>Yajuan</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname> <given-names>Jian</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Lu</surname> <given-names>Zhiming</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c002"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/714175/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Li</surname> <given-names>Dengwang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c003"><sup>&#x0002A;</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Liu</surname> <given-names>Danhua</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c004"><sup>&#x0002A;</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University</institution>, <addr-line>Jinan</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Clinical Laboratory, Shandong Provincial Hospital Affiliated to Shandong First Medical University</institution>, <addr-line>Jinan</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Jun Feng, Northwest University, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Nurlan Dauletbayev, McGill University, Canada; Karim A. Mohamed Al-Jashamy, SEGi University, Malaysia</p></fn>
<corresp id="c004">&#x0002A;Correspondence: Danhua Liu <email>liudanhua&#x00040;sdnu.edu.cn</email></corresp>
<corresp id="c003">Dengwang Li <email>dengwang&#x00040;sdnu.edu.cn</email></corresp>
<corresp id="c002">Zhiming Lu <email>luzhiming&#x00040;sdu.edu.cn</email></corresp>
<corresp id="c001">Pu Huang <email>pu.wong&#x00040;139.com</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Precision Medicine, a section of the journal Frontiers in Medicine</p></fn></author-notes>
<pub-date pub-type="epub">
<day>14</day>
<month>12</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>8</volume>
<elocation-id>741407</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>07</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>11</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Song, Huang, Wang, Shen, Zhang, Lu, Li and Liu.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Song, Huang, Wang, Shen, Zhang, Lu, Li and Liu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract><p>Clinically, red blood cell abnormalities are closely related to tumor diseases, red blood cell diseases, internal medicine, and other diseases. Red blood cell classification is the key to detecting red blood cell abnormalities. Traditionally, red blood cell classification is done manually by doctors, which requires a lot of manpower and produces subjective results. This paper proposes an Attention-based Residual Feature Pyramid Network (ARFPN) to classify 14 types of red blood cells to assist the diagnosis of related diseases. The model performs classification directly on the entire red blood cell image. Meanwhile, a spatial attention mechanism and a channel attention mechanism are combined with residual units to improve the expression of category-related features and achieve accurate feature extraction. Besides, the RoI align method is used to reduce the loss of spatial symmetry and improve classification accuracy. Five hundred and eighty-eight red blood cell images are used to train and verify the effectiveness of the proposed method. The Channel Attention Residual Feature Pyramid Network (C-ARFPN) model achieves an mAP of 86%; the Channel and Spatial Attention Residual Feature Pyramid Network (CS-ARFPN) model achieves an mAP of 86.9%. The experimental results indicate that our method can classify more red blood cell types and better adapt to the needs of doctors, thus reducing doctors&#x00027; workload and improving diagnostic efficiency.</p></abstract>
<kwd-group>
<kwd>attention mechanism</kwd>
<kwd>feature pyramid network</kwd>
<kwd>red blood cells</kwd>
<kwd>classification</kwd>
<kwd>microscopic image</kwd>
</kwd-group>
<counts>
<fig-count count="8"/>
<table-count count="8"/>
<equation-count count="4"/>
<ref-count count="42"/>
<page-count count="12"/>
<word-count count="7750"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>As a connective tissue, blood has four components, namely white blood cells (WBCs), red blood cells (RBCs), platelets, and plasma. Plasma can be regarded as an intercellular substance. The other three types of cells can be distinguished according to their shape, size, presence or absence of a nucleus, color, and texture (<xref ref-type="bibr" rid="B1">1</xref>). RBCs are the major component of blood cells; they transport oxygen to various parts of the human body and discharge the carbon dioxide produced by the human body (<xref ref-type="bibr" rid="B2">2</xref>, <xref ref-type="bibr" rid="B3">3</xref>). Morphologically, RBCs are non-nucleated, biconcave, disc-shaped cells, with an average diameter of about 7 &#x003BC;m and an average thickness of about 2.5 &#x003BC;m. RBCs are produced in the bone marrow, and the development of primitive RBCs into mature RBCs consists of four stages: basophilic normoblast, polychromatic normoblast, orthochromatic normoblast, and reticulocyte. After maturation, RBCs enter the peripheral blood, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. The average life span of RBCs is about 120 days, and abnormal RBCs may live longer or shorter. Common RBC abnormalities include polycythemia, erythropenia, a decrease or increase in size and hemoglobin, and changes in RBC morphology.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Red blood cell maturation process.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0001.tif"/>
</fig>
<p>Diseases associated with RBCs include anemia, malaria, kidney tumors, malnutrition, and hemolytic disease, among which anemia is the most common (<xref ref-type="bibr" rid="B4">4</xref>). These diseases cause many abnormal RBCs to appear in the peripheral blood, mainly manifested as changes in the shape, size, and hemoglobin content of RBCs (<xref ref-type="bibr" rid="B5">5</xref>). Since abnormal RBCs may be a signal of certain diseases (<xref ref-type="bibr" rid="B6">6</xref>, <xref ref-type="bibr" rid="B7">7</xref>), the detection and classification of RBCs are of great significance for the timely detection of diseases.</p>
<p>Clinically, doctors need to use a microscope to check whether there are abnormal RBCs or immature cells in the peripheral blood (<xref ref-type="bibr" rid="B8">8</xref>). In this case, there are usually hundreds of RBCs in the field of view, and a large number of images are obtained by microscopic image capturing equipment. This requires a lot of manpower. Meanwhile, the operation relies on the subjective judgment of the doctor, and different operators may produce different results (<xref ref-type="bibr" rid="B9">9</xref>), which will affect the accuracy of the test results.</p>
<p>In recent years, with the development of image processing technology, medical image analysis has become an indispensable tool in medical research, clinical disease diagnosis, and treatment (<xref ref-type="bibr" rid="B8">8</xref>). This technique has been used to analyze various types of medical images and extract more useful medical information from them to help clinical diagnosis. An automatic and effective cell classification method can be used to assist doctors in improving treatment plans and predicting treatment results. At present, microscopic images generally have the following shortcomings: (1) the image capture process is affected by many factors such as light, color changes, and blurring; (2) there may be interferences such as noise. In recent years, deep learning has developed into a research hotspot in medical image analysis. It can extract hidden diagnostic features from medical images and solve problems in medical image processing, such as object tracking (<xref ref-type="bibr" rid="B10">10</xref>), multi-label classification (<xref ref-type="bibr" rid="B11">11</xref>), pedestrian detection (<xref ref-type="bibr" rid="B12">12</xref>), and multi-class classification (<xref ref-type="bibr" rid="B13">13</xref>). Aiming at the challenges in RBC images, our study attempts to use deep learning to greatly improve the efficiency of doctors and ensure the accuracy and objectivity of the detection results.</p>
<p>A lot of research work has been done on the detection and classification of RBCs. Yi et al. (<xref ref-type="bibr" rid="B14">14</xref>) proposed a method that analyzes the equality of the covariance matrices in Gabor-filtered holographic images to automatically select a linear or non-linear classifier for RBC classification; this method used single-RBC images to classify three types of RBCs. Maji et al. (<xref ref-type="bibr" rid="B15">15</xref>) proposed to use mathematical morphology to automatically characterize RBCs. Mahmood et al. (<xref ref-type="bibr" rid="B16">16</xref>) used geometric features and the Hough transform to detect the centers of RBCs: morphology was used to identify and extract RBCs from the background or other cells, and the Hough transform was used to identify the shape of RBCs. Besides, K-means clustering (<xref ref-type="bibr" rid="B17">17</xref>), boundary descriptors (<xref ref-type="bibr" rid="B18">18</xref>), and geometric features (<xref ref-type="bibr" rid="B19">19</xref>) were used to extract features. Sen et al. (<xref ref-type="bibr" rid="B20">20</xref>) used machine learning to divide RBCs into three categories; the method first segments RBCs into individual cells and then extracts features and performs classification, achieving an accuracy of 92%.</p>
<p>Lee et al. (<xref ref-type="bibr" rid="B21">21</xref>) proposed a hybrid neural network structure that combines parallel and cascading topologies for RBC classification. The authors used single-RBC images to extract shape features and clustering features. Then, the extracted features were input into a feedforward neural network with a three-layer structure for classification. Jambhekar et al. (<xref ref-type="bibr" rid="B22">22</xref>) studied the use of artificial neural networks to classify blood cells; their three-layer network achieved an accuracy of 81% for classifying sickle RBCs, WBCs, and overlapping cells. Elsalamony et al. (<xref ref-type="bibr" rid="B23">23</xref>) proposed to use a three-layer neural network to classify sickle cells and elliptical cells using the shape features of RBCs. Xu et al. (<xref ref-type="bibr" rid="B24">24</xref>) used deep convolutional neural networks to classify eight types of RBCs, and the proposed method achieved an accuracy of 87.5%. Alzubaidi et al. (<xref ref-type="bibr" rid="B25">25</xref>) proposed a convolutional neural network using the ECOC model as a classifier; the method divides RBCs into three categories (normal cells, sickle cells, and others) and achieved an accuracy of 88.11%. Kihm et al. (<xref ref-type="bibr" rid="B26">26</xref>) used a regression-based convolutional neural network to classify two types of RBCs (&#x0201C;slipper&#x0201D; and &#x0201C;croissant&#x0201D;) in a flowing state. Parab et al. (<xref ref-type="bibr" rid="B27">27</xref>) used a convolutional neural network to extract and classify individual RBCs after segmentation; they divided RBCs into nine categories and achieved an accuracy of 98.5%. Lin et al. (<xref ref-type="bibr" rid="B28">28</xref>) used FPN-ResNet-101 and Mask RCNN to classify two types of RBCs (hRBCs and tRBCs) in quantitative phase images, with an accuracy of 97%.</p>
<p>Most current works segment red blood cell images into individual RBCs and then perform feature extraction and classification; very few works classify the entire red blood cell image directly, and the number of RBCs in each image is small (about dozens). After fully understanding the needs of doctors and summarizing the methods in the research field, this paper proposes an Attention Residual Feature Pyramid Network (ARFPN). In this method, dense red blood cell images (each image contains about 230 red blood cells) are classified directly. Meanwhile, the feature pyramid network (<xref ref-type="bibr" rid="B29">29</xref>) is combined with spatial and channel attention mechanisms to focus on the multi-scale features related to categories, thus improving the expression of related features and suppressing background features. Besides, a dense anchor strategy is adopted to better cover RBCs in the proposal stage. Moreover, the RoI align method is used to improve the extraction accuracy of RoIs and locate objects more accurately. The contributions of this paper are summarized as follows: (i) the method can detect and classify 14 types of red blood cells; (ii) no single-cell segmentation is required, which simplifies the implementation and improves efficiency; (iii) the method provides convenience for doctors, better adapts to their needs, and has better clinical applicability.</p>
<p>The rest of this paper is organized as follows. Section Materials and Methods introduces the dataset, the data preprocessing methods, and the feature extraction and classification methods based on the channel and spatial attention feature pyramid network; section Results analyzes and presents the experimental results; the results are discussed in section Discussion. Finally, conclusions are drawn in section Conclusions.</p></sec>
<sec sec-type="materials and methods" id="s2">
<title>Materials and Methods</title>
<p>As shown in <xref ref-type="fig" rid="F2">Figure 2</xref>, the workflow of our proposed method for RBC classification includes the image processing stage, feature extraction stage, post-processing stage, and cell classification stage. Each stage is described in detail in the following.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>RBC classification based on the ARFPN classification network.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0002.tif"/>
</fig>
<sec>
<title>Data Acquisition</title>
<p>The dataset was collected from the Department of Clinical Laboratory of Shandong Provincial Hospital, affiliated with Shandong First Medical University. The RBC images in the dataset were collected by CellaVision DM96 (CellaVision AB, Lund, Sweden). The blood sample was put into a blood smear and then detected by the device to capture the image. The finished blood smear is shown in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Schematic diagram of blood smear. A large number of red blood cells overlap in the green box area. The number of red blood cells in the yellow frame area is small. The number of cells in the red frame area is appropriate and evenly distributed, which is suitable for observation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0003.tif"/>
</fig>
<p>The resolution of each collected image is about 1,418 &#x000D7; 1,420 pixels. There are approximately 1,300 RBCs in each image (not including edge cells). All collected microscope images are in BMP format and contain RGB channels. Three types of cells are included in the images, i.e., RBCs, white blood cells (WBCs), and platelets. The obtained dataset was verified by experienced doctors to avoid the interference of external factors, such as light.</p></sec>
<sec>
<title>Pre-processing</title>
<p>The obtained images were preprocessed to make them more suitable for our study. First, the images in BMP format were converted to JPG format, and the noise was eliminated by a Gaussian filter. Then, the images were cropped according to the Pascal VOC dataset format. The size of each cropped image is 375 &#x000D7; 500, and the number of RBCs per image is usually more than 200. The repeat (overlap) parameter was set to 30% during cropping to expand the data. In the cropping process, images containing many rare RBCs were horizontally flipped to further expand the data and increase the sample size. Finally, LabelImg was used to label the RBCs in the images. Labeling and inspection were conducted by two experienced doctors. All the RBCs were divided into 14 categories (schistocyte, spherocyte, stomatocytes, target cells, hypochromic, elliptocytes, normal RBCs, overlapping RBCs, hyperchromic, microcyte, macrocyte, teardrop cells, basophilic stippling, and the cells at the edge of the image). In the RBC images, normal RBCs are about 21 &#x000D7; 21 pixels in size; cells larger than 24 &#x000D7; 24 are macrocytes, and cells smaller than 18 &#x000D7; 18 are microcytes. Schistocytes are broken red blood cells that resemble &#x0201C;fragments&#x0201D; in shape. Hyperchromic and hypochromic cells differ in hemoglobin content, and elliptocytes are shaped like ellipses. Target cells are shaped like a &#x0201C;shooting target,&#x0201D; and stomatocytes are shaped like a &#x0201C;mouth.&#x0201D; The corresponding quantity of each RBC category is listed in <xref ref-type="table" rid="T1">Table 1</xref>. The obtained dataset was used to evaluate our method and compare the results. After preprocessing, there are 588 images in total, each of which is a 350 &#x000D7; 500 &#x000D7; 3 RGB image. Four hundred and seventy images were used as the training set, and the remaining 118 images were used as the test set, as shown in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
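The overlapping-crop step described above can be sketched as follows; the tile size and the 30% repeat come from the text, while the function itself and the use of NumPy are illustrative assumptions:

```python
import numpy as np

def crop_with_overlap(image, crop_h=375, crop_w=500, overlap_pct=30):
    """Split an image into crop_h x crop_w tiles; neighbouring tiles
    repeat overlap_pct percent of their extent, as in pre-processing."""
    step_h = crop_h - crop_h * overlap_pct // 100
    step_w = crop_w - crop_w * overlap_pct // 100
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - crop_h + 1, step_h):
        for x in range(0, w - crop_w + 1, step_w):
            tiles.append(image[y:y + crop_h, x:x + crop_w])
    return tiles

# a synthetic 1,418 x 1,420 three-channel "microscope image"
img = np.zeros((1418, 1420, 3), dtype=np.uint8)
tiles = crop_with_overlap(img)
```

With these settings, one full-resolution image yields a grid of overlapping 375 &#x000D7; 500 tiles, which is how the 588-image dataset can be expanded from far fewer raw captures.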
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Various types of RBCs.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Name</bold></th>
<th valign="top" align="center"><bold>Number</bold></th>
<th valign="top" align="center"><bold>Name</bold></th>
<th valign="top" align="center"><bold>Number</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Schistocyte</td>
<td valign="top" align="center">772</td>
<td valign="top" align="center">Overlap</td>
<td valign="top" align="center">6,064</td>
</tr>
<tr>
<td valign="top" align="left">Spherocyte</td>
<td valign="top" align="center">4,021</td>
<td valign="top" align="center">Hyperchromic</td>
<td valign="top" align="center">17,693</td>
</tr>
<tr>
<td valign="top" align="left">Stomatocytes</td>
<td valign="top" align="center">615</td>
<td valign="top" align="center">Microcyte</td>
<td valign="top" align="center">17,060</td>
</tr>
<tr>
<td valign="top" align="left">Target cells</td>
<td valign="top" align="center">537</td>
<td valign="top" align="center">Macrocyte</td>
<td valign="top" align="center">9,084</td>
</tr>
<tr>
<td valign="top" align="left">Hypochromic</td>
<td valign="top" align="center">5,414</td>
<td valign="top" align="center">Teardrop cells</td>
<td valign="top" align="center">1,287</td>
</tr>
<tr>
<td valign="top" align="left">Elliptocytes</td>
<td valign="top" align="center">15,439</td>
<td valign="top" align="center">Basophilic stippling</td>
<td valign="top" align="center">451</td>
</tr>
<tr>
<td valign="top" align="left">Normal RBC</td>
<td valign="top" align="center">18,126</td>
<td valign="top" align="center">Edge cells</td>
<td valign="top" align="center">20,661</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Allocation of training set and test set (number of cells and number of images).</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold>Cell number</bold></th>
<th valign="top" align="center"><bold>Image number</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Train</td>
<td valign="top" align="center">93,885</td>
<td valign="top" align="center">470</td>
</tr>
<tr>
<td valign="top" align="left">Test</td>
<td valign="top" align="center">23,366</td>
<td valign="top" align="center">118</td>
</tr>
</tbody>
</table>
</table-wrap></sec>
<sec>
<title>Feature Extraction of Shape, Size, and Hemoglobin Content</title>
<p>The size of normal RBCs is about 7&#x0007E;8 &#x003BC;m, which corresponds to about 21 &#x000D7; 21 pixels in the image. The size of abnormal RBCs in the image varies widely, and each type has its specific shape. The characteristics of the RBC images can be summarized as follows: (a) large changes in cell size; (b) RBCs are small objects; (c) cells are densely distributed; (d) the contrast between the RBCs and the background is low. In deep learning object detection, the detection of small objects has always been a difficult problem due to low resolution, blurry images, little information, and weak feature expression. This study used the feature pyramid network (FPN) to overcome the above problems because it can better deal with the multi-scale changes in object detection. The FPN makes reasonable use of the features of each layer in the convolutional network and merges the features of different layers. Specifically, it constructs a top-down, laterally connected hierarchical structure that combines low-resolution, semantically strong features with high-resolution, semantically weak features.</p>
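The top-down, laterally connected merge can be sketched with plain NumPy; the nearest-neighbour upsampling, the identity lateral connections, and the level names c3&#x02013;c5 are illustrative simplifications (a real FPN uses 1 &#x000D7; 1 lateral and 3 &#x000D7; 3 smoothing convolutions):

```python
import numpy as np

def upsample2x(x):
    # nearest-neighbour 2x upsampling of an (H, W, C) feature map
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fpn_merge(c_levels):
    """Top-down pathway: start from the deepest, semantically strongest
    map and add each upsampled result to the next finer level."""
    p = [c_levels[-1]]
    for c in reversed(c_levels[:-1]):
        p.append(upsample2x(p[-1]) + c)  # merge with the lateral feature
    return list(reversed(p))             # finest level first

# three pyramid levels with equal channel counts (1x1 lateral convs omitted)
c3, c4, c5 = np.ones((32, 32, 8)), np.ones((16, 16, 8)), np.ones((8, 8, 8))
p3, p4, p5 = fpn_merge([c3, c4, c5])
```

Each output level thus carries both the resolution of its own stage and the semantics accumulated from all deeper stages, which is what helps with small objects such as RBCs.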
<p>In recent years, attention network models have achieved good performance in classification tasks. In this research, a channel attention mechanism (<xref ref-type="bibr" rid="B30">30</xref>) and a spatial attention mechanism (<xref ref-type="bibr" rid="B31">31</xref>) were integrated into the feature extraction network to achieve accurate classification of RBCs. In the feature extraction stage, the attention mechanism (<xref ref-type="bibr" rid="B32">32</xref>) can highlight the features related to categories while focusing on the key features of red blood cells and generating more discriminative feature representations. The integration of these two mechanisms contributes to a large performance improvement with only a small increase in the number of parameters.</p>
<sec>
<title>Channel Attention Residual Feature Pyramid Network</title>
<p>The structure of the Channel Attention Residual Feature Pyramid Network (C-ARFPN) is shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. ResNet-101 and ResNet-50 are used as the backbone of our network. Each bottleneck of the residual network is replaced with a channel attention residual unit (CARU), in which the attention module is located behind the residual unit. The CARU first applies global average pooling to the input features so that the features respond to the global distribution, expanding the global receptive field and reducing the amount of computation. The following two fully connected layers map the channel feature representation to the sample label space, and the output represents the weight of each feature channel. Each channel weight is then multiplied with the input feature to recalibrate the input feature in the channel dimension.</p>
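A dependency-free sketch of the channel recalibration performed by a CARU; the reduction ratio, the random fully connected weights, and their 0.1 scaling are illustrative assumptions, not the trained parameters:

```python
import numpy as np

def channel_attention(x, reduction=4, seed=0):
    """SE-style channel recalibration as in the CARU: global average
    pooling, two fully connected layers, sigmoid channel weights."""
    rng = np.random.default_rng(seed)
    c = x.shape[2]
    z = x.mean(axis=(0, 1))                              # squeeze: (C,)
    w1 = 0.1 * rng.standard_normal((c, c // reduction))  # hypothetical FC weights
    w2 = 0.1 * rng.standard_normal((c // reduction, c))
    s = np.maximum(z @ w1, 0.0) @ w2                     # excitation (ReLU between FCs)
    s = 1.0 / (1.0 + np.exp(-s))                         # sigmoid -> per-channel weights
    return x * s                                         # recalibrate each channel

feat = np.ones((14, 14, 16))
out = channel_attention(feat)
```

The output keeps the spatial layout of the input; only the relative importance of the channels changes, which is why the unit adds few parameters.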
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Schematic diagram of the Channel Attention Residual Feature Pyramid Network (C-ARFPN) structure. ResNet-101 is used as the backbone. Channel Attention Residual Units (CARUs) are located at the front of each residual unit and stacked in varying numbers to form residual blocks.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0004.tif"/>
</fig></sec>
<sec>
<title>Channel and Spatial Attention Residual Feature Pyramid Network</title>
<p>The structure of the Channel and Spatial Attention Residual Feature Pyramid Network (CS-ARFPN) is shown in <xref ref-type="fig" rid="F5">Figure 5</xref>. The upper part of the figure illustrates the overall flow of feature extraction. Similarly, ResNet-50 and ResNet-101 are used as the backbone. The lower part of the figure shows the structure of the Channel and Spatial Attention Residual Unit (CSARU), which is nested in each residual unit of the residual network and located behind the three convolution kernels. The input feature is first compressed in the spatial dimension, where average pooling and maximum pooling are used to aggregate the spatial information of the feature map. The aggregated information is sent to a multi-layer perceptron and summed element by element, and the compressed information is multiplied element-wise with the original feature to obtain the channel attention feature. Then, the channel attention feature is input to the spatial attention unit, where average pooling and max pooling are used to compress the channels and extract the maximum value. After dimensionality reduction through a convolution operation, the final attention feature is obtained by an element-wise product with the channel attention feature.</p>
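The two-step recalibration can be sketched as follows; the shared multi-layer perceptron and the convolution of the full CSARU are replaced by simple pooling arithmetic so the sketch stays self-contained, which is an assumption of this example rather than the paper's exact unit:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cs_attention(x):
    """Channel-then-spatial attention in the CSARU spirit: pool, weight,
    rescale. The learned MLP and convolution layers are omitted here."""
    # channel attention: average- and max-pool over space, then sigmoid
    mc = sigmoid(x.mean(axis=(0, 1)) + x.max(axis=(0, 1)))   # (C,) weights
    x = x * mc
    # spatial attention: average- and max-pool over channels, then sigmoid
    ms = sigmoid((x.mean(axis=2, keepdims=True)
                  + x.max(axis=2, keepdims=True)) / 2)       # (H, W, 1) weights
    return x * ms

out = cs_attention(np.ones((8, 8, 4)))
```

Channel attention answers "which feature maps matter" and spatial attention answers "where in the image they matter"; applying them in sequence gives each location of each channel its own weight.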
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Schematic diagram of the Channel and Spatial Attention Residual Feature Pyramid Network (CS-ARFPN) structure. This structure also uses ResNet-101 as the backbone. Channel and Spatial Attention Residual Units (CSARUs) are located behind each residual unit. They focus on the key and detailed features of the red blood cell image.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0005.tif"/>
</fig>
<p>Both C-ARFPN and CS-ARFPN use ResNet-101 and ResNet-50 as the backbone. Each attention module is placed at its position within a residual unit. Since RBCs are small objects, the feature pyramid network combined with the attention mechanism can merge deep and shallow features and focus on category-related features. This makes the extracted RBC features more accurate and richer, thus improving the model&#x00027;s ability to detect and classify small objects and improving its overall performance.</p></sec></sec>
<sec>
<title>Post-processing and Classification</title>
<p>After feature extraction, the features are input to the subsequent network for post-processing and classification. First, the feature map is input into the RPN to filter out the anchors containing the foreground. Then, high-quality object candidate boxes are selected and input into the RoI pooling layer, where RoI align (<xref ref-type="bibr" rid="B33">33</xref>) is used instead of the RoI pooling operation. Compared with RoI pooling, RoI align removes the quantization rounding operation, so it can overcome the bounding-box offset problem (<xref ref-type="bibr" rid="B34">34</xref>) and extract more accurate RoIs. After the RoI align operation is performed on the feature map, candidate recognition regions of different sizes are normalized into fixed-size object recognition regions.</p>
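The difference between RoI pooling's quantization and RoI align's bilinear sampling can be illustrated on a single sampling point; the feature map values here are synthetic:

```python
import numpy as np

def bilinear(fmap, y, x):
    """Sample a 2-D feature map at a real-valued point (as RoI align does)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, fmap.shape[0] - 1)
    x1 = min(x0 + 1, fmap.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (fmap[y0, x0] * (1 - dy) * (1 - dx)
            + fmap[y0, x1] * (1 - dy) * dx
            + fmap[y1, x0] * dy * (1 - dx)
            + fmap[y1, x1] * dy * dx)

fmap = np.arange(16.0).reshape(4, 4)
# RoI pooling quantizes the sampling point to a whole cell ...
pooled = fmap[int(1.6), int(2.7)]
# ... while RoI align keeps the sub-pixel offset through interpolation
aligned = bilinear(fmap, 1.6, 2.7)
```

The rounded lookup discards the 0.6- and 0.7-cell offsets entirely, which for objects as small as RBCs (about 21 &#x000D7; 21 pixels) amounts to a noticeable misalignment; the interpolated value does not.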
<p>The features after RPN and RoI pooling are sent to the subsequent network for classification and regression. In this process, 14 types of RBCs are classified including schistocyte, spherocyte, stomatocytes, target cells, hypochromic, elliptocytes, normal RBCs, overlapping RBCs, hyperchromic, microcyte, macrocyte, teardrop cells, basophilic stippling, and the cells at the edge of the image. The schematic diagram is shown in <xref ref-type="fig" rid="F6">Figure 6</xref>. Normal RBCs and macrocytes are displayed in one image to make their difference obvious. It can be seen from the figure that each cell has its characteristics. During the training process, the weight of the network is adjusted according to the input data to minimize the error between the input and the target. Then Fast RCNN (<xref ref-type="bibr" rid="B35">35</xref>) is used to perform cell classification, and the output of the classification prediction is converted into a probability distribution through softmax.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Schematic diagram of normal and abnormal red blood cells. This picture illustrates the most obvious characteristics of each red blood cell, such as the shape, size, and hemoglobin content.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0006.tif"/>
</fig></sec>
<sec>
<title>Ablation Study</title>
<p>Our proposed method uses the attention module to learn category-related features. First, the effectiveness of the attention module was verified, and the performance of the two different attention modules was compared. Meanwhile, the performance of the RoI align and RoI pooling methods was compared. Besides, the impact of the Adam optimizer, the momentum optimizer, and various training parameters on model performance was investigated. All comparisons and analyses were performed under the same parameter settings. Moreover, the effectiveness of the proposed model was verified on different public datasets.</p></sec>
<sec>
<title>Training Implementation</title>
<p>Our proposed method was implemented on a computer equipped with an Intel<sup>&#x000AE;</sup> Core&#x02122; i7-8700k CPU&#x00040;3.70GHz with 32GB memory, and the computationally intensive calculations were offloaded to an Nvidia Tesla P100 GPU with 16 GB HBM2 memory and 3,584 compute unified device architecture (CUDA) cores. All experiments were conducted in the Python programming language under the TensorFlow framework (<xref ref-type="bibr" rid="B36">36</xref>). In the training process, the momentum and Adam optimizers were used to minimize the loss. The batch size was set to 1, and the number of iterations was set to 110,000. It took 35 h to complete the optimization. In the early stage of training, a large learning rate was used to help the model approach the optimal solution quickly; in the later stage of training, a small learning rate was used to ensure that the model would not fluctuate too much. The learning rate was divided by 10 after 60,000 and 80,000 iterations, and the minimum learning rate was set to 10<sup>&#x02212;6</sup>. In addition, the momentum of the model was set to 0.9, and the weight decay was set to 10<sup>&#x02212;4</sup>. During training, the weights were randomly initialized, and the cross-entropy loss function was adopted to evaluate the error between the predicted value and the true value of our model. The calculation formula of the cross-entropy loss function is shown in Equation (1).</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mtext>&#x000A0;</mml:mtext><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>k</italic> represents the number of classes; <italic>y</italic><sub><italic>i</italic></sub> represents the label of category <italic>i</italic>; <italic>p</italic><sub><italic>i</italic></sub> represents the output probability of class <italic>i</italic>, and this value was calculated by Softmax.</p></sec>
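<p>Equation (1) can be illustrated numerically as follows (a minimal sketch, assuming a hypothetical one-hot label and softmax output rather than values from the actual model):</p>

```python
import numpy as np

def cross_entropy(y, p, eps=1e-12):
    """C = -sum_i y_i * log(p_i); p is clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(y * np.log(p))

# Hypothetical one-hot label (true class is index 2) and predicted probabilities.
y = np.array([0.0, 0.0, 1.0, 0.0])
p = np.array([0.1, 0.2, 0.6, 0.1])

loss = cross_entropy(y, p)  # only the true-class term contributes: -log(0.6)
```

<p>Because the label is one-hot, the loss reduces to the negative log-probability assigned to the true class, so confident correct predictions give a loss near zero.</p>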
<sec>
<title>Evaluation Metrics</title>
<p>In our experiments, the metrics of precision, recall, and F1-score were taken to evaluate the performance of our proposed method. The calculation formulas of the evaluation metrics are expressed in Equations (2&#x02013;4).</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi></mml:mtd><mml:mtd><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E3"><label>(3)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mtd><mml:mtd><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E4"><label>(4)</label><mml:math id="M4"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mtext>&#x000A0;</mml:mtext><mml:mi>s</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mtd><mml:mtd><mml:mo>=</mml:mo></mml:mtd><mml:mtd><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x000D7;</mml:mo><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x000D7;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x0002B;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where TP (True Positive) indicates the number of positive samples that are correctly judged by the model as positive; TN (True Negative) indicates the number of negative samples that are correctly judged by the model as negative; FN (False Negative) indicates the number of positive samples that are wrongly judged by the model as negative; and FP (False Positive) indicates the number of negative samples that are wrongly judged by the model as positive. Based on this, precision is the ratio of the number of correctly predicted positive examples to the number of samples predicted as positive, and recall is the ratio of correctly predicted positive examples to the number of real positive samples. The F1 score is the harmonic mean of precision and recall, so it comprehensively reflects the performance. In general, the higher the F1 score, the better the performance of the model.</p></sec></sec>
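<p>The three metrics in Equations (2&#x02013;4) can be computed from the confusion-matrix counts as follows (a minimal sketch; the counts are hypothetical, not values from our experiments):</p>

```python
def classification_metrics(tp, fp, fn):
    """Precision, recall, and F1 score from confusion-matrix counts (Equations 2-4)."""
    precision = tp / (tp + fp)              # correct positives / predicted positives
    recall = tp / (tp + fn)                 # correct positives / real positives
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1

# Hypothetical counts for one RBC class.
precision, recall, f1 = classification_metrics(tp=80, fp=20, fn=10)
```

<p>Note that the F1 score always lies between the smaller of precision and recall and their arithmetic mean, which is why it is used as a single balanced summary of both.</p>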
<sec sec-type="results" id="s3">
<title>Results</title>
<sec>
<title>Ablation Study: Comparison of FPN With or Without Attention Module, ROI Pooling or ROI Align, and Others</title>
<p>In the ablation study, ResNet-101 and ResNet-50 were used as the backbone in the training, and different learning rates were set. Since RBC objects are small and densely distributed, a small anchor size was used.</p>
<p>In <xref ref-type="table" rid="T3">Table 3</xref>, FPN is the original model without the attention module; S-ARFPN is the feature pyramid network with the spatial attention residual unit; CS-ARFPN is the feature pyramid network with both channel and spatial attention residual units. It can be seen that the CS-ARFPN model achieves the best performance. Compared with FPN, the recall, precision, F1 score, and mAP of S-ARFPN and CS-ARFPN are improved by {4.4, 7.9, 5.7, 6.1%} and {5.5, 7.4, 6.0, 7.2%}, respectively. Compared with the S-ARFPN model, the recall, precision, F1 score, and mAP of the CS-ARFPN model change by 1.1, &#x02212;0.7, 0.3, and 0.9%, respectively.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>The evaluation metrics of the model with/without the attention module. The best results are shown for each model.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Models</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>mAP</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">FPN</td>
<td valign="top" align="center">0.756</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.798</td>
</tr>
<tr>
<td valign="top" align="left">Our proposed (S-ARFPN)</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">0.838</td>
<td valign="top" align="center">0.816</td>
<td valign="top" align="center">0.86</td>
</tr>
<tr>
<td valign="top" align="left">Our proposed (CS-ARFPN)</td>
<td valign="top" align="center">0.811</td>
<td valign="top" align="center">0.831</td>
<td valign="top" align="center">0.819</td>
<td valign="top" align="center">0.869</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="fig" rid="F7">Figure 7</xref> presents the feature maps of the two models with different attention residual units and of the original FPN model. The leftmost column shows the input image, and the next six columns show the feature maps of the three models, i.e., CS-ARFPN, S-ARFPN, and FPN (each model occupies two columns). They are the feature maps extracted by the convolutional layers {C2, C3, C4} and the pyramid layers {P2, P3, P4}. It can be seen that the feature maps extracted by the CS-ARFPN model pay more attention to the object to be recognized, so the model achieves better performance.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p><bold>(A)</bold> The feature map of the Channel Spatial Attention Residual Feature Pyramid Network (CS-ARFPN) model; <bold>(B)</bold> The feature map of the Spatial Attention Residual Feature Pyramid Network (S-ARFPN) model; <bold>(C)</bold> The feature map of the original FPN model. The feature maps extracted from each layer are presented, where warmer colors, such as red and yellow, indicate higher attention weights. As shown in the figure, the model with an attention module has a stronger expression of target characteristics and focuses more on the object.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0007.tif"/>
</fig>
<p><xref ref-type="fig" rid="F8">Figure 8</xref> shows the precision-recall (PR) curves of the FPN, S-ARFPN, and CS-ARFPN models. The closer a curve is to the upper right, the larger the area under it and the better the performance of the model. The PR area under the curve (PR-AUC) of the three models is 0.798, 0.86, and 0.869, respectively. Thus, the CS-ARFPN model achieves the best performance, followed by S-ARFPN.</p>
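<p>A PR-AUC value of this kind can be approximated from sampled (recall, precision) points by trapezoidal integration, sketched below on hypothetical operating points (not the reported curves):</p>

```python
def pr_auc(recalls, precisions):
    """Approximate the area under a precision-recall curve by the trapezoidal rule."""
    pts = sorted(zip(recalls, precisions))       # integrate over increasing recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (p0 + p1) / 2.0 * (r1 - r0)      # trapezoid between adjacent points
    return area

# Hypothetical sampled operating points of one model.
recalls = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
precisions = [1.0, 0.95, 0.9, 0.85, 0.7, 0.5]
auc = pr_auc(recalls, precisions)  # 0 <= auc <= 1; higher is better
```

<p>A model whose precision stays high as recall increases keeps its curve near the upper right, yielding a larger area.</p>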
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Precision-Recall (PR) curve of the cell classification results. Different colored PR curves represent different types of RBCs. The closer the curved surface is to the upper right, the better the classification effect of the red blood cell.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fmed-08-741407-g0008.tif"/>
</fig>
<p><xref ref-type="table" rid="T4">Table 4</xref> lists the recall, precision, F1-score, and AP of the CS-ARFPN model for classifying the 14 types of cells.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Results of different red blood cells by the CS-ARFPN model.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Class</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>AP</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Schistocytes</td>
<td valign="top" align="center">0.681</td>
<td valign="top" align="center">0.841</td>
<td valign="top" align="center">0.753</td>
<td valign="top" align="center">0.787</td>
</tr>
<tr>
<td valign="top" align="left">Spherocyte</td>
<td valign="top" align="center">0.822</td>
<td valign="top" align="center">0.782</td>
<td valign="top" align="center">0.801</td>
<td valign="top" align="center">0.879</td>
</tr>
<tr>
<td valign="top" align="left">Stomatocytes</td>
<td valign="top" align="center">0.697</td>
<td valign="top" align="center">0.853</td>
<td valign="top" align="center">0.768</td>
<td valign="top" align="center">0.840</td>
</tr>
<tr>
<td valign="top" align="left">Target cells</td>
<td valign="top" align="center">0.744</td>
<td valign="top" align="center">0.821</td>
<td valign="top" align="center">0.786</td>
<td valign="top" align="center">0.815</td>
</tr>
<tr>
<td valign="top" align="left">Hypochromic</td>
<td valign="top" align="center">0.772</td>
<td valign="top" align="center">0.855</td>
<td valign="top" align="center">0.811</td>
<td valign="top" align="center">0.890</td>
</tr>
<tr>
<td valign="top" align="left">Elliptocytes</td>
<td valign="top" align="center">0.890</td>
<td valign="top" align="center">0.859</td>
<td valign="top" align="center">0.874</td>
<td valign="top" align="center">0.946</td>
</tr>
<tr>
<td valign="top" align="left">Edge</td>
<td valign="top" align="center">0.987</td>
<td valign="top" align="center">0.987</td>
<td valign="top" align="center">0.987</td>
<td valign="top" align="center">0.997</td>
</tr>
<tr>
<td valign="top" align="left">Normal RBC</td>
<td valign="top" align="center">0.804</td>
<td valign="top" align="center">0.810</td>
<td valign="top" align="center">0.807</td>
<td valign="top" align="center">0.880</td>
</tr>
<tr>
<td valign="top" align="left">Overlap</td>
<td valign="top" align="center">0.966</td>
<td valign="top" align="center">0.964</td>
<td valign="top" align="center">0.965</td>
<td valign="top" align="center">0.973</td>
</tr>
<tr>
<td valign="top" align="left">Hyperchromic</td>
<td valign="top" align="center">0.838</td>
<td valign="top" align="center">0.791</td>
<td valign="top" align="center">0.814</td>
<td valign="top" align="center">0.883</td>
</tr>
<tr>
<td valign="top" align="left">Microcyte</td>
<td valign="top" align="center">0.876</td>
<td valign="top" align="center">0.842</td>
<td valign="top" align="center">0.859</td>
<td valign="top" align="center">0.913</td>
</tr>
<tr>
<td valign="top" align="left">Macrocyte</td>
<td valign="top" align="center">0.857</td>
<td valign="top" align="center">0.784</td>
<td valign="top" align="center">0.819</td>
<td valign="top" align="center">0.893</td>
</tr>
<tr>
<td valign="top" align="left">Teardrop cells</td>
<td valign="top" align="center">0.570</td>
<td valign="top" align="center">0.686</td>
<td valign="top" align="center">0.622</td>
<td valign="top" align="center">0.631</td>
</tr>
<tr>
<td valign="top" align="left">Basophilic</td>
<td valign="top" align="center">0.837</td>
<td valign="top" align="center">0.761</td>
<td valign="top" align="center">0.797</td>
<td valign="top" align="center">0.843</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="T5">Table 5</xref> shows the mAP of the three models with two different optimization strategies. For the FPN, S-ARFPN, and CS-ARFPN models, the Momentum optimizer leads to 1.9, 18.7, and 9.5% higher mAP than the Adam optimizer, respectively.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Performance comparison of models using Adam and Momentum.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold>FPN</bold></th>
<th valign="top" align="center"><bold>S-ARFPN</bold></th>
<th valign="top" align="center"><bold>CS-ARFPN</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Adam</td>
<td valign="top" align="center">0.779</td>
<td valign="top" align="center">0.673</td>
<td valign="top" align="center">0.774</td>
</tr>
<tr>
<td valign="top" align="left">Momentum</td>
<td valign="top" align="center">0.798</td>
<td valign="top" align="center">0.860</td>
<td valign="top" align="center">0.869</td>
</tr>
</tbody>
</table>
</table-wrap>
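<p>The staircase learning-rate schedule described in the Training Implementation section (divide by 10 after 60,000 and 80,000 iterations, floored at 10<sup>&#x02212;6</sup>) can be sketched as follows; the base learning rate here is an assumed value for illustration, and the resulting rate would be fed to a momentum optimizer (momentum 0.9), e.g., SGD with momentum in TensorFlow:</p>

```python
def learning_rate(step, base_lr=1e-3, min_lr=1e-6):
    """Staircase schedule: divide the base rate by 10 after 60k and 80k iterations."""
    if step < 60_000:
        lr = base_lr
    elif step < 80_000:
        lr = base_lr / 10
    else:
        lr = base_lr / 100
    return max(lr, min_lr)   # never drop below the minimum learning rate

# The rate starts large so the model approaches the optimum quickly, then shrinks
# in two steps to stabilize training near convergence.
```

<p>Momentum-based SGD with such a hand-tuned staircase schedule converges more slowly than Adam but, as Table 5 shows, generalized better in this task.</p>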
<p>The performance of two different RoI processing methods, i.e., RoI pooling and RoI align, is shown in <xref ref-type="table" rid="T6">Table 6</xref>. The model using the RoI align method achieves better performance than that using the RoI pooling method.</p>
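<p>The key difference between the two methods can be illustrated by projecting an image-space box onto the feature map (a generic sketch; the box and the stride of 16, typical for ResNet backbones, are assumptions for illustration). RoI pooling rounds the scaled coordinates to whole feature-map cells, while RoI align keeps them fractional and samples values by bilinear interpolation:</p>

```python
def project_roi(box, stride=16, quantize=True):
    """Project an image-space box (x1, y1, x2, y2) onto the feature map."""
    scaled = [c / stride for c in box]
    if quantize:
        # RoI pooling: rounding can shift a boundary by up to stride/2 pixels,
        # a large error relative to small objects such as RBCs.
        return [float(round(c)) for c in scaled]
    # RoI align: keep fractional coordinates (values read via bilinear interpolation).
    return scaled

box = (23, 41, 68, 97)                       # a hypothetical small RBC box in pixels
pooled = project_roi(box, quantize=True)     # coordinates snapped to the grid
aligned = project_roi(box, quantize=False)   # exact fractional coordinates
```

<p>At a stride of 16, rounding can move a box edge by up to 8 pixels, which is why removing the quantization matters most for small, densely packed cells.</p>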
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Comparison of the precision of ROI pooling and ROI align methods.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold>FPN</bold></th>
<th valign="top" align="center"><bold>S-ARFPN</bold></th>
<th valign="top" align="center"><bold>CS-ARFPN</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ROI Pooling</td>
<td valign="top" align="center">0.786</td>
<td valign="top" align="center">0.823</td>
<td valign="top" align="center">0.858</td>
</tr>
<tr>
<td valign="top" align="left">ROI Align</td>
<td valign="top" align="center">0.798</td>
<td valign="top" align="center">0.860</td>
<td valign="top" align="center">0.869</td>
</tr>
</tbody>
</table>
</table-wrap></sec>
<sec>
<title>Comparison With the State-of-the-Art Models and Comparison of Results Obtained on Other Data Sets</title>
<p>Our proposed method was compared with five classification methods based on deep learning, including Faster RCNN (<xref ref-type="bibr" rid="B32">32</xref>), RetinaNet (<xref ref-type="bibr" rid="B37">37</xref>), Cascade RCNN (<xref ref-type="bibr" rid="B38">38</xref>), R-FCN (<xref ref-type="bibr" rid="B39">39</xref>), and Cascade-FPN (<xref ref-type="bibr" rid="B40">40</xref>). All the models used were trained from scratch on the RBC dataset.</p>
<p>ResNet-50 and ResNet-101 were used as the backbone for model training; then, different parameters were set to fine-tune the models; finally, the best results of each model were recorded. <xref ref-type="table" rid="T7">Table 7</xref> lists the classification performance of the different models, with the highest values shown in bold. The mAP of the two proposed models is 0.86 and 0.869, respectively, which is the best performance among all the models.</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Comparison of our proposed method with other advanced methods.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Backbone</bold></th>
<th valign="top" align="center"><bold>Models</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>F1</bold></th>
<th valign="top" align="center"><bold>mAP</bold></th>
<th valign="top" align="center"><bold>Param (MB)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ResNet-50</td>
<td valign="top" align="center">Cascade RCNN (<xref ref-type="bibr" rid="B33">33</xref>)</td>
<td valign="top" align="center">0.361</td>
<td valign="top" align="center">0.687</td>
<td valign="top" align="center">0.463</td>
<td valign="top" align="center">0.385</td>
<td valign="top" align="center">254.47</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Faster RCNN (<xref ref-type="bibr" rid="B32">32</xref>)</td>
<td valign="top" align="center">0.398</td>
<td valign="top" align="center">0.652</td>
<td valign="top" align="center">0.481</td>
<td valign="top" align="center">0.394</td>
<td valign="top" align="center">54.07</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">R-FCN (<xref ref-type="bibr" rid="B34">34</xref>)</td>
<td valign="top" align="center">0.530</td>
<td valign="top" align="center">0.757</td>
<td valign="top" align="center">0.638</td>
<td valign="top" align="center">0.551</td>
<td valign="top" align="center">95</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">RetinaNet (<xref ref-type="bibr" rid="B31">31</xref>)</td>
<td valign="top" align="center">0.695</td>
<td valign="top" align="center">0.751</td>
<td valign="top" align="center">0.736</td>
<td valign="top" align="center">0.684</td>
<td valign="top" align="center">61.92</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Cascade-FPN (<xref ref-type="bibr" rid="B35">35</xref>)</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.757</td>
<td valign="top" align="center">0.736</td>
<td valign="top" align="center">0.736</td>
<td valign="top" align="center">98.14</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">FPN</td>
<td valign="top" align="center">0.753</td>
<td valign="top" align="center">0.758</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.796</td>
<td valign="top" align="center">79.05</td>
</tr>
<tr>
<td valign="top" align="left">ResNet-101</td>
<td valign="top" align="center">Cascade RCNN (<xref ref-type="bibr" rid="B33">33</xref>)</td>
<td valign="top" align="center">0.429</td>
<td valign="top" align="center">0.699</td>
<td valign="top" align="center">0.525</td>
<td valign="top" align="center">0.416</td>
<td valign="top" align="center">290.69</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Faster RCNN (<xref ref-type="bibr" rid="B28">28</xref>)</td>
<td valign="top" align="center">0.447</td>
<td valign="top" align="center">0.687</td>
<td valign="top" align="center">0.526</td>
<td valign="top" align="center">0.434</td>
<td valign="top" align="center">63.70</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">R-FCN (<xref ref-type="bibr" rid="B34">34</xref>)</td>
<td valign="top" align="center">0.528</td>
<td valign="top" align="center">0.880</td>
<td valign="top" align="center">0.620</td>
<td valign="top" align="center">0.548</td>
<td valign="top" align="center">95.66</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">RetinaNet (<xref ref-type="bibr" rid="B31">31</xref>)</td>
<td valign="top" align="center">0.686</td>
<td valign="top" align="center">0.738</td>
<td valign="top" align="center">0.709</td>
<td valign="top" align="center">0.683</td>
<td valign="top" align="center">133.58</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">FPN</td>
<td valign="top" align="center">0.756</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.759</td>
<td valign="top" align="center">0.798</td>
<td valign="top" align="center">115.27</td>
</tr>
<tr>
<td valign="top" align="left">ResNet-50</td>
<td valign="top" align="center">Our proposed (S-ARFPN)</td>
<td valign="top" align="center">0.754</td>
<td valign="top" align="center">0.758</td>
<td valign="top" align="center">0.753</td>
<td valign="top" align="center">0.792</td>
<td valign="top" align="center">88.67</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Our proposed (CS-ARFPN)</td>
<td valign="top" align="center"><bold>0.811</bold></td>
<td valign="top" align="center"><bold>0.831</bold></td>
<td valign="top" align="center"><bold>0.819</bold></td>
<td valign="top" align="center"><bold>0.869</bold></td>
<td valign="top" align="center">88.68</td>
</tr>
<tr>
<td valign="top" align="left">ResNet-101</td>
<td valign="top" align="center">Our proposed (S-ARFPN)</td>
<td valign="top" align="center"><bold>0.800</bold></td>
<td valign="top" align="center"><bold>0.838</bold></td>
<td valign="top" align="center"><bold>0.816</bold></td>
<td valign="top" align="center"><bold>0.860</bold></td>
<td valign="top" align="center">133.44</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Our proposed (CS-ARFPN)</td>
<td valign="top" align="center">0.791</td>
<td valign="top" align="center">0.789</td>
<td valign="top" align="center">0.788</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">133.43</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Each model was trained under different parameter settings, and the best results of each model are shown. The classification effect of Cascade-FPN is not ideal when the backbone is Resnet101, so it is not displayed in the table. The best results are highlighted in bold</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>In addition, to verify the effectiveness of the proposed model, its performance on different datasets was compared, and the comparison results are listed in <xref ref-type="table" rid="T8">Table 8</xref>. In the IDB dataset, the AP of circular and elongated red blood cells is 99 and 94.4%, respectively. In the BCCD dataset, the AP of WBCs and platelets is 97.43 and 92.89%, respectively. Overall, the proposed model achieves an mAP of 91.12% on the BCCD dataset and 91.23% on the IDB dataset.</p>
<table-wrap position="float" id="T8">
<label>Table 8</label>
<caption><p>Comparison of the accuracy of the proposed method on different datasets.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Dataset</bold></th>
<th valign="top" align="left"><bold>Images</bold></th>
<th valign="top" align="left"><bold>Class</bold></th>
<th valign="top" align="left"><bold>AP (%)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">BCCD (<xref ref-type="bibr" rid="B41">41</xref>)</td>
<td valign="top" align="left">364</td>
<td valign="top" align="left">WBC</td>
<td valign="top" align="left">97.43</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">RBC</td>
<td valign="top" align="left">83.05</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">Platelets</td>
<td valign="top" align="left">92.89</td>
</tr>
<tr>
<td valign="top" align="left">IDB (<xref ref-type="bibr" rid="B42">42</xref>)</td>
<td valign="top" align="left">626</td>
<td valign="top" align="left">circular</td>
<td valign="top" align="left">99</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">elongated</td>
<td valign="top" align="left">94.4</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">other</td>
<td valign="top" align="left">80.3</td>
</tr>
</tbody>
</table>
</table-wrap></sec></sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>To better assist doctors in diagnosing the diseases related to RBCs, this paper proposed an attention feature pyramid network model that can directly classify dense red blood cell images. Since RBCs are small objects, this paper combined the attention mechanism with the feature pyramid network to improve the detection of small objects. The experimental results show that the two proposed attention residual units can capture more key feature information of RBCs, which helps to classify RBCs more accurately.</p>
<p>In the training process, different backbones, learning strategies, and anchor settings were used, and the optimal parameter settings of the two models were obtained after extensive training. The results show that different learning rates, anchor sizes, backbones, and attention modules led to performance differences. When ResNet-50 was used as the backbone, the CS-ARFPN model achieved the best performance under a learning rate of 0.001 and an anchor size of 32. When ResNet-101 was used as the backbone, the S-ARFPN model achieved the best performance under a learning rate of 0.002 and an anchor size of 4. The subsequent experiments and analyses were conducted based on these models and the above-mentioned optimal parameters.</p>
<p>In the experiment, the performance of FPN and of the S-ARFPN and CS-ARFPN models with two different attention residual units was compared to verify the effectiveness of the attention mechanism. The effectiveness of our method was demonstrated through evaluation metrics, feature maps, and PR curves.</p>
<p>It can be seen from <xref ref-type="table" rid="T3">Table 3</xref> that, compared with FPN, both S-ARFPN and CS-ARFPN achieved improved performance, which shows the effectiveness of the attention mechanism. Meanwhile, the CS-ARFPN model performed better than the S-ARFPN model, indicating that jointly attending to channel and spatial feature information is more effective. To make the function of the attention module more intuitive, the feature maps of FPN, S-ARFPN, and CS-ARFPN are shown in <xref ref-type="fig" rid="F7">Figure 7</xref>. It can be seen that as the number of layers increases, the extracted features become more abstract and harder to interpret. Compared with FPN, S-ARFPN and CS-ARFPN weaken the background features and highlight the components relevant to the category, so the two models can more accurately capture the shape and size of RBCs. In addition, CS-ARFPN focuses on detailed features better than S-ARFPN.</p>
<p><xref ref-type="fig" rid="F8">Figure 8</xref> shows the PR curves of the three models. The curves show that, for each type of RBC, different attention modules led to different classification performance. In general, the PR curve obtained by the CS-ARFPN model lies closest to the upper right, which indicates that its area under the curve is the largest and its performance is the best.</p>
<p>The performance of the Adam optimizer and the momentum optimizer was compared and analyzed. The results in <xref ref-type="table" rid="T5">Table 5</xref> show that the momentum optimizer performs better for RBC classification. During the training process, the momentum optimizer converged more slowly than the Adam optimizer, but it achieved better results and generalization performance in our work.</p>
<p>As shown in <xref ref-type="table" rid="T6">Table 6</xref>, the RoI align method achieved higher AP than the RoI pooling method. The rounding operation in the RoI pooling method has little impact on the classification of large objects, but it has a large impact on the classification of small objects such as RBCs. The RoI align method removes the rounding operation, so it can extract RoI features accurately and achieve better performance.</p>
<p><xref ref-type="table" rid="T4">Table 4</xref> shows the performance metrics of the CS-ARFPN model for classifying the 14 types of RBCs. Among them, the classification results of teardrop cells and schistocytes are less accurate than those of the other types. Although the attention mechanism focuses on category-related features, misclassifications still occur for certain types. One reason is that the features of different types vary in how difficult they are to extract, and some red blood cell types lack a clear-cut definition standard; in such cases, the model fails to learn the abnormal RBCs, resulting in misclassification. Another reason is that the number of samples of some red blood cell types is small. Although the RBC samples were expanded during the cropping process, the sample imbalance problem still existed in this study, which prevents the model from fully learning the characteristics of red blood cells with few samples.</p>
<p>The comparison between our method and other advanced methods is shown in <xref ref-type="table" rid="T7">Table 7</xref>. Our method achieves better performance than the other models. Meanwhile, our method was compared with other red blood cell classification methods, including those of Kihm et al. (<xref ref-type="bibr" rid="B26">26</xref>), Parab et al. (<xref ref-type="bibr" rid="B27">27</xref>), Lin et al. (<xref ref-type="bibr" rid="B28">28</xref>), and others. These methods all extract features from single red blood cell images and achieve good accuracy. In contrast, the classification of the entire red blood cell image can be regarded as the classification of dense small objects with weak feature expression and diverse target variations, so feature extraction is more difficult. Because of this, our method obtains a slightly lower accuracy than the comparison methods. To better compare with other methods and verify the effectiveness and generalization of the proposed method, our method was evaluated on two public datasets, namely the BCCD dataset and the IDB dataset. As shown in <xref ref-type="table" rid="T8">Table 8</xref>, the classification results of WBCs and platelets in the BCCD dataset are good. The reason for the lower accuracy of RBC classification is that the dataset is mainly provided for WBC classification, and most of the RBCs in the images are overlapping cells. In the IDB dataset, the classification results of the circular and elongated categories are good. The reason for the lower accuracy on the <italic>other</italic> category is that it aggregates many small sub-categories, which poses a challenge to classification. The results indicate that our method is effective and generalizable, and that the classification of the entire image can be further improved.</p>
<p>At present, the limited dataset and the imbalance among RBC types make it difficult to improve classification performance further. After consulting with clinicians, we will collect RBC images under a microscope so that clear images with distinct RBC characteristics can be obtained. In future work, we will collect more data, especially for rare RBC types. Meanwhile, we will investigate modifying the fully connected layer and the loss function of the model to reduce the impact of sample imbalance and further improve classification performance.</p></sec>
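<p>The loss-function direction mentioned above is commonly addressed with the focal loss of Lin et al. (<xref ref-type="bibr" rid="B37">37</xref>), which down-weights easy, well-classified examples so that training gradients concentrate on rare and hard classes. The following is a minimal numpy sketch of that idea, not the implementation used in this study; the toy probabilities and labels are hypothetical.</p>

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Multi-class focal loss: the (1 - p_t)^gamma factor suppresses the
    contribution of confident (easy) predictions.

    probs:  (N, C) predicted class probabilities (rows sum to 1)
    labels: (N,) integer class labels
    alpha:  optional (C,) per-class weights (e.g. inverse class frequency)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    p_t = probs[np.arange(len(labels)), labels]  # probability of the true class
    w = np.ones_like(p_t) if alpha is None else np.asarray(alpha)[labels]
    return float(np.mean(-w * (1.0 - p_t) ** gamma
                         * np.log(np.maximum(p_t, 1e-12))))

# With gamma = 0 the focal loss reduces to (weighted) cross-entropy.
probs = np.array([[0.9, 0.05, 0.05],   # easy, well-classified example
                  [0.4, 0.3, 0.3]])    # hard example
labels = np.array([0, 0])
ce = focal_loss(probs, labels, gamma=0.0)  # plain cross-entropy
fl = focal_loss(probs, labels, gamma=2.0)  # easy example's term shrinks 100x
```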
<sec sec-type="conclusions" id="s5">
<title>Conclusions</title>
<p>Abnormal red blood cells exhibit changes in shape, size, and hemoglobin content, which are closely related to the diagnosis of many diseases. This paper proposed a classification method that directly classifies 14 types of red blood cells on the entire red blood cell image. The feature pyramid network extracts multi-scale RBC features, the attention mechanism improves the learning and representation of those features, and an ROI Align layer with good performance unifies the size of the candidate regions. The method proposed in this study achieves accurate red blood cell classification and provides a clinically feasible, universal, and convenient approach to the diagnosis of red blood cell diseases.</p></sec>
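<p>The channel-attention idea summarized above can be sketched with a squeeze-and-excitation block (<xref ref-type="bibr" rid="B30">30</xref>): global-average-pool each channel, pass the pooled vector through a small bottleneck, and rescale the channels with sigmoid gates. The numpy sketch below uses random weights and hypothetical shapes; it illustrates the mechanism only and is not the network used in this study.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def se_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention.

    feat: (C, H, W) feature map
    w1:   (C/r, C) bottleneck reduction weights
    w2:   (C, C/r) bottleneck expansion weights
    """
    z = feat.mean(axis=(1, 2))               # squeeze: one scalar per channel
    h = np.maximum(w1 @ z, 0.0)              # excitation: ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ h)))  # sigmoid gate in (0, 1) per channel
    return feat * gates[:, None, None]       # channel-wise rescaling

# Hypothetical shapes: 8 channels, 4x4 spatial map, reduction ratio r = 2.
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = se_attention(feat, w1, w2)  # same shape as feat, channels rescaled
```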
<sec sec-type="data-availability" id="s6">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p></sec>
<sec id="s7">
<title>Author Contributions</title>
<p>WS and PH: conceptualization and methodology. JW, YS, and JZ: data production and curation. ZL: data curation, resources, and supervision. WS: validation, writing&#x02013;original draft preparation, investigation, and visualization. DLi: supervision, project administration, and funding acquisition. DLiu: writing&#x02013;review and editing and formal analysis. All authors contributed to the article and approved the submitted version.</p></sec>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>This work was funded by the National Natural Science Foundation of China (61971271), the Taishan Scholars Project of Shandong Province (Tsqn20161023), and the Primary Research and Development Plan of Shandong Province (No. 2018GGX101018, No. 2019QYTPY02).</p></sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p></sec>
</body>
<back>
<sec sec-type="supplementary-material" id="s10">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmed.2021.741407/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmed.2021.741407/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.zip" id="SM1" mimetype="application/zip" xmlns:xlink="http://www.w3.org/1999/xlink"/></sec>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chadha</surname> <given-names>GK</given-names></name> <name><surname>Srivastava</surname> <given-names>A</given-names></name> <name><surname>Singh</surname> <given-names>A</given-names></name> <name><surname>Gupta</surname> <given-names>R</given-names></name> <name><surname>Singla</surname> <given-names>D</given-names></name></person-group>. <article-title>An automated method for counting red blood cells using image processing</article-title>. <source>Proc Comput Sci.</source> (<year>2020</year>) <volume>167</volume>:<fpage>769</fpage>&#x02013;<lpage>78</lpage>. <pub-id pub-id-type="doi">10.1016/j.procs.2020.03.408</pub-id></citation>
</ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>M</given-names></name> <name><surname>Bourbakis</surname> <given-names>N</given-names></name></person-group> editors. <article-title>An overview of lossless digital image compression techniques</article-title>. In: <source>48th Midwest Symposium on Circuits and Systems, 2005</source>. <publisher-loc>Covington, KY</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2005</year>).</citation>
</ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mazalan</surname> <given-names>SM</given-names></name> <name><surname>Mahmood</surname> <given-names>NH</given-names></name> <name><surname>Razak</surname> <given-names>MAA</given-names></name></person-group> editors. <article-title>Automated red blood cells counting in peripheral blood smear image using circular Hough transform</article-title>. In: <source>2013 1st International Conference on Artificial Intelligence, Modelling and Simulation</source>. <publisher-loc>Kota Kinabalu</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2013</year>). <pub-id pub-id-type="doi">10.1109/AIMS.2013.59</pub-id></citation></ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jansen</surname> <given-names>V</given-names></name></person-group>. <article-title>Diagnosis of anemia&#x02014;a synoptic overview and practical approach</article-title>. <source>Transfus Apher Sci.</source> (<year>2019</year>) <volume>58</volume>:<fpage>375</fpage>&#x02013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1016/j.transci.2019.06.012</pub-id><pub-id pub-id-type="pmid">31326294</pub-id></citation></ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>George</surname> <given-names>LE</given-names></name></person-group>. <article-title>Comparative study using weka for red blood cells classification</article-title>. <source>World Acad Sci Eng Technol Int J Med Health Pharm Biomed Eng</source>. (<year>2015</year>) <volume>9</volume>:<fpage>19</fpage>&#x02013;<lpage>23</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.researchgate.net/publication/281208619_Comparative_Study_Using_Weka_for_Red_Blood_Cells_Classification">https://www.researchgate.net/publication/281208619_Comparative_Study_Using_Weka_for_Red_Blood_Cells_Classification</ext-link></citation>
</ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chy</surname> <given-names>TS</given-names></name> <name><surname>Rahaman</surname> <given-names>MA</given-names></name></person-group> editors. <article-title>Automatic sickle cell anemia detection using image processing technique</article-title>. In: <source>2018 International Conference on Advancement in Electrical and Electronic Engineering (ICAEEE)</source>. <publisher-loc>Gazipur</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1109/ICAEEE.2018.8642984</pub-id></citation></ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>Q</given-names></name> <name><surname>Yang</surname> <given-names>S</given-names></name> <name><surname>Sun</surname> <given-names>C</given-names></name> <name><surname>Yang</surname> <given-names>W</given-names></name></person-group> editors. <article-title>An automatic method for red blood cells detection in urine sediment micrograph</article-title>. In: <source>2018 33rd Youth Academic Annual Conference of Chinese Association of Automation (YAC)</source>. <publisher-loc>Nanjing</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1109/YAC.2018.8406379</pub-id></citation></ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Di Ruberto</surname> <given-names>C</given-names></name> <name><surname>Loddo</surname> <given-names>A</given-names></name> <name><surname>Putzu</surname> <given-names>L</given-names></name></person-group>. <article-title>Detection of red and white blood cells from microscopic blood images using a region proposal approach</article-title>. <source>Comput Biol Med.</source> (<year>2020</year>) <volume>116</volume>:<fpage>103530</fpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2019.103530</pub-id><pub-id pub-id-type="pmid">31778895</pub-id></citation></ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Venkatalakshmi</surname> <given-names>B</given-names></name> <name><surname>Thilagavathi</surname> <given-names>K</given-names></name></person-group> editors. <article-title>Automatic red blood cell counting using hough transform</article-title>. In: <source>2013 IEEE Conference on Information &#x00026; Communication Technologies.</source> <publisher-loc>Thuckalay</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2013</year>). <pub-id pub-id-type="doi">10.1109/CICT.2013.6558103</pub-id></citation></ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ara&#x000FA;jo</surname> <given-names>T</given-names></name> <name><surname>Aresta</surname> <given-names>G</given-names></name> <name><surname>Castro</surname> <given-names>E</given-names></name> <name><surname>Rouco</surname> <given-names>J</given-names></name> <name><surname>Aguiar</surname> <given-names>P</given-names></name> <name><surname>Eloy</surname> <given-names>C</given-names></name> <etal/></person-group>. <article-title>Classification of breast cancer histology images using convolutional neural networks</article-title>. <source>PLoS ONE.</source> (<year>2017</year>) <volume>12</volume>:<fpage>e0177544</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0177544</pub-id><pub-id pub-id-type="pmid">28570557</pub-id></citation></ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hamad</surname> <given-names>A</given-names></name> <name><surname>Ersoy</surname> <given-names>I</given-names></name> <name><surname>Bunyak</surname> <given-names>F</given-names></name></person-group> editors. <article-title>Improving nuclei classification performance in H&#x00026;E stained tissue images using fully convolutional regression network and convolutional neural network</article-title>. In: <source>2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR)</source>. <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1109/AIPR.2018.8707397</pub-id></citation></ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Albehadili</surname> <given-names>H</given-names></name> <name><surname>Alzubaidi</surname> <given-names>L</given-names></name> <name><surname>Rashed</surname> <given-names>J</given-names></name> <name><surname>Al-Imam</surname> <given-names>M</given-names></name> <name><surname>Alwzwazy</surname> <given-names>HA</given-names></name></person-group> editors. <article-title>Fast and accurate real time pedestrian detection using convolutional neural network</article-title>. In: <source>The 1 st International Conference on Information Technology (ICoIT&#x00027;17)</source>. <publisher-loc>Irbil</publisher-loc> (<year>2017</year>). <pub-id pub-id-type="doi">10.25212/ICoIT17.029</pub-id></citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zeiler</surname> <given-names>MD</given-names></name> <name><surname>Fergus</surname> <given-names>R</given-names></name></person-group> editors. <article-title>Visualizing and understanding convolutional networks</article-title>. In: <source>European Conference on Computer Vision</source>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2014</year>). <pub-id pub-id-type="doi">10.1007/978-3-319-10590-1_53</pub-id></citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yi</surname> <given-names>F</given-names></name> <name><surname>Moon</surname> <given-names>I</given-names></name> <name><surname>Javidi</surname> <given-names>B</given-names></name></person-group>. <article-title>Cell morphology-based classification of red blood cells using holographic imaging informatics</article-title>. <source>Biomed Opt Expr.</source> (<year>2016</year>) <volume>7</volume>:<fpage>2385</fpage>&#x02013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1364/BOE.7.002385</pub-id><pub-id pub-id-type="pmid">27375953</pub-id></citation></ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Maji</surname> <given-names>P</given-names></name> <name><surname>Mandal</surname> <given-names>A</given-names></name> <name><surname>Ganguly</surname> <given-names>M</given-names></name> <name><surname>Saha</surname> <given-names>S</given-names></name></person-group> editors. <article-title>An automated method for counting and characterizing red blood cells using mathematical morphology</article-title>. In: <source>2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR)</source>. <publisher-loc>Kolkata</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2015</year>). <pub-id pub-id-type="doi">10.1109/ICAPR.2015.7050674</pub-id></citation></ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mahmood</surname> <given-names>NH</given-names></name> <name><surname>Mansor</surname> <given-names>MA</given-names></name></person-group>. <article-title>Red blood cells estimation using hough transform technique</article-title>. <source>Signal Image Process.</source> (<year>2012</year>) <volume>3</volume>:<fpage>53</fpage>. <pub-id pub-id-type="doi">10.5121/sipij.2012.3204</pub-id></citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Savkare</surname> <given-names>S</given-names></name> <name><surname>Narote</surname> <given-names>S</given-names></name></person-group> editors. <article-title>Blood cell segmentation from microscopic blood images</article-title>. In: <source>2015 International Conference on Information Processing (ICIP)</source>. <publisher-loc>Pune</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2015</year>). <pub-id pub-id-type="doi">10.1109/INFOP.2015.7489435</pub-id></citation></ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lotfi</surname> <given-names>M</given-names></name> <name><surname>Nazari</surname> <given-names>B</given-names></name> <name><surname>Sadri</surname> <given-names>S</given-names></name> <name><surname>Sichani</surname> <given-names>NK</given-names></name></person-group> editors. <article-title>The detection of dacrocyte, schistocyte and elliptocyte cells in iron deficiency anemia</article-title>. In: <source>2015 2nd International Conference on Pattern Recognition and Image Analysis (IPRIA).</source> <publisher-loc>Rasht</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2015</year>). <pub-id pub-id-type="doi">10.1109/PRIA.2015.7161628</pub-id></citation></ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Dalvi</surname> <given-names>PT</given-names></name> <name><surname>Vernekar</surname> <given-names>N</given-names></name></person-group> editors. <article-title>Computer aided detection of abnormal red blood cells</article-title>. In: <source>2016 IEEE International Conference on Recent Trends in Electronics, Information &#x00026; Communication Technology (RTEICT)</source>. <publisher-loc>Bangalore</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2016</year>). <pub-id pub-id-type="doi">10.1109/RTEICT.2016.7808132</pub-id></citation></ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sen</surname> <given-names>B</given-names></name> <name><surname>Ganesh</surname> <given-names>A</given-names></name> <name><surname>Bhan</surname> <given-names>A</given-names></name> <name><surname>Dixit</surname> <given-names>S</given-names></name> <name><surname>Goyal</surname> <given-names>A</given-names></name></person-group>. <article-title>Machine learning based diagnosis and classification of sickle cell anemia in human RBC</article-title>. In: <source>2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV)</source>. <publisher-loc>Tirunelveli</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2021</year>). <pub-id pub-id-type="doi">10.1109/ICICV50876.2021.9388610</pub-id></citation></ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>H</given-names></name> <name><surname>Chen</surname> <given-names>Y-PP</given-names></name></person-group>. <article-title>Cell morphology based classification for red cells in blood smear images</article-title>. <source>Pattern Recognit Lett.</source> (<year>2014</year>) <volume>49</volume>:<fpage>155</fpage>&#x02013;<lpage>61</lpage>. <pub-id pub-id-type="doi">10.1016/j.patrec.2014.06.010</pub-id></citation></ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Jambhekar</surname> <given-names>ND</given-names></name></person-group>. <article-title>Red blood cells classification using image processing</article-title>. <source>Sci Res Rep.</source> (<year>2011</year>) <volume>1</volume>:<fpage>151</fpage>&#x02013;<lpage>4</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.researchgate.net/publication/285841327_Red_blood_cells_classification_using_image_processing">https://www.researchgate.net/publication/285841327_Red_blood_cells_classification_using_image_processing</ext-link></citation>
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elsalamony</surname> <given-names>HA</given-names></name></person-group>. <article-title>Healthy and unhealthy red blood cell detection in human blood smears using neural networks</article-title>. <source>Micron.</source> (<year>2016</year>) <volume>83</volume>:<fpage>32</fpage>&#x02013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1016/j.micron.2016.01.008</pub-id><pub-id pub-id-type="pmid">26867209</pub-id></citation></ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>M</given-names></name> <name><surname>Papageorgiou</surname> <given-names>DP</given-names></name> <name><surname>Abidi</surname> <given-names>SZ</given-names></name> <name><surname>Dao</surname> <given-names>M</given-names></name> <name><surname>Zhao</surname> <given-names>H</given-names></name> <name><surname>Karniadakis</surname> <given-names>GE</given-names></name></person-group>. <article-title>A deep convolutional neural network for classification of red blood cells in sickle cell anemia</article-title>. <source>PLoS Comput Biol.</source> (<year>2017</year>) <volume>13</volume>:<fpage>e1005746</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1005746</pub-id><pub-id pub-id-type="pmid">29049291</pub-id></citation></ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Alzubaidi</surname> <given-names>L</given-names></name> <name><surname>Al-Shamma</surname> <given-names>O</given-names></name> <name><surname>Fadhel</surname> <given-names>MA</given-names></name> <name><surname>Farhan</surname> <given-names>L</given-names></name> <name><surname>Zhang</surname> <given-names>J</given-names></name></person-group> editors. <article-title>Classification of red blood cells in sickle cell anemia using deep convolutional neural network</article-title>. In: <source>International Conference on Intelligent Systems Design and Applications</source>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1007/978-3-030-16657-1_51</pub-id></citation></ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kihm</surname> <given-names>A</given-names></name> <name><surname>Kaestner</surname> <given-names>L</given-names></name> <name><surname>Wagner</surname> <given-names>C</given-names></name> <name><surname>Quint</surname> <given-names>S</given-names></name></person-group>. <article-title>Classification of red blood cell shapes in flow using outlier tolerant machine learning</article-title>. <source>PLoS Comput Biol.</source> (<year>2018</year>) <volume>14</volume>:<fpage>e1006278</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1006278</pub-id><pub-id pub-id-type="pmid">29906283</pub-id></citation></ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parab</surname> <given-names>MA</given-names></name> <name><surname>Mehendale</surname> <given-names>ND</given-names></name></person-group>. <article-title>Red blood cell classification using image processing CNN</article-title>. <source>SN Comput Sci</source>. (<year>2021</year>) <volume>2</volume>:<fpage>70</fpage>. <pub-id pub-id-type="doi">10.1007/s42979-021-00458-2</pub-id></citation>
</ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>YH</given-names></name> <name><surname>Liao</surname> <given-names>YK</given-names></name> <name><surname>Sung</surname> <given-names>KB</given-names></name></person-group>. <article-title>Automatic detection and characterization of quantitative phase images of thalassemic red blood cells using a mask region-based convolutional neural network</article-title>. <source>J Biomed Optics.</source> (<year>2020</year>) <volume>25</volume>:<fpage>116502</fpage>. <pub-id pub-id-type="doi">10.1117/1.JBO.25.11.116502</pub-id><pub-id pub-id-type="pmid">33188571</pub-id></citation></ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>T-Y</given-names></name> <name><surname>Doll&#x000E1;r</surname> <given-names>P</given-names></name> <name><surname>Girshick</surname> <given-names>R</given-names></name> <name><surname>He</surname> <given-names>K</given-names></name> <name><surname>Hariharan</surname> <given-names>B</given-names></name> <name><surname>Belongie</surname> <given-names>S</given-names></name></person-group> editors. <article-title>Feature pyramid networks for object detection</article-title>. In: <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>. <publisher-loc>Honolulu, HI</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2017</year>). <pub-id pub-id-type="doi">10.1109/CVPR.2017.106</pub-id></citation></ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>J</given-names></name> <name><surname>Shen</surname> <given-names>L</given-names></name> <name><surname>Sun</surname> <given-names>G</given-names></name></person-group> editors. <article-title>Squeeze-and-excitation networks</article-title>. In: <source>Proceedings of the IEEE conference on Computer Vision and Pattern Recognition</source>. <publisher-loc>Salt Lake City, UT</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1109/CVPR.2018.00745</pub-id></citation></ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Woo</surname> <given-names>S</given-names></name> <name><surname>Park</surname> <given-names>J</given-names></name> <name><surname>Lee</surname> <given-names>J-Y</given-names></name> <name><surname>Kweon</surname> <given-names>IS</given-names></name></person-group> editors. <article-title>CBAM: Convolutional block attention module</article-title>. In: <source>Proceedings of the European Conference on Computer Vision (ECCV)</source>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1007/978-3-030-01234-2_1</pub-id></citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Vaswani</surname> <given-names>A</given-names></name> <name><surname>Shazeer</surname> <given-names>N</given-names></name> <name><surname>Parmar</surname> <given-names>N</given-names></name> <name><surname>Uszkoreit</surname> <given-names>J</given-names></name> <name><surname>Jones</surname> <given-names>L</given-names></name> <name><surname>Gomez</surname> <given-names>AN</given-names></name> <etal/></person-group>. <article-title>Attention is all you need</article-title>. <source>arXiv</source>. (<year>2017</year>) arXiv:170603762. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1706.03762">https://arxiv.org/abs/1706.03762</ext-link></citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>He</surname> <given-names>K</given-names></name> <name><surname>Gkioxari</surname> <given-names>G</given-names></name> <name><surname>Doll&#x000E1;r</surname> <given-names>P</given-names></name> <name><surname>Girshick</surname> <given-names>R</given-names></name></person-group> editors. <article-title>Mask R-CNN</article-title>. In: <source>Proceedings of the IEEE International Conference on Computer Vision</source>. <publisher-loc>Venice</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2017</year>). <pub-id pub-id-type="doi">10.1109/ICCV.2017.322</pub-id></citation></ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ren</surname> <given-names>S</given-names></name> <name><surname>He</surname> <given-names>K</given-names></name> <name><surname>Girshick</surname> <given-names>R</given-names></name> <name><surname>Sun</surname> <given-names>J</given-names></name></person-group>. <article-title>Faster R-CNN: towards real-time object detection with region proposal networks</article-title>. <source>arXiv</source>. (<year>2015</year>) arXiv:150601497. <pub-id pub-id-type="doi">10.1109/TPAMI.2016.2577031</pub-id><pub-id pub-id-type="pmid">27295650</pub-id></citation></ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Girshick</surname> <given-names>R</given-names></name></person-group> editor. <article-title>Fast R-CNN</article-title>. In: <source>Proceedings of the IEEE International Conference on Computer Vision</source>. <publisher-loc>Santiago</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2015</year>). <pub-id pub-id-type="doi">10.1109/ICCV.2015.169</pub-id></citation></ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Abadi</surname> <given-names>M</given-names></name> <name><surname>Barham</surname> <given-names>P</given-names></name> <name><surname>Chen</surname> <given-names>J</given-names></name> <name><surname>Chen</surname> <given-names>Z</given-names></name> <name><surname>Davis</surname> <given-names>A</given-names></name> <name><surname>Dean</surname> <given-names>J</given-names></name> <etal/></person-group> editors. <article-title>Tensorflow: a system for large-scale machine learning</article-title>. In: <source>12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)</source>. <publisher-loc>Savannah, GA</publisher-loc> (<year>2016</year>).</citation>
</ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>T-Y</given-names></name> <name><surname>Goyal</surname> <given-names>P</given-names></name> <name><surname>Girshick</surname> <given-names>R</given-names></name> <name><surname>He</surname> <given-names>K</given-names></name> <name><surname>Doll&#x000E1;r</surname> <given-names>P</given-names></name></person-group> editors. <article-title>Focal loss for dense object detection</article-title>. In: <source>Proceedings of the IEEE International Conference on Computer Vision</source>. <publisher-loc>Venice</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2017</year>). <pub-id pub-id-type="doi">10.1109/ICCV.2017.324</pub-id><pub-id pub-id-type="pmid">30040631</pub-id></citation></ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cai</surname> <given-names>Z</given-names></name> <name><surname>Vasconcelos</surname> <given-names>N</given-names></name></person-group> editors. <article-title>Cascade R-CNN: delving into high quality object detection</article-title>. In: <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>. <publisher-loc>Salt Lake City, UT</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2018</year>). <pub-id pub-id-type="doi">10.1109/CVPR.2018.00644</pub-id></citation></ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Dai</surname> <given-names>J</given-names></name> <name><surname>Li</surname> <given-names>Y</given-names></name> <name><surname>He</surname> <given-names>K</given-names></name> <name><surname>Sun</surname> <given-names>J</given-names></name></person-group>. <article-title>R-fcn: object detection via region-based fully convolutional networks</article-title>. <source>arXiv</source>. (<year>2016</year>) arXiv:160506409. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1605.06409">https://arxiv.org/abs/1605.06409</ext-link></citation>
</ref>
<ref id="B40">
<label>40.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y</given-names></name> <name><surname>Chen</surname> <given-names>Y</given-names></name> <name><surname>Yuan</surname> <given-names>L</given-names></name> <name><surname>Liu</surname> <given-names>Z</given-names></name> <name><surname>Wang</surname> <given-names>L</given-names></name> <name><surname>Li</surname> <given-names>H</given-names></name> <etal/></person-group> editors. <article-title>Rethinking classification and localization for object detection</article-title>. In: <source>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>. <publisher-loc>Seattle, WA</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2020</year>). <pub-id pub-id-type="doi">10.1109/CVPR42600.2020.01020</pub-id></citation></ref>
<ref id="B41">
<label>41.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Singh</surname> <given-names>I</given-names></name> <name><surname>Pal Singh</surname> <given-names>N</given-names></name> <name><surname>Singh</surname> <given-names>H</given-names></name> <name><surname>Bawankar</surname> <given-names>S</given-names></name> <name><surname>Ngom</surname> <given-names>A</given-names></name></person-group>. <article-title>Blood cell types classification using CNN</article-title>. In: <source>International Work-Conference on Bioinformatics and Biomedical Engineering</source>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2020</year>). p. <fpage>727</fpage>&#x02013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-45385-5_65</pub-id></citation>
</ref>
<ref id="B42">
<label>42.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ferreira</surname> <given-names>RL</given-names></name> <name><surname>Coelho Naldi</surname> <given-names>M</given-names></name> <name><surname>Fernando Mari</surname> <given-names>J</given-names></name></person-group>. <article-title>Morphological analysis and classification of erythrocytes in microscopy images</article-title>. In: <source>Proceedings of the 2016 Workshop de Vis&#x000E3;o Computacional</source>. <publisher-loc>Campo Grande</publisher-loc> (<year>2016</year>). p. <fpage>9</fpage>&#x02013;<lpage>11</lpage>.</citation>
</ref>
</ref-list>
</back>
</article>