<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="brief-report" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Radiol.</journal-id>
<journal-title>Frontiers in Radiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Radiol.</abbrev-journal-title>
<issn pub-type="epub">2673-8740</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fradi.2023.1274273</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Radiology</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>RoMIA: a framework for creating Robust Medical Imaging AI models for chest radiographs</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes"><name><surname>Anand</surname><given-names>Aditi</given-names></name>
<xref ref-type="corresp" rid="cor1">&#x002A;</xref><uri xlink:href="https://loop.frontiersin.org/people/1064038/overview"/><role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/><role content-type="https://credit.niso.org/contributor-roles/data-curation/"/><role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/><role content-type="https://credit.niso.org/contributor-roles/investigation/"/><role content-type="https://credit.niso.org/contributor-roles/methodology/"/><role content-type="https://credit.niso.org/contributor-roles/software/"/><role content-type="https://credit.niso.org/contributor-roles/validation/"/><role content-type="https://credit.niso.org/contributor-roles/visualization/"/><role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/><role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/></contrib>
<contrib contrib-type="author"><name><surname>Krithivasan</surname><given-names>Sarada</given-names></name><uri xlink:href="https://loop.frontiersin.org/people/1303593/overview" /><role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/><role content-type="https://credit.niso.org/contributor-roles/software/"/><role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/></contrib>
<contrib contrib-type="author"><name><surname>Roy</surname><given-names>Kaushik</given-names></name><uri xlink:href="https://loop.frontiersin.org/people/502975/overview" /><role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/><role content-type="https://credit.niso.org/contributor-roles/project-administration/"/><role content-type="https://credit.niso.org/contributor-roles/resources/"/><role content-type="https://credit.niso.org/contributor-roles/supervision/"/><role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/></contrib>
</contrib-group>
<aff><institution>School of Electrical and Computer Engineering, Purdue University</institution>, <addr-line>West Lafayette, IN</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p><bold>Edited by:</bold> Ulas Bagci, Northwestern University, United States</p></fn>
<fn fn-type="edited-by"><p><bold>Reviewed by:</bold> Bo Zhou, Yale University, United States</p>
<p>Koushik Biswas, Northwestern University, United States</p></fn>
<corresp id="cor1"><label>&#x002A;</label><bold>Correspondence:</bold> Aditi Anand <email>anand86@purdue.edu</email></corresp>
</author-notes>
<pub-date pub-type="epub"><day>08</day><month>01</month><year>2024</year></pub-date>
<pub-date pub-type="collection"><year>2023</year></pub-date>
<volume>3</volume><elocation-id>1274273</elocation-id>
<history>
<date date-type="received"><day>10</day><month>08</month><year>2023</year></date>
<date date-type="accepted"><day>18</day><month>12</month><year>2023</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2024 Anand, Krithivasan and Roy.</copyright-statement>
<copyright-year>2024</copyright-year><copyright-holder>Anand, Krithivasan and Roy</copyright-holder><license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License (CC BY)</ext-link>. The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Artificial Intelligence (AI) methods, particularly Deep Neural Networks (DNNs), have shown great promise in a range of medical imaging tasks. However, the susceptibility of DNNs to producing erroneous outputs under the presence of input noise and variations is of great concern and one of the largest challenges to their adoption in medical settings. Towards addressing this challenge, we explore the robustness of DNNs trained for chest radiograph classification under a range of perturbations reflective of clinical settings. We propose RoMIA, a framework for the creation of <underline>Ro</underline>bust <underline>M</underline>edical <underline>I</underline>maging <underline>A</underline>I models. RoMIA adds three key steps to the model training and deployment flow: (i) Noise-added training, wherein a part of the training data is synthetically transformed to represent common noise sources, (ii) Fine-tuning with input mixing, in which the model is refined with inputs formed by mixing data from the original training set with a small number of images from a different source, and (iii) DCT-based denoising, which removes a fraction of high-frequency components of each image before applying the model to classify it. We applied RoMIA to create six different robust models for classifying chest radiographs using the CheXpert dataset. We evaluated the models on the CheXphoto dataset, which consists of naturally and synthetically perturbed images intended to evaluate robustness. Models produced by RoMIA show 3&#x0025;&#x2013;5&#x0025; improvement in robust accuracy, which corresponds to an average reduction of 22.6&#x0025; in misclassifications. These results suggest that RoMIA can be a useful step towards enabling the adoption of AI models in medical imaging applications.</p>
</abstract>
<kwd-group>
<kwd>medical imaging</kwd>
<kwd>artificial intelligence</kwd>
<kwd>artificial neural networks</kwd>
<kwd>robustness</kwd>
<kwd>radiology</kwd>
<kwd>chest radiographs</kwd>
</kwd-group>
<contract-sponsor id="cn001">Center for the Co-Design of Cognitive Systems (CoCoSys), a JUMP2.0 center sponsored by the Semiconductor Research Corporation (SRC) and DARPA</contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="0"/><equation-count count="0"/><ref-count count="45"/><page-count count="0"/><word-count count="0"/></counts><custom-meta-wrap><custom-meta><meta-name>section-at-acceptance</meta-name><meta-value>Artificial Intelligence in Radiology</meta-value></custom-meta></custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro"><label>1</label><title>Introduction</title>
<p>Artificial Intelligence is transforming the field of medicine in many ways, with applications ranging from drug discovery to genomics and, most prominently, radiology. Since AI has been particularly successful in computer vision, one of its most promising applications is medical imaging. Deep neural networks (DNNs), which are composed of several layers of artificial neurons, have demonstrated great success in computer vision tasks. These networks, particularly convolutional neural networks (CNNs), have been explored for various medical imaging tasks, including diagnosis of diabetic retinopathy (<xref ref-type="bibr" rid="B1">1</xref>, <xref ref-type="bibr" rid="B2">2</xref>), breast cancer and malignant lymph nodes from histopathological images (<xref ref-type="bibr" rid="B3">3</xref>), and pulmonary and cardiological conditions from chest radiographs (<xref ref-type="bibr" rid="B4">4</xref>). The recent wave of promising research has led to significant interest in deploying these technologies in clinical settings. However, many hurdles must be cleared before this potential can be realized.</p>
<p>Medical imaging models are first trained on a training dataset, and then tested in field trials before being deployed. One major challenge in this process arises from the differences between the data on which the models are trained and the data that they encounter after deployment (<xref ref-type="bibr" rid="B5">5</xref>). AI models are known to be very brittle to input noise and variations (<xref ref-type="bibr" rid="B6">6</xref>), even ones that are imperceptible to humans (<xref ref-type="bibr" rid="B7">7</xref>). There are several scenarios where medical imaging models encounter noise or variations that can impact the accuracy of their predictions (<xref ref-type="bibr" rid="B8">8</xref>). One popular use of medical imaging models is for telemedicine in areas that have a lack of trained physicians, where smartphones are used to take photos of scans, which are then sent through messaging apps, introducing distortion and compression artifacts (<xref ref-type="bibr" rid="B9">9</xref>). Additionally, using imaging equipment made by different manufacturers or using different settings on the imaging equipment can create variations in the resulting images (<xref ref-type="bibr" rid="B10">10</xref>, <xref ref-type="bibr" rid="B11">11</xref>). AI models have also demonstrated significant performance variation across different patient populations (<xref ref-type="bibr" rid="B12">12</xref>). Any of these factors can result in a model making inaccurate predictions (<xref ref-type="bibr" rid="B8">8</xref>).</p>
<p>Recent work has demonstrated that variations and noise in the input can significantly reduce the accuracy of medical imaging AI models (<xref ref-type="bibr" rid="B8">8</xref>, <xref ref-type="bibr" rid="B10">10</xref>). Although there has been a large body of work in the AI community on improving the robustness of these models under noise and adversarial perturbations, very few efforts have focused on the medical domain. Medical imaging poses unique challenges that make it essential to address robustness specifically in this context (<xref ref-type="bibr" rid="B8">8</xref>). As described above, input noise and variations in this domain arise primarily from equipment differences, telemedicine, and patient population; sources of variation seen in other settings (background objects, lighting, occlusion, etc.) are less relevant in medical settings (<xref ref-type="bibr" rid="B9">9</xref>, <xref ref-type="bibr" rid="B10">10</xref>, <xref ref-type="bibr" rid="B12">12</xref>). Furthermore, due to regulations and higher safeguards applied to medical data, adversarial attacks may be much less of a concern in this setting relative to other settings.</p>
<p>In this paper, we propose RoMIA, a framework to create more robust medical imaging models. RoMIA consists of three main steps: Noise-added Training, Fine-tuning with Input Mixing, and DCT-based denoising. In Noise-added Training, a fraction of the images in the training dataset are transformed by adding noise in order to make the trained model more robust (<xref ref-type="bibr" rid="B13">13</xref>). Specifically, we find that transformations such as glare matte, moire, and tilt result in models that perform best on photographs of radiographs. In Fine-tuning with Input Mixing, we fine-tune the trained model using a small amount of data from a different source in order to improve the model&#x0027;s robustness (<xref ref-type="bibr" rid="B14">14</xref>). Since only limited data from additional sources are likely to be available in practice, we use input mixing to avoid overfitting during this stage. Finally, in DCT-based denoising, we remove higher-frequency components in the input images before they are passed to the model for classification (<xref ref-type="bibr" rid="B15">15</xref>). This is motivated by our observation that perturbations encountered in medical imaging settings largely impact the high-frequency components of the images that are not essential for classification.</p>
<p>We evaluate the RoMIA framework using six popular CNNs trained on the CheXpert dataset, which contains 224,316 chest radiographs of 65,240 patients from Stanford Hospital (<xref ref-type="bibr" rid="B4">4</xref>). The created models diagnose Atelectasis, Cardiomegaly, Consolidation, Edema, and Pleural Effusion. For Fine-tuning with Input Mixing, we used 500 images from the ChestX-ray8 dataset from NIH (<xref ref-type="bibr" rid="B16">16</xref>). We evaluated the models using the CheXphoto dataset, which consists of 10,507 smartphone photos of chest radiographs from 3,000 patients (<xref ref-type="bibr" rid="B9">9</xref>). Our experiments indicate that a baseline model trained on the CheXpert dataset has a drop of 10&#x0025;&#x2013;14&#x0025; in the Area Under the Receiver Operating Characteristic curve (AUROC) when evaluated on the CheXphoto dataset. RoMIA creates models that improve AUROC by up to 5&#x0025; and reduce misclassifications by an average of 22.6&#x0025;, underscoring its potential to create more robust medical imaging models.</p>
<sec id="s1a"><label>1.1</label><title>Related work</title>
<p>Several research efforts have explored the use of CNNs for medical imaging. Building on these efforts, systems that support diagnosis are in various stages of deployment. These include systems for processing retinal scans (<xref ref-type="bibr" rid="B1">1</xref>, <xref ref-type="bibr" rid="B2">2</xref>, <xref ref-type="bibr" rid="B17">17</xref>), breast cancer detection (<xref ref-type="bibr" rid="B18">18</xref>), and skin cancer detection (<xref ref-type="bibr" rid="B19">19</xref>), among others. We focus our discussion on related efforts along two directions: those that explore CNN-based classification of chest radiographs and those that explore the robustness of medical imaging CNNs.</p>
<sec id="s1a1"><label>1.1.1</label><title>Prior work on chest radiograph classification</title>
<p>Chest radiographs are among the most commonly requested radiological examinations since they are highly effective in detecting cardiothoracic and pulmonary abnormalities. Automation of abnormality detection in chest radiographs can help address the high workload of radiologists in large urban settings on the one hand, and the lack of experienced radiologists in less developed rural settings on the other. This need was only exacerbated during the COVID-19 pandemic when healthcare systems were overwhelmed and chest radiographs were commonly used as a first-line triage method. Motivated by this challenge, several efforts have developed DNN models for processing of chest radiographs (<xref ref-type="bibr" rid="B20">20</xref>&#x2013;<xref ref-type="bibr" rid="B28">28</xref>). These works have proposed key ideas including the use of pre-training with natural images (<xref ref-type="bibr" rid="B20">20</xref>), multi-modal fusion of radiographs with clinical data (<xref ref-type="bibr" rid="B22">22</xref>), the use of transformer networks for such multi-modal fusion (<xref ref-type="bibr" rid="B24">24</xref>), manual design (<xref ref-type="bibr" rid="B27">27</xref>) or automated neural architecture search (<xref ref-type="bibr" rid="B25">25</xref>) to find a suitable DNN architecture for chest radiograph classification, bio-inspired training algorithms for small training sets (<xref ref-type="bibr" rid="B26">26</xref>) and the use of a focal loss function to address the significant class imbalance that is often present in chest radiograph datasets (<xref ref-type="bibr" rid="B28">28</xref>). These efforts have demonstrated high accuracies in various chest radiograph classification tasks, promoting interest in their use in clinical practice. 
The development of DNN models for chest radiographs has been supported by the curation of public datasets (<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B9">9</xref>, <xref ref-type="bibr" rid="B16">16</xref>, <xref ref-type="bibr" rid="B29">29</xref>).</p>
</sec>
<sec id="s1a2"><label>1.1.2</label><title>Prior work on robustness of medical imaging AI models</title>
<p>It is well known that input variations, noise and adversarial perturbations can have a large negative impact on the accuracy of DNNs. For example, it has been shown that chest radiographs with added natural noise as well as the use of smartphone-captured photographs of radiographs caused significant degradation in accuracy (<xref ref-type="bibr" rid="B9">9</xref>). Another study found that DNN models trained on data from one hospital demonstrate considerably lower performance on data from a different hospital (<xref ref-type="bibr" rid="B10">10</xref>). Adversarial perturbations have also been shown to have a drastic impact on the accuracy of DNNs used in medical imaging (<xref ref-type="bibr" rid="B30">30</xref>, <xref ref-type="bibr" rid="B31">31</xref>). These concerns, while broadly true of DNNs, are especially important for life-critical applications such as medical imaging. As a result, previous works have proposed and evaluated techniques to improve the robustness of medical imaging DNNs. The combination of large-scale supervised transfer learning with self-supervised learning was shown to improve the out-of-distribution generalization performance of medical imaging DNNs (<xref ref-type="bibr" rid="B32">32</xref>). The addition of Global Attention Noise during training (<xref ref-type="bibr" rid="B33">33</xref>), as well as adversarial training, where adversarial inputs are included in the training process (<xref ref-type="bibr" rid="B31">31</xref>), have been shown to improve the accuracy of medical imaging DNNs against adversarial attacks. Multi-task learning was used to address the specific challenges of prediction instability and explainability in the classification of smartphone photos of chest radiographs (<xref ref-type="bibr" rid="B21">21</xref>).</p>
<p>Our work makes the following contributions that go above and beyond the previous efforts. While noise-added training is a well-known technique to improve the robustness of neural networks (<xref ref-type="bibr" rid="B34">34</xref>) and has recently been applied to medical imaging specifically for adversarial robustness (<xref ref-type="bibr" rid="B31">31</xref>, <xref ref-type="bibr" rid="B33">33</xref>), our work applies it to achieve robustness to natural sources of noise. Input mixing and DCT-based denoising have not been previously applied to the medical imaging domain to the best of our knowledge. Further, RoMIA is the first framework to combine these three techniques to improve robustness and to incorporate robustness improvement into all three key steps of the medical imaging AI pipeline (training, fine-tuning, and inference). Our results show that the combined use of all three techniques leads to substantially better accuracy than any of the techniques alone.</p>
</sec>
</sec>
</sec>
<sec id="s2" sec-type="methods"><label>2</label><title>Materials and methods</title>
<p>In this section, we first describe the commonly used process for training medical imaging DNNs, and the challenges faced by such models due to input noise and variations. We then present the RoMIA framework to increase model robustness and the methodology used to evaluate it.</p>
<sec id="s2a"><label>2.1</label><title>Pitfalls in conventional training methods</title>
<p>Typically, the creation of a medical imaging model starts with the collection of a large training dataset with training labels provided by physicians. In some cases, this may require years of data collection. For example, the CheXpert dataset of chest radiographs represents data collected over a period of 15 years (<xref ref-type="bibr" rid="B4">4</xref>). Next, a DNN is either trained from scratch or a model trained on a different computer vision dataset such as ImageNet (<xref ref-type="bibr" rid="B35">35</xref>) is transferred using the training data. The model may be evaluated on held-out or entirely different datasets, and then deployed. When deployed, the model may be applied to data that contains noise or variations. Frequently, this leads to significant degradation in model performance (<xref ref-type="bibr" rid="B8">8</xref>).</p>
</sec>
<sec id="s2b"><label>2.2</label><title>RoMIA framework</title>
<p><xref ref-type="fig" rid="F1">Figure&#x00A0;1</xref> describes the RoMIA framework to train more robust medical imaging models. We modify the standard model creation flow by adding three main components: Noise-added Training, Fine-tuning with Input Mixing, and DCT-based denoising.</p>
<fig id="F1" position="float"><label>Figure 1</label>
<caption><p>Overview of the RoMIA framework to create robust medical imaging AI models. The chest radiographs shown are from the CheXpert dataset (<xref ref-type="bibr" rid="B4">4</xref>).</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="fradi-03-1274273-g001.tif"/>
</fig>
<sec id="s2b1"><label>2.2.1</label><title>Noise-added training</title>
<p>In <italic>Noise-added Training</italic>, we introduce synthetic perturbations (noise) into the training data that mimic those observed in medical settings. We evaluated the following transformations:
<list list-type="simple">
<list-item><label>&#x2022;</label><p><italic>Glare matte:</italic> A filter designed to emulate the effect of glare observed when displaying the image on a matte screen.</p></list-item>
<list-item><label>&#x2022;</label><p><italic>Moire:</italic> A filter designed to simulate the Moire effect, which produces repetitive interference patterns such as lines or stripes on the image due to limited resolution.</p></list-item>
<list-item><label>&#x2022;</label><p><italic>Tilt:</italic> This transformation simulates a change in perspective that could result when a photograph of a medical image is taken using a device such as a smartphone (<xref ref-type="bibr" rid="B11">11</xref>).</p></list-item>
<list-item><label>&#x2022;</label><p><italic>Brightness</italic> and <italic>Contrast</italic>: These transformations simulate changes to the settings in imaging equipment.</p></list-item>
<list-item><label>&#x2022;</label><p><italic>Blur:</italic> This transformation simulates the loss in sharpness of the image due to motion of the patient during capture.</p></list-item>
</list>Among all evaluated transformations, we found that the first three were the most effective in creating more robust models. It bears mentioning that this result may be due to the fact that we evaluate robustness on the CheXphoto dataset. Hence, the transformations that introduce the most photographic noise may provide the best robustness. Notwithstanding this, the framework is extensible and additional transformations can be added to diversify the suite we have implemented.</p>
<p>We consider two strategies for applying noise to the training dataset. In both, a specified percentage of the images in the dataset are injected with noise; the noisy images are then either added to the dataset (thereby expanding it) or used to replace their original versions (thereby preserving its size). We refer to these strategies as <italic>augmentation</italic> and <italic>replacement</italic>, respectively. All training hyperparameters (learning rate, batch size, optimizer, epochs, etc.) were kept unchanged.</p>
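<p>The difference between the two strategies can be illustrated with a minimal sketch. The function name <monospace>add_noise_to_dataset</monospace>, its parameters, and the generic <monospace>transform</monospace> callable are hypothetical and are not taken from the RoMIA implementation; each element of <monospace>images</monospace> is assumed to be an image (or an image-label pair) that <monospace>transform</monospace> knows how to perturb.</p>
<preformat>
```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def add_noise_to_dataset(
    images: Sequence[T],
    transform: Callable[[T], T],
    fraction: float,
    strategy: str = "replacement",
    seed: int = 0,
) -> List[T]:
    """Apply `transform` to a random `fraction` of `images`.

    strategy="augmentation": noisy copies are appended, so the dataset grows.
    strategy="replacement": noisy copies overwrite the originals, so the
    dataset size is preserved.
    """
    if strategy not in ("augmentation", "replacement"):
        raise ValueError("strategy must be 'augmentation' or 'replacement'")
    if fraction > 1.0 or 0.0 > fraction:
        raise ValueError("fraction must lie in [0, 1]")

    rng = random.Random(seed)
    n_noisy = int(round(fraction * len(images)))
    # Indices of the images that will receive synthetic noise.
    chosen = sorted(rng.sample(range(len(images)), n_noisy))

    if strategy == "augmentation":
        return list(images) + [transform(images[i]) for i in chosen]

    result = list(images)
    for i in chosen:
        result[i] = transform(result[i])
    return result
```
</preformat>
<p>With <monospace>strategy="augmentation"</monospace> the returned list is longer than the input by the number of perturbed images, whereas with <monospace>strategy="replacement"</monospace> its length is unchanged.</p>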
</sec>
<sec id="s2b2"><label>2.2.2</label><title>Fine-tuning with input mixing</title>
<p>In <italic>Fine-tuning with Input Mixing</italic>, we fine-tune the model with a very small amount of data from a different source to improve the model&#x0027;s robustness. Since acquiring large amounts of additional training data may be challenging in practice, we limited ourselves to just 500 images, which corresponds to around 0.22&#x0025; of the original training set. For our experiments, we draw these images at random from the ChestX-ray8 dataset from NIH (<xref ref-type="bibr" rid="B16">16</xref>). While input mixing has been proposed in the literature as a data augmentation strategy, our contribution is the specific use of input mixing during the fine-tuning step and its evaluation in the context of medical imaging models. One challenge with using a very limited amount of data is that it could easily lead to overfitting. To prevent this, we use input mixing, a well-known technique where two images are combined into a composite input that contains information from both. Minimizing loss on mixed inputs has been shown to approximately correspond to maximizing robust accuracy (<xref ref-type="bibr" rid="B36">36</xref>). We mixed the additional data with images from the original training set for the fine-tuning phase. We considered three different mixing strategies that have been proposed in the literature. With CutMix (<xref ref-type="bibr" rid="B14">14</xref>), a randomly selected patch of one input image is placed into another. With MixUp, the pixels of two images are averaged in a weighted manner to construct a composite image. In both cases, the labels from the two images being mixed are also combined to derive the target label for the composite input (<xref ref-type="bibr" rid="B36">36</xref>, <xref ref-type="bibr" rid="B37">37</xref>). In AugMix, images are mixed with augmented versions of themselves, so the label does not change (<xref ref-type="bibr" rid="B15">15</xref>).
We mix the 500 images from ChestX-ray8 with 1,000 randomly selected images from the CheXpert training set and fine-tune the model for 3 epochs with these mixed inputs. All other hyperparameters such as the learning rate and optimizer were the same as those used in the training stage.</p>
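<p>To make the MixUp and CutMix operations concrete, the following NumPy sketch mixes two images and their multi-hot label vectors. It is an illustration under stated assumptions rather than the code used in our experiments: the function names are hypothetical, the CutMix patch is kept fully inside the image (a simplification of the published method), and clipping a Beta sample to a fixed interval is only one possible way to restrict the mixing ratio to a range.</p>
<preformat>
```python
import numpy as np


def sample_lambda(rng, alpha=1.0, low=0.4, high=0.6):
    """Draw a mixing ratio from Beta(alpha, alpha) and clip it to [low, high].

    Clipping is an assumption made for this sketch.
    """
    return float(np.clip(rng.beta(alpha, alpha), low, high))


def mixup(x_a, y_a, x_b, y_b, lam):
    """MixUp: weighted pixel-wise average of two images and of their labels."""
    mixed_x = lam * x_a + (1.0 - lam) * x_b
    mixed_y = lam * y_a + (1.0 - lam) * y_b
    return mixed_x, mixed_y


def cutmix(x_a, y_a, x_b, y_b, lam, rng):
    """CutMix: paste a random rectangular patch of x_b into x_a.

    The patch covers roughly a (1 - lam) share of the image area, and the
    labels are mixed according to the share of x_a that is actually kept.
    """
    h, w = x_a.shape[:2]
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    # Top-left corner chosen so that the patch lies entirely inside the image.
    top = int(rng.integers(0, h - cut_h + 1))
    left = int(rng.integers(0, w - cut_w + 1))

    mixed_x = x_a.copy()
    mixed_x[top:top + cut_h, left:left + cut_w] = (
        x_b[top:top + cut_h, left:left + cut_w]
    )
    kept = 1.0 - (cut_h * cut_w) / (h * w)
    mixed_y = kept * y_a + (1.0 - kept) * y_b
    return mixed_x, mixed_y
```
</preformat>
<p>In both functions the target label of the composite input is a convex combination of the two source labels, so a finding present in only one of the two radiographs receives a fractional target.</p>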
</sec>
<sec id="s2b3"><label>2.2.3</label><title>DCT-based denoising</title>
<p>DCT-based denoising is based on the insight that most sources of noise disproportionately affect the high-frequency components of an image (<xref ref-type="bibr" rid="B38">38</xref>). This is shown in <xref ref-type="fig" rid="F2">Figure&#x00A0;2</xref>, which plots the percent difference in the top and bottom 1&#x0025; of frequencies of the original and noisy images from the CheXpert (<xref ref-type="bibr" rid="B4">4</xref>) and CheXphoto (<xref ref-type="bibr" rid="B9">9</xref>) datasets, where the noisy images were produced using synthetic digital perturbations, synthetic photographic perturbations, and photos taken of the images with a smartphone camera. During inference, we add a preprocessing stage to the model which uses the discrete cosine transform (DCT) to transform the image into the frequency domain, then removes a set percentage of high-frequency components, and finally computes the inverse DCT (<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B39">39</xref>). The percentage of high-frequency components to be removed from an image (denoted by <italic>&#x03B7;</italic>) is determined through an experiment where a small fraction of the training set (CheXpert, in our experiments) is subject to DCT-based denoising for different values of <italic>&#x03B7;</italic>. For each model, we choose the largest value of <italic>&#x03B7;</italic> (which corresponds to the most aggressive denoising) that keeps the AUROC within 0.005 of its original value (at <italic>&#x03B7;</italic>&#x2009;&#x003D;&#x2009;0). Choosing <italic>&#x03B7;</italic> in this way ensures that the frequencies removed do not significantly interfere with the features used by the model for classification.</p>
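<p>The preprocessing stage can be sketched as follows for a single-channel image. This is a minimal NumPy illustration, not the implementation used in our experiments: the function names are hypothetical, the orthonormal DCT-II is built explicitly to keep the example self-contained (in practice a library routine such as <monospace>scipy.fft.dctn</monospace> or <monospace>cv2.dct</monospace> would normally be used), and ranking coefficients by the sum of their normalized row and column indices is an assumed definition of "high frequency".</p>
<preformat>
```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II matrix C, so that C @ C.T = I."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)
    return mat


def dct_denoise(image: np.ndarray, eta: float) -> np.ndarray:
    """Zero the fraction `eta` of highest-frequency DCT coefficients.

    Assumption: the frequency of coefficient (u, v) is ranked by u/H + v/W.
    """
    if eta > 1.0 or 0.0 > eta:
        raise ValueError("eta must lie in [0, 1]")
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    c_h, c_w = dct_matrix(h), dct_matrix(w)

    # Forward 2D DCT: transform the rows and the columns.
    coeffs = c_h @ img @ c_w.T

    # Rank every coefficient from lowest to highest frequency.
    u = np.arange(h).reshape(-1, 1) / h
    v = np.arange(w).reshape(1, -1) / w
    order = np.argsort((u + v).ravel(), kind="stable")

    n_remove = int(round(eta * order.size))
    flat = coeffs.reshape(-1).copy()
    if n_remove > 0:
        flat[order[-n_remove:]] = 0.0

    # Inverse 2D DCT (the transpose inverts an orthonormal transform).
    return c_h.T @ flat.reshape(h, w) @ c_w
```
</preformat>
<p>With <italic>&#x03B7;</italic>&#x2009;&#x003D;&#x2009;0 the function returns the input unchanged (up to floating-point error), and because the transform is orthonormal, removing coefficients can never increase the energy of the image.</p>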
<fig id="F2" position="float"><label>Figure 2</label>
<caption><p>Difference between clean (CheXpert) and noisy (CheXphoto) images in high and low frequencies.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="fradi-03-1274273-g002.tif"/>
</fig>
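<p>The rule for choosing <italic>&#x03B7;</italic> can likewise be written down in a few lines. In this sketch, <monospace>select_eta</monospace> and the callable <monospace>auroc_at</monospace> are hypothetical names; <monospace>auroc_at</monospace> is assumed to return the AUROC of a given model on the held-aside fraction of the training set after DCT-based denoising with the supplied value of <italic>&#x03B7;</italic>.</p>
<preformat>
```python
from typing import Callable, Iterable


def select_eta(
    candidates: Iterable[float],
    auroc_at: Callable[[float], float],
    tolerance: float = 0.005,
) -> float:
    """Return the largest candidate eta whose AUROC stays within
    `tolerance` of the AUROC obtained without denoising (eta = 0).

    Falls back to 0.0 (no denoising) when no candidate qualifies.
    """
    baseline = auroc_at(0.0)
    best = 0.0
    for eta in candidates:
        within_tolerance = tolerance >= abs(baseline - auroc_at(eta))
        if within_tolerance and eta > best:
            best = eta
    return best
```
</preformat>
<p>Because the rule is applied per model, different architectures may end up with different values of <italic>&#x03B7;</italic>.</p>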
<p>To summarize, the proposed flow to create robust medical imaging models consists of transferring a model trained on ImageNet to the target medical imaging dataset using noise-added training, then fine-tuning the resulting model with input mixing, and finally adding a DCT-based denoiser to the model before deployment.</p>
</sec>
</sec>
<sec id="s2c"><label>2.3</label><title>Experimental setup</title>
<p>We implemented the RoMIA framework using the PyTorch (<xref ref-type="bibr" rid="B40">40</xref>), TensorFlow (<xref ref-type="bibr" rid="B41">41</xref>), libAUC (<xref ref-type="bibr" rid="B42">42</xref>), and OpenCV (<xref ref-type="bibr" rid="B43">43</xref>) libraries. We applied the framework to create models for classification of chest radiographs. The base models were selected from popular image classification DNNs trained on the ImageNet (<xref ref-type="bibr" rid="B35">35</xref>) dataset (see <xref ref-type="fig" rid="F3">Figure&#x00A0;3A</xref>). Note that all the networks are Convolutional Neural Networks (CNNs), since these are the most popular type of DNN used for image classification tasks. We specifically created a model to detect Atelectasis, Cardiomegaly, Consolidation, Edema, and Pleural Effusion. Accordingly, the final fully connected layer of each base model was removed and replaced with a layer with five outputs. These models were then transferred using the CheXpert (<xref ref-type="bibr" rid="B4">4</xref>) dataset, which contains 224,316 chest radiographs of 65,240 patients from Stanford Hospital. For the fine-tuning step, we randomly selected 500 images from NIH&#x0027;s ChestX-ray8 (<xref ref-type="bibr" rid="B16">16</xref>) dataset. The learning rate used for both the transfer and fine-tuning steps was 0.0001, number of epochs was 3 with a batch size of 32, and weight decay was 10<sup>&#x2212;5</sup>. The Adam optimizer and cross-entropy loss were used. For the MixUp (<xref ref-type="bibr" rid="B36">36</xref>) strategy, we use a beta distribution to select values between 0.4 and 0.6 to determine <italic>&#x03BB;</italic>, the image mixing ratio. We evaluated the models on the CheXphoto (<xref ref-type="bibr" rid="B9">9</xref>) dataset, which consists of 10,507 natural photos and synthetic transformations of chest radiographs from 3,000 patients. 
Since our noise-added training step uses transformations similar to those in CheXphoto, we only perform our evaluations on the natural photographs. We repeated each of our experiments five times with different random seeds.</p>
<fig id="F3" position="float"><label>Figure 3</label>
<caption><p>(<bold>A</bold>) Characteristics of the baseline models used in the experiments and accuracy values, (<bold>B</bold>) AUROC of baseline models on CheXpert and CheXphoto, (<bold>C</bold>) AUROC improvement from RoMIA and each of its constituent techniques, and (<bold>D</bold>) example inputs misclassified by the baseline model but correctly classified by the RoMIA model.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="fradi-03-1274273-g003.tif"/>
</fig>
</sec>
</sec>
<sec id="s3" sec-type="results"><label>3</label><title>Results</title>
<p>In this section, we present results from the evaluation of models created using the RoMIA framework. We first present the difference in AUROC of the baseline models when evaluated on a subset of CheXpert and CheXphoto images. Next, we present the performance of models trained using RoMIA and compare them to the baseline models. Subsequently, we perform an ablation study to investigate the contribution of each of the three components (Noise-added training, Fine-tuning with input mixing, and DCT-based denoising) to the overall improvement in robustness. We then explore different dataset transformation techniques for the Noise-added Training step and evaluate their impact on the model performance. We also compare the performance of different strategies for input mixing in the fine-tuning step. Finally, we explore the determination of the parameter <italic>&#x03B7;</italic>, which controls the percentage of high-frequency components removed from the input during DCT-based Denoising.</p>
<sec id="s3a"><label>3.1</label><title>Robustness of baseline models</title>
<p>A key motivation for this work is that baseline models trained on a certain dataset perform significantly worse on similar datasets with added noise. To demonstrate this in the context of CheXpert and CheXphoto, we study the differences in AUROC of a baseline model trained on CheXpert and then applied to both CheXpert and CheXphoto data. <xref ref-type="fig" rid="F3">Figure&#x00A0;3B</xref> presents the AUROC scores for the baseline models on the CheXpert and CheXphoto data. The figure shows a degradation of 10&#x0025;&#x2013;14&#x0025; in AUROC across all six models, underscoring the need to create more robust models in the context of medical imaging.</p>
</sec>
<sec id="s3b"><label>3.2</label><title>Overall improvements from RoMIA and ablation study</title>
<p>The RoMIA framework consists of three techniques to improve robustness, so we conduct an ablation study to evaluate each component. <xref ref-type="fig" rid="F3">Figure&#x00A0;3C</xref> shows the baseline accuracy, the results of the ablation study (applying each of the three techniques in RoMIA individually), and the resulting AUROC score when all three techniques are combined in RoMIA. To capture the benefits of the proposed framework, we first look solely at the CheXphoto AUROC values for the baseline and RoMIA models. We observe an improvement of around 3&#x0025;&#x2013;5&#x0025; in AUROC, which corresponds to an average reduction in misclassifications of 22.6&#x0025;, suggesting that the proposed framework is capable of creating substantially more robust models. We also observe a larger improvement in robustness on deeper models, such as ResNet50 and DenseNet201. We hypothesize that this is because deeper models can better learn the more diverse training data with which they are presented in the RoMIA framework. To evaluate the statistical validity of the results, we repeated the training runs for the baseline and RoMIA models with 10 additional random seeds. We performed a one-tailed paired <italic>t</italic>-test and concluded that the improvements were statistically significant with <italic>p</italic>&#x2009;&#x003C;&#x2009;0.01. <xref ref-type="fig" rid="F3">Figure&#x00A0;3D</xref> presents examples of inputs that are misclassified by the baseline model but correctly classified by RoMIA.</p>
<p><xref ref-type="fig" rid="F3">Figure&#x00A0;3C</xref> also presents the results of our ablation study to evaluate each of the three components in the proposed framework. We do this by evaluating the CheXphoto AUROC when each technique is applied individually. We observe that overall, each technique has a positive impact on robustness. The combination of three techniques used in RoMIA boosts AUROC by up to 5&#x0025;. We evaluate each technique in more detail in subsequent sub-sections.</p>
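<p>As a hedged illustration of the one-tailed paired <italic>t</italic>-test mentioned above, the sketch below computes the paired <italic>t</italic> statistic for ten baseline/RoMIA score pairs. The AUROC values are made-up placeholders, not results from this study, and the function name <monospace>paired_t_statistic</monospace> is hypothetical. With 10 pairs (9 degrees of freedom), the one-tailed critical value at the 0.01 level is approximately 2.821.</p>
<preformat><![CDATA[
```python
import math
import statistics


def paired_t_statistic(baseline_scores, improved_scores):
    """Return the paired t statistic for H1: improved > baseline."""
    differences = [b - a for a, b in zip(baseline_scores, improved_scores)]
    mean_diff = statistics.mean(differences)
    # statistics.stdev uses the sample (n - 1) standard deviation.
    std_err = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff / std_err


# Placeholder AUROC values for 10 seeds (illustrative only, not measured data).
baseline = [0.80, 0.81, 0.79, 0.80, 0.82, 0.78, 0.81, 0.80, 0.79, 0.80]
gains = [0.03, 0.04, 0.05, 0.04, 0.03, 0.05, 0.04, 0.04, 0.03, 0.05]
improved = [b + g for b, g in zip(baseline, gains)]

t_stat = paired_t_statistic(baseline, improved)
T_CRITICAL_ONE_TAILED_01_DF9 = 2.821
significant = t_stat > T_CRITICAL_ONE_TAILED_01_DF9
```
]]></preformat>
<p>Pairing by seed removes seed-to-seed variation shared by both models, so the test depends only on the per-seed differences.</p>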
</sec>
<sec id="s3c"><label>3.3</label><title>Contributions from noise-added training</title>
<p><xref ref-type="fig" rid="F4">Figure&#x00A0;4A</xref> explores the impact of various dataset transformation techniques used in noise-added training. Specifically, we transformed 10&#x0025;, 25&#x0025;, and 50&#x0025; of the training samples in the CheXpert dataset and either added them to the dataset (augmentation) or replaced the original samples with them (replacement). We observe that the 25&#x0025; replacement strategy worked best across all networks. We note that this strategy does not impact training time, since the dataset size is unchanged and the only overhead is a one-time transformation (noise addition) of the inputs, whose cost is negligible.</p>
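<p>The difference between the replacement and augmentation strategies can be sketched as follows. This is an illustrative, dependency-free example rather than the code used in this work. The additive Gaussian noise transformation, its <monospace>sigma</monospace> value, and the function names are stand-in assumptions, since the actual transformations are those described for the noise-added training step.</p>
<preformat><![CDATA[
```python
import random


def add_gaussian_noise(image, rng, sigma=0.05):
    """Hypothetical stand-in transformation: additive Gaussian noise, clipped to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row] for row in image]


def build_noisy_training_set(samples, fraction, mode, rng, transform=add_gaussian_noise):
    """Transform a random `fraction` of (image, label) samples.

    mode="replacement": transformed copies replace their originals (size unchanged).
    mode="augmentation": transformed copies are appended (size grows).
    """
    if mode not in ("replacement", "augmentation"):
        raise ValueError("mode must be 'replacement' or 'augmentation'")
    n_transformed = round(fraction * len(samples))
    chosen = set(rng.sample(range(len(samples)), n_transformed))
    result, extra = [], []
    for index, (image, label) in enumerate(samples):
        if index in chosen:
            noisy = (transform(image, rng), label)
            if mode == "replacement":
                result.append(noisy)
                continue
            extra.append(noisy)
        result.append((image, label))
    return result + extra


# Toy dataset of eight 2x2 gray images with dummy labels.
toy = [([[0.5, 0.5], [0.5, 0.5]], i % 2) for i in range(8)]
replaced = build_noisy_training_set(toy, 0.25, "replacement", random.Random(0))
augmented = build_noisy_training_set(toy, 0.25, "augmentation", random.Random(0))
```
]]></preformat>
<p>With a 25&#x0025; fraction, the replacement set keeps all eight samples (two of them transformed), whereas the augmented set grows to ten, which is why only augmentation increases the number of training iterations per epoch.</p>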
<fig id="F4" position="float"><label>Figure 4</label>
<caption><p>(<bold>A</bold>) Comparing dataset replacement vs. augmentation during noise-added training (ResNet18), (<bold>B</bold>) comparing different input mixing strategies in RoMIA for various networks, (<bold>C</bold>) determination of frequency cutoff threshold (<italic>&#x03B7;</italic>), and (<bold>D</bold>) impact of <italic>&#x03B7;</italic> on CheXphoto AUROC.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="fradi-03-1274273-g004.tif"/>
</fig>
</sec>
<sec id="s3d"><label>3.4</label><title>Effect of fine-tuning with input mixing</title>
<p>Several approaches to input mixing have been proposed in the literature, primarily as methods for data augmentation that lead to better generalization of machine learning models. To evaluate the impact of the mixing strategy in the fine-tuning step of RoMIA, we consider CutMix (<xref ref-type="bibr" rid="B14">14</xref>) and MixUp (<xref ref-type="bibr" rid="B36">36</xref>), the two most widely used strategies, in addition to AugMix (<xref ref-type="bibr" rid="B15">15</xref>) and Cutout (<xref ref-type="bibr" rid="B44">44</xref>). To determine which strategy yields higher improvement in robustness, we compare in <xref ref-type="fig" rid="F4">Figure&#x00A0;4B</xref> the AUROC boosts on the CheXphoto dataset when each strategy is applied. We observe that, while all mixing strategies yield improvements over the baseline, MixUp provides the best results overall, followed by Cutout and AugMix. This motivated our decision to use MixUp in the final RoMIA framework.</p>
</sec>
<sec id="s3e"><label>3.5</label><title>Selection of <italic>&#x03B7;</italic> in DCT-based denoising</title>
<p>A key feature of our framework is the DCT-based Denoising step, which removes high-frequency noise from the inputs. We use the parameter <italic>&#x03B7;</italic> to denote the percentage of high-frequency components removed from each image. In <xref ref-type="fig" rid="F4">Figure&#x00A0;4C, D</xref>, we consider the impact of the choice of the parameter <italic>&#x03B7;</italic> by showing how different <italic>&#x03B7;</italic> values affect CheXpert and CheXphoto AUROC. Due to the nature of x-ray radiographs, we find that removing a large fraction of the high frequencies does not have a detrimental impact on performance for either dataset and in fact improves accuracy on the noisy (CheXphoto) data. We determine <italic>&#x03B7;</italic> as the largest value that results in a less than 0.5&#x0025; decrease in accuracy on the clean (CheXpert) dataset (<xref ref-type="fig" rid="F4">Figure&#x00A0;4C</xref>). We observe that this value of <italic>&#x03B7;</italic> improves performance on the CheXphoto dataset (<xref ref-type="fig" rid="F4">Figure&#x00A0;4D</xref>). This result underscores the efficacy of DCT-based denoising.</p>
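<p>A minimal sketch of DCT-based denoising and of the rule for choosing <italic>&#x03B7;</italic> is given below. It is a dependency-free illustration under stated assumptions, not the implementation used in this work: the ranking of coefficients by the index sum <italic>u</italic>&#x2009;+&#x2009;<italic>v</italic>, the interpretation of the 0.5&#x0025; threshold as an absolute drop in the score, the function names, and the example scores passed to <monospace>select_eta</monospace> are all hypothetical. In practice a library DCT routine would be used instead of the explicit matrix products shown here.</p>
<preformat><![CDATA[
```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_denoise(image, eta):
    """Zero the eta percent of 2-D DCT coefficients with the largest index sum u + v."""
    height, width = len(image), len(image[0])
    c_h, c_w = dct_matrix(height), dct_matrix(width)
    coeffs = matmul(matmul(c_h, image), transpose(c_w))
    # Assumed ranking: positions ordered from highest to lowest frequency by u + v.
    order = sorted(((u, v) for u in range(height) for v in range(width)),
                   key=lambda uv: uv[0] + uv[1], reverse=True)
    for u, v in order[:int(len(order) * eta / 100.0)]:
        coeffs[u][v] = 0.0
    # The basis is orthonormal, so the inverse transform uses the transposes.
    return matmul(matmul(transpose(c_h), coeffs), c_w)


def select_eta(clean_score_by_eta, baseline_score, max_drop=0.5):
    """Largest eta whose clean-data score drops by less than max_drop (score units)."""
    ok = [eta for eta, score in clean_score_by_eta.items() if baseline_score - score < max_drop]
    return max(ok) if ok else 0


# Toy 4x4 image: constant gray level plus a checkerboard perturbation.
noisy = [[0.5 + 0.1 * (-1) ** (i + j) for j in range(4)] for i in range(4)]
denoised = dct_denoise(noisy, eta=50)

# Made-up clean-data scores (percent) for illustration only.
chosen_eta = select_eta({10: 88.9, 30: 88.8, 50: 88.6, 70: 87.9}, baseline_score=89.0)
```
]]></preformat>
<p>Because the transform is orthonormal, zeroing coefficients can only reduce the image energy, and the mean intensity is preserved as long as the lowest-frequency (DC) coefficient is retained.</p>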
</sec>
</sec>
<sec id="s4" sec-type="discussion"><label>4</label><title>Discussion</title>
<p>The success of AI in recent years has led to significant interest in applying it to the medical field. In particular, since DNNs have been very successful in image processing applications, they are frequently being applied to medical imaging tasks. One of the challenges that must be addressed when applying AI to any critical application, and certainly to medical imaging, is its robustness under conditions encountered in the real world. Previous research has shown that DNN models can be very brittle in the presence of input noise and variations. Our work is a first step towards improving the robustness of medical imaging models, with a particular focus on the kinds of noise encountered in medical settings. Although our experimental setup focuses on models for classifying chest radiographs, the techniques we propose are worth exploring in other medical imaging applications.</p>
<p>While the RoMIA framework achieves considerable improvements in robust accuracy, a gap in accuracy between clean and noisy inputs remains, especially for high levels of noise, and could be addressed by future work. One possible direction is to address robustness when training from scratch, in contrast to RoMIA, which only addresses it in the transfer learning step. Also, our work evaluates robustness as accuracy in classifying photographs of chest radiographs (i.e., the CheXphoto dataset). Future work could evaluate robustness under a broader set of conditions. Another interesting direction would be evaluating these techniques in a broader range of medical imaging applications. Given the criticality of medical imaging applications, robustness evaluation should be made a standard part of the regulatory evaluation process for these models. Finally, human checking of the output of AI models is one way of improving confidence in their decisions. This could be enabled by creating explainable models that produce a human-interpretable justification for their decisions. Addressing these issues will go a long way towards enabling the adoption of AI-based medical imaging in clinical practice.</p>
</sec>
</body>
<back>
<sec id="s5" sec-type="data-availability"><title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s6" sec-type="author-contributions"><title>Author contributions</title>
<p>AA: Conceptualization, Data curation, Formal Analysis, Investigation, Methodology, Software, Validation, Visualization, Writing &#x2013; original draft, Writing &#x2013; review &#x0026; editing. SK: Formal Analysis, Software, Writing &#x2013; review &#x0026; editing. KR: Conceptualization, Project administration, Resources, Supervision, Writing &#x2013; review &#x0026; editing.</p>
</sec>
<sec id="s7" sec-type="funding-information"><title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article.</p>
<p>This work was supported in part by the Center for the Co-Design of Cognitive Systems (CoCoSys), a JUMP2.0 center sponsored by the Semiconductor Research Corporation (SRC) and DARPA.</p>
</sec>
<sec id="s8" sec-type="COI-statement"><title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s9" sec-type="disclaimer"><title>Publisher&#x0027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list><title>References</title>
<ref id="B1"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gulshan</surname><given-names>V</given-names></name><name><surname>Peng</surname><given-names>L</given-names></name><name><surname>Coram</surname><given-names>M</given-names></name><name><surname>Stumpe</surname><given-names>M</given-names></name><name><surname>Wu</surname><given-names>D</given-names></name><name><surname>Narayanaswamy</surname><given-names>A</given-names></name><etal/></person-group> <article-title>Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs</article-title>. <source>JAMA</source>. (<year>2016</year>) <volume>316</volume>(<issue>22</issue>):<fpage>2402</fpage>&#x2013;<lpage>10</lpage>. <pub-id pub-id-type="doi">10.1001/jama.2016.17216</pub-id><pub-id pub-id-type="pmid">27898976</pub-id></citation></ref>
<ref id="B2"><label>2.</label><citation citation-type="other"><comment>Arda: Using artificial intelligence in ophthalmology. Google Health</comment>. <comment>Available at:</comment> <ext-link ext-link-type="uri" xlink:href="https://health.google/caregivers/arda/">https://health.google/caregivers/arda/</ext-link> <comment>(Cited January 26, 2023)</comment>.</citation></ref>
<ref id="B3"><label>3.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Greenfield</surname><given-names>D</given-names></name></person-group>. <comment>Artificial Intelligence in Medicine: Applications, implications, and limitations. Science in the news. Harvard University</comment> (<year>2019</year>). <comment>Available at:</comment> <ext-link ext-link-type="uri" xlink:href="https://sitn.hms.harvard.edu/flash/2019/artificial-intelligence-in-medicine-applications-implications-and-limitations/">https://sitn.hms.harvard.edu/flash/2019/artificial-intelligence-in-medicine-applications-implications-and-limitations/</ext-link> <comment>(Cited January 26, 2023)</comment>.</citation></ref>
<ref id="B4"><label>4.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Irvin</surname><given-names>J</given-names></name><name><surname>Rajpurkar</surname><given-names>P</given-names></name><name><surname>Ko</surname><given-names>M</given-names></name><name><surname>Yu</surname><given-names>Y</given-names></name><name><surname>Ciurea-Ilcus</surname><given-names>S</given-names></name><name><surname>Chute</surname><given-names>C</given-names></name><etal/></person-group> <comment>CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison</comment>. arXiv [Preprint]. <italic>arXiv:1901.07031</italic> (2019).</citation></ref>
<ref id="B5"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>S</given-names></name><name><surname>Cao</surname><given-names>G</given-names></name><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Liao</surname><given-names>S</given-names></name><name><surname>Wang</surname><given-names>Q</given-names></name><name><surname>Shi</surname><given-names>J</given-names></name><etal/></person-group> <article-title>Review and prospect: artificial intelligence in advanced medical imaging</article-title>. <source>Front Radiol</source>. (<year>2021</year>) <volume>1</volume>:<fpage>781868</fpage>. <pub-id pub-id-type="doi">10.3389/fradi.2021.781868</pub-id><pub-id pub-id-type="pmid">37492170</pub-id></citation></ref>
<ref id="B6"><label>6.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>A</given-names></name><name><surname>Li</surname><given-names>C</given-names></name><name><surname>Chen</surname><given-names>H</given-names></name><name><surname>Yang</surname><given-names>H</given-names></name><name><surname>Zhao</surname><given-names>P</given-names></name><name><surname>Hu</surname><given-names>W</given-names></name><etal/></person-group> <comment>A Comparison for Anti-noise Robustness of Deep Learning Classification Methods on a Tiny Object Image Dataset: from Convolutional Neural Network to Visual Transformer and Performer</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:2106.01927</italic> (2021).</citation></ref>
<ref id="B7"><label>7.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Liu</surname><given-names>M</given-names></name><name><surname>Liu</surname><given-names>S</given-names></name><name><surname>Su</surname><given-names>H</given-names></name><name><surname>Cao</surname><given-names>K</given-names></name><name><surname>Zhu</surname><given-names>J</given-names></name></person-group>. <comment>Analyzing the Noise Robustness of Deep Neural Networks</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:1810.03913</italic> (2018).</citation></ref>
<ref id="B8"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kulkarni</surname><given-names>V</given-names></name><name><surname>Gawali</surname><given-names>M</given-names></name><name><surname>Kharat</surname><given-names>A</given-names></name></person-group>. <article-title>Key technology considerations in developing and deploying machine learning models in clinical radiology practice</article-title>. <source>JMIR Med Inform</source>. (<year>2021 Sep 9</year>) <volume>9</volume>(<issue>9</issue>):<fpage>e28776</fpage>. <pub-id pub-id-type="doi">10.2196/28776</pub-id></citation></ref>
<ref id="B9"><label>9.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Phillips</surname><given-names>N</given-names></name><name><surname>Rajpurkar</surname><given-names>P</given-names></name><name><surname>Sabini</surname><given-names>M</given-names></name><name><surname>Krishnan</surname><given-names>R</given-names></name><name><surname>Zhou</surname><given-names>S</given-names></name><name><surname>Pareek</surname><given-names>A</given-names></name><etal/></person-group> <comment>CheXphoto: 10,000&#x002B; Photos and Transformations of Chest x-rays for Benchmarking Deep Learning Robustness</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:2007.06199</italic> (2020).</citation></ref>
<ref id="B10"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zech</surname><given-names>JR</given-names></name><name><surname>Badgeley</surname><given-names>MA</given-names></name><name><surname>Liu</surname><given-names>M</given-names></name><name><surname>Costa</surname><given-names>AB</given-names></name><name><surname>Titano</surname><given-names>JJ</given-names></name><name><surname>Oermann</surname><given-names>EK</given-names></name></person-group>. <article-title>Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study</article-title>. <source>PLoS Med</source>. (<year>2018 Nov 6</year>) <volume>15</volume>(<issue>11</issue>):<fpage>e1002683</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pmed.1002683</pub-id>. <comment>PMID: 30399157; PMCID: PMC6219764</comment>.</citation></ref>
<ref id="B11"><label>11.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Tompe</surname><given-names>A</given-names></name><name><surname>Sargar</surname><given-names>K</given-names></name></person-group>. <source>X-Ray Image Quality Assurance</source>. <publisher-loc>Treasure Island, FL</publisher-loc>: <publisher-name>StatPearls Publishing</publisher-name> (<year>2022</year>).</citation></ref>
<ref id="B12"><label>12.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Seyyed-Kalantari</surname><given-names>L</given-names></name><name><surname>Liu</surname><given-names>G</given-names></name><name><surname>McDermott</surname><given-names>M</given-names></name><name><surname>Chen</surname><given-names>I</given-names></name><name><surname>Ghassemi</surname><given-names>M</given-names></name></person-group>. <comment>CheXclusion: Fairness gaps in deep chest x-ray classifiers</comment>. <source>arXiv</source> [Preprint] <italic>arXiv:2003.00827</italic> (<year>2020</year>).</citation></ref>
<ref id="B13"><label>13.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Zheng</surname><given-names>S</given-names></name><name><surname>Song</surname><given-names>Y</given-names></name><name><surname>Leung</surname><given-names>T</given-names></name><name><surname>Goodfellow</surname><given-names>I</given-names></name></person-group>. <conf-name>Improving the robustness of deep neural networks via stability training</conf-name>. <conf-name>International Joint Conference on Artificial Intelligence</conf-name> (<year>2021</year>). p. <fpage>2909</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.24963/ijcai.2019/403</pub-id></citation></ref>
<ref id="B14"><label>14.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Yun</surname><given-names>S</given-names></name><name><surname>Han</surname><given-names>D</given-names></name><name><surname>Oh</surname><given-names>S</given-names></name><name><surname>Chun</surname><given-names>S</given-names></name><name><surname>Choe</surname><given-names>J</given-names></name><name><surname>Yoo</surname><given-names>Y</given-names></name></person-group>. <conf-name>Cutmix: regularization strategy to train strong classifiers with localizable features</conf-name>. <conf-name>2019 IEEE/CVF International Conference on Computer Vision</conf-name>. <comment>Available at:</comment> <ext-link ext-link-type="uri" xlink:href="https://doi.ieeecomputersociety.org/10.1109/ICCV.2019.00612">https://doi.ieeecomputersociety.org/10.1109/ICCV.2019.00612</ext-link></citation></ref>
<ref id="B15"><label>15.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Hendrycks</surname><given-names>D</given-names></name><name><surname>Mu</surname><given-names>N</given-names></name><name><surname>Cubuk</surname><given-names>ED</given-names></name><name><surname>Zoph</surname><given-names>B</given-names></name><name><surname>Gilmer</surname><given-names>J</given-names></name><name><surname>Lakshminarayanan</surname><given-names>B</given-names></name></person-group>. <article-title>Augmix: a simple data processing method to improve robustness and uncertainty</article-title>. <comment>arXiv preprint arXiv:1912.02781</comment> (<year>2019</year>).</citation></ref>
<ref id="B16"><label>16.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>X</given-names></name><name><surname>Peng</surname><given-names>Y</given-names></name><name><surname>Lu</surname><given-names>L</given-names></name><name><surname>Lu</surname><given-names>Z</given-names></name><name><surname>Bagheri</surname><given-names>M</given-names></name><name><surname>Summers</surname><given-names>RM</given-names></name></person-group>. <conf-name>ChestX-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases</conf-name>. <conf-name>2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>; <publisher-name>IEEE</publisher-name>. (<year>2017</year>).</citation></ref>
<ref id="B17"><label>17.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Vakharia</surname><given-names>PS</given-names></name></person-group>. <comment>Artificial Intelligence for the Screening of Diabetic Retinopathy. Retinalphysician.com. Retinal Physician</comment> (<year>2022</year>). <comment>Available at:</comment> <ext-link ext-link-type="uri" xlink:href="https://www.retinalphysician.com/issues/2022/november-december-2022/artificial-intelligence-for-the-screening-of-diabe">https://www.retinalphysician.com/issues/2022/november-december-2022/artificial-intelligence-for-the-screening-of-diabe</ext-link> <comment>(Cited January 26, 2023)</comment>.</citation></ref>
<ref id="B18"><label>18.</label><citation citation-type="other"><comment>Medcognetics</comment>. <comment>Available at:</comment> <ext-link ext-link-type="uri" xlink:href="https://www.medcognetics.com/">https://www.medcognetics.com/</ext-link> <comment>(Cited January 26, 2023)</comment>.</citation></ref>
<ref id="B19"><label>19.</label><citation citation-type="other"><comment>3Derm</comment>. <comment>Available at:</comment> <ext-link ext-link-type="uri" xlink:href="https://www.3derm.com/">https://www.3derm.com/</ext-link> <comment>(Cited January 26, 2023)</comment>.</citation></ref>
<ref id="B20"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname><given-names>YX</given-names></name><name><surname>Tang</surname><given-names>YB</given-names></name><name><surname>Peng</surname><given-names>Y</given-names></name><name><surname>Yan</surname><given-names>K</given-names></name><name><surname>Bagheri</surname><given-names>M</given-names></name><name><surname>Redd</surname><given-names>B</given-names></name><etal/></person-group> <article-title>Automated abnormality classification of chest radiographs using deep convolutional neural networks</article-title>. <source>NPJ Digit Med</source>. (<year>2020</year>) <volume>3</volume>:<fpage>70</fpage>. <pub-id pub-id-type="doi">10.1038/s41746-020-0273-z</pub-id><pub-id pub-id-type="pmid">32435698</pub-id></citation></ref>
<ref id="B21"><label>21.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Antony</surname><given-names>M</given-names></name><name><surname>Kakileti</surname><given-names>ST</given-names></name><name><surname>Shah</surname><given-names>R</given-names></name><name><surname>Sahoo</surname><given-names>S</given-names></name><name><surname>Bhattacharyya</surname><given-names>C</given-names></name><name><surname>Manjunath</surname><given-names>G</given-names></name></person-group>. <article-title>Challenges of AI driven diagnosis of chest x-rays transmitted through smart phones: a case study in COVID-19</article-title>. <source>Sci Rep</source>. (<year>2023</year>) <volume>13</volume>:<fpage>18102</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-023-44653-y</pub-id><pub-id pub-id-type="pmid">37872204</pub-id></citation></ref>
<ref id="B22"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hsieh</surname><given-names>C</given-names></name><name><surname>Nobre</surname><given-names>IB</given-names></name><name><surname>Sousa</surname><given-names>SC</given-names></name><name><surname>Ouyang</surname><given-names>C</given-names></name><name><surname>Brereton</surname><given-names>M</given-names></name><name><surname>Nascimento</surname><given-names>JC</given-names></name><etal/></person-group> <article-title>MDF-net for abnormality detection by fusing x-rays with clinical data</article-title>. <source>Sci Rep</source>. (<year>2023</year>) <volume>13</volume>:<fpage>15873</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-023-41463-0</pub-id><pub-id pub-id-type="pmid">37741833</pub-id></citation></ref>
<ref id="B23"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Devasia</surname><given-names>J</given-names></name><name><surname>Goswami</surname><given-names>H</given-names></name><name><surname>Lakshminarayanan</surname><given-names>S</given-names></name><name><surname>Rajaram</surname><given-names>M</given-names></name><name><surname>Adithan</surname><given-names>S</given-names></name></person-group>. <article-title>Deep learning classification of active tuberculosis lung zones wise manifestations using chest x-rays: a multi label approach</article-title>. <source>Sci Rep</source>. (<year>2023</year>) <volume>13</volume>:<fpage>887</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-023-28079-0</pub-id><pub-id pub-id-type="pmid">36650270</pub-id></citation></ref>
<ref id="B24"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname><given-names>HY</given-names></name><name><surname>Yu</surname><given-names>Y</given-names></name><name><surname>Wang</surname><given-names>C</given-names></name><name><surname>Zhang</surname><given-names>S</given-names></name><name><surname>Gao</surname><given-names>Y</given-names></name><name><surname>Shao</surname><given-names>J</given-names></name><etal/></person-group> <article-title>A transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics</article-title>. <source>Nat Biomed Eng</source>. (<year>2023</year>) <volume>7</volume>:<fpage>743</fpage>&#x2013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1038/s41551-023-01045-x</pub-id><pub-id pub-id-type="pmid">37308585</pub-id></citation></ref>
<ref id="B25"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname><given-names>A</given-names></name><name><surname>Sheth</surname><given-names>P</given-names></name><name><surname>Xie</surname><given-names>P</given-names></name></person-group>. <article-title>Neural architecture search for pneumonia diagnosis from chest x-rays</article-title>. <source>Sci Rep</source>. (<year>2022</year>) <volume>12</volume>:<fpage>11309</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-022-15341-0</pub-id><pub-id pub-id-type="pmid">35788644</pub-id></citation></ref>
<ref id="B26"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cho</surname><given-names>Y</given-names></name><name><surname>Kim</surname><given-names>JS</given-names></name><name><surname>Lim</surname><given-names>TH</given-names></name><name><surname>Lee</surname><given-names>I</given-names></name><name><surname>Choi</surname><given-names>J</given-names></name></person-group>. <article-title>Detection of the location of pneumothorax in chest x-rays using small artificial neural networks and a simple training process</article-title>. <source>Sci Rep</source>. (<year>2021</year>) <volume>11</volume>:<fpage>13054</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-92523-2</pub-id><pub-id pub-id-type="pmid">34158562</pub-id></citation></ref>
<ref id="B27"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>L</given-names></name><name><surname>Lin</surname><given-names>ZQ</given-names></name><name><surname>Wong</surname><given-names>A</given-names></name></person-group>. <article-title>COVID-net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest x-ray images</article-title>. <source>Sci Rep</source>. (<year>2020</year>) <volume>10</volume>:<fpage>19549</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-020-76550-z</pub-id><pub-id pub-id-type="pmid">33177550</pub-id></citation></ref>
<ref id="B28"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nugroho</surname><given-names>BA</given-names></name></person-group>. <article-title>An aggregate method for thorax diseases classification</article-title>. <source>Sci Rep</source>. (<year>2021</year>) <volume>11</volume>:<fpage>3242</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-81765-9</pub-id><pub-id pub-id-type="pmid">33547338</pub-id></citation></ref>
<ref id="B29"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pham</surname><given-names>HH</given-names></name><name><surname>Nguyen</surname><given-names>NH</given-names></name><name><surname>Tran</surname><given-names>TT</given-names></name><name><surname>Nguyen</surname><given-names>TNM</given-names></name><name><surname>Nguyen</surname><given-names>HQ</given-names></name></person-group>. <article-title>PediCXR: an open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children</article-title>. <source>Sci Data</source>. (<year>2023</year>) <volume>10</volume>:<fpage>240</fpage>. <pub-id pub-id-type="doi">10.1038/s41597-023-02102-5</pub-id><pub-id pub-id-type="pmid">37100784</pub-id></citation></ref>
<ref id="B30"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Apostolidis</surname><given-names>KD</given-names></name><name><surname>Papakostas</surname><given-names>GA</given-names></name></person-group>. <article-title>A survey on adversarial deep learning robustness in medical image analysis</article-title>. <source>Electronics (Basel)</source>. (<year>2021</year>) <volume>10</volume>:<fpage>2132</fpage>. <pub-id pub-id-type="doi">10.3390/electronics10172132</pub-id></citation></ref>
<ref id="B31"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Joel</surname><given-names>MZ</given-names></name><name><surname>Umrao</surname><given-names>S</given-names></name><name><surname>Chang</surname><given-names>E</given-names></name><name><surname>Choi</surname><given-names>R</given-names></name><name><surname>Yang</surname><given-names>DX</given-names></name><name><surname>Duncan</surname><given-names>JS</given-names></name><etal/></person-group> <article-title>Using adversarial images to assess the robustness of deep learning models trained on diagnostic images in oncology</article-title>. <source>JCO Clin Cancer Inform</source>. (<year>2022 Feb</year>) <volume>6</volume>:<fpage>e2100170</fpage>. <pub-id pub-id-type="doi">10.1200/CCI.21.00170</pub-id><pub-id pub-id-type="pmid">35271304</pub-id></citation></ref>
<ref id="B32"><label>32.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Azizi</surname><given-names>S</given-names></name><name><surname>Culp</surname><given-names>L</given-names></name><name><surname>Freyburg</surname><given-names>J</given-names></name><name><surname>Mustafa</surname><given-names>B</given-names></name><name><surname>Baur</surname><given-names>S</given-names></name><name><surname>Kornblith</surname><given-names>S</given-names></name><etal/></person-group> <comment>Robust and efficient medical imaging with self-supervision</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:2205.09723</italic> (<year>2022</year>).</citation></ref>
<ref id="B33"><label>33.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dai</surname><given-names>Y</given-names></name><name><surname>Qian</surname><given-names>Y</given-names></name><name><surname>Lu</surname><given-names>F</given-names></name><name><surname>Wang</surname><given-names>B</given-names></name><name><surname>Gu</surname><given-names>Z</given-names></name><name><surname>Wang</surname><given-names>W</given-names></name><etal/></person-group> <article-title>Improving adversarial robustness of medical imaging systems via adding global attention noise</article-title>. <source>Comput Biol Med</source>. (<year>2023</year>) <volume>164</volume>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2023.107251</pub-id></citation></ref>
<ref id="B34"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>An</surname><given-names>G</given-names></name></person-group>. <article-title>The effects of adding noise during backpropagation training on a generalization performance</article-title>. <source>Neural Comput</source>. (<year>1996</year>) <volume>8</volume>(<issue>3</issue>):<fpage>643</fpage>&#x2013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1162/neco.1996.8.3.643</pub-id></citation></ref>
<ref id="B35"><label>35.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Deng</surname><given-names>J</given-names></name><name><surname>Dong</surname><given-names>W</given-names></name><name><surname>Socher</surname><given-names>R</given-names></name><name><surname>Li</surname><given-names>L-J</given-names></name><name><surname>Li</surname><given-names>K</given-names></name><name><surname>Fei-Fei</surname><given-names>L</given-names></name></person-group>. <conf-name>ImageNet: a large-scale hierarchical image database</conf-name>. <conf-name>2009 IEEE Conference on Computer Vision and Pattern Recognition</conf-name>; <conf-loc>Miami, FL, USA</conf-loc> (<year>2009</year>). p. <fpage>248</fpage>&#x2013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2009.5206848</pub-id></citation></ref>
<ref id="B36"><label>36.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>H</given-names></name><name><surname>Cisse</surname><given-names>M</given-names></name><name><surname>Dauphin</surname><given-names>Y</given-names></name><name><surname>Lopez-Paz</surname><given-names>D</given-names></name></person-group>. <comment>mixup: Beyond Empirical Risk Minimization</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:1710.09412</italic> (<year>2017</year>).</citation></ref>
<ref id="B37"><label>37.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>L</given-names></name><name><surname>Deng</surname><given-names>Z</given-names></name><name><surname>Kawaguchi</surname><given-names>K</given-names></name><name><surname>Ghorbani</surname><given-names>A</given-names></name><name><surname>Zou</surname><given-names>J</given-names></name></person-group>. <comment>How Does Mixup Help With Robustness and Generalization?</comment> <italic>arXiv</italic> [Preprint] <italic>arXiv:2010.04819</italic> (<year>2020</year>).</citation></ref>
<ref id="B38"><label>38.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Boyat</surname><given-names>A</given-names></name><name><surname>Joshi</surname><given-names>B</given-names></name></person-group>. <comment>A Review Paper: Noise Models in Digital Image Processing</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:1505.03489</italic> (<year>2015</year>).</citation></ref>
<ref id="B39"><label>39.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ahmed</surname><given-names>N</given-names></name><name><surname>Natarajan</surname><given-names>T</given-names></name><name><surname>Rao</surname><given-names>KR</given-names></name></person-group>. <article-title>Discrete cosine transform</article-title>. <source>IEEE Trans Comput</source>. (<year>1974</year>) <volume>C-23</volume>(<issue>1</issue>):<fpage>90</fpage>&#x2013;<lpage>3</lpage>. <pub-id pub-id-type="doi">10.1109/T-C.1974.223784</pub-id></citation></ref>
<ref id="B40"><label>40.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Paszke</surname><given-names>A</given-names></name><name><surname>Gross</surname><given-names>S</given-names></name><name><surname>Massa</surname><given-names>F</given-names></name><name><surname>Lerer</surname><given-names>A</given-names></name><name><surname>Bradbury</surname><given-names>J</given-names></name><name><surname>Chanan</surname><given-names>G</given-names></name><etal/></person-group> <comment>PyTorch: An Imperative Style, High-Performance Deep Learning Library</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:1912.01703</italic> (<year>2019</year>).</citation></ref>
<ref id="B41"><label>41.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Abadi</surname><given-names>M</given-names></name><name><surname>Barham</surname><given-names>P</given-names></name><name><surname>Chen</surname><given-names>J</given-names></name><name><surname>Chen</surname><given-names>Z</given-names></name><name><surname>Davis</surname><given-names>A</given-names></name><name><surname>Dean</surname><given-names>J</given-names></name><etal/></person-group> <comment>TensorFlow: A system for large-scale machine learning</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:1603.04467</italic> (<year>2016</year>).</citation></ref>
<ref id="B42"><label>42.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Yuan</surname><given-names>Z</given-names></name><name><surname>Yan</surname><given-names>Y</given-names></name><name><surname>Sonka</surname><given-names>M</given-names></name><name><surname>Yang</surname><given-names>T</given-names></name></person-group>. <comment>Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification</comment>. <italic>arXiv</italic> [Preprint] <italic>arXiv:2012.03173</italic> (<year>2020</year>).</citation></ref>
<ref id="B43"><label>43.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Culjak</surname><given-names>I</given-names></name><name><surname>Abram</surname><given-names>D</given-names></name><name><surname>Pribanic</surname><given-names>T</given-names></name><name><surname>Dzapo</surname><given-names>H</given-names></name><name><surname>Cifrek</surname><given-names>M</given-names></name></person-group>. <conf-name>A brief introduction to OpenCV</conf-name>. <conf-name>2012 Proceedings of the 35th International Convention MIPRO</conf-name>; <conf-loc>Opatija, Croatia</conf-loc> (<year>2012</year>). p. <fpage>1725</fpage>&#x2013;<lpage>30</lpage>.</citation></ref>
<ref id="B44"><label>44.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>DeVries</surname><given-names>T</given-names></name><name><surname>Taylor</surname><given-names>GW</given-names></name></person-group>. <article-title>Improved regularization of convolutional neural networks with cutout</article-title>. <italic>arXiv</italic> [Preprint] <italic>arXiv:1708.04552</italic> (<year>2017</year>).</citation></ref></ref-list>
</back>
</article>