<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Vet. Sci.</journal-id>
<journal-title>Frontiers in Veterinary Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Vet. Sci.</abbrev-journal-title>
<issn pub-type="epub">2297-1769</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fvets.2023.1236579</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Veterinary Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A deep learning model for automated kidney calculi detection on non-contrast computed tomography scans in dogs</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Ji</surname> <given-names>Yewon</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2058173/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Hwang</surname> <given-names>Gyeongyeon</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2339666/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Lee</surname> <given-names>Sang Jun</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1893669/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Lee</surname> <given-names>Kichang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1592078/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Yoon</surname> <given-names>Hakyoung</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1028259/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Veterinary Medical Imaging, College of Veterinary Medicine, Jeonbuk National University</institution>, <addr-line>Iksan</addr-line>, <country>Republic of Korea</country></aff>
<aff id="aff2"><sup>2</sup><institution>Division of Electronic Engineering, College of Engineering, Jeonbuk National University</institution>, <addr-line>Jeonju</addr-line>, <country>Republic of Korea</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Sang-Kwon Lee, Kyungpook National University, Republic of Korea</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Jihye Choi, Seoul National University, Republic of Korea; Biswajit Bhowmick, The University of Tennessee, Knoxville, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Hakyoung Yoon <email>hyyoon&#x00040;jbnu.ac.kr</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>09</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>10</volume>
<elocation-id>1236579</elocation-id>
<history>
<date date-type="received">
<day>07</day>
<month>06</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>09</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Ji, Hwang, Lee, Lee and Yoon.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Ji, Hwang, Lee, Lee and Yoon</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>Nephrolithiasis is one of the most common urinary disorders in dogs. Although a majority of kidney calculi are non-obstructive and are likely to be asymptomatic, they can lead to parenchymal loss and obstruction as they progress. Thus, early diagnosis of kidney calculi is important for patient monitoring and better prognosis. However, detecting kidney calculi and monitoring changes in the sizes of the calculi from computed tomography (CT) images is time-consuming for clinicians. This study, the first of its kind, aimed to develop a deep learning model for automatic kidney calculi detection using pre-contrast CT images of dogs. A total of 34,655 transverse image slices obtained from 76 dogs with kidney calculi were used to develop the deep learning model. Because of the differences in kidney location and calculi sizes in dogs compared to humans, several processing methods were used. The first stage of the models, based on the Attention U-Net (AttUNet), was designed to localize the kidney using coarse feature maps. Five different models&#x02013;AttUNet, UTNet, TransUNet, SwinUNet, and RBCANet&#x02013;were used in the second stage to detect the calculi in the kidneys, and the performance of the models was evaluated. Compared with a previously developed model, all the models developed in this study yielded better dice similarity coefficients (DSCs) for the automatic segmentation of the kidney. For kidney calculi detection, RBCANet and SwinUNet yielded the best DSC (0.74). In conclusion, the deep learning model developed in this study can be useful for the automated detection of kidney calculi.</p></abstract>
<kwd-group>
<kwd>artificial intelligence model</kwd>
<kwd>renal calculi</kwd>
<kwd>urolithiasis</kwd>
<kwd>computed tomography</kwd>
<kwd>canine</kwd>
</kwd-group>
<counts>
<fig-count count="5"/>
<table-count count="2"/>
<equation-count count="3"/>
<ref-count count="39"/>
<page-count count="11"/>
<word-count count="7384"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Veterinary Imaging</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Urinary calculi in the kidneys and upper and lower urinary tracts are among the most common abnormal findings in canine urinary disorders. According to a recent study, the prevalence of upper and lower urinary tract uroliths was reported to be 19% and 41%, respectively (<xref ref-type="bibr" rid="B1">1</xref>). In dogs, most urinary calculi are reported to be in the lower urinary tract, for example, the bladder and urethra, or are voided in the urine (<xref ref-type="bibr" rid="B2">2</xref>). Less than 3&#x02013;4% of all urinary calculi in dogs are located in the renal pelvis (<xref ref-type="bibr" rid="B2">2</xref>, <xref ref-type="bibr" rid="B3">3</xref>), while most human patients with urinary calculi are reported to have nephroliths (<xref ref-type="bibr" rid="B3">3</xref>).</p>
<p>Renal calculi can be asymptomatic in many dogs; however, when the size or location of the calculi changes, they are no longer silent, and can lead to clinical problems such as partial or complete ureteropelvic junction obstruction, hydronephrosis, renal parenchymal loss due to growing calculi, hematuria, and urinary tract infection due to infected calculi (<xref ref-type="bibr" rid="B4">4</xref>). In addition, studies in human medicine have reported that renal calculi can be associated with an increased risk of chronic kidney disease (<xref ref-type="bibr" rid="B5">5</xref>&#x02013;<xref ref-type="bibr" rid="B8">8</xref>).</p>
<p>Therefore, early detection and size quantification of urinary calculi are important to prevent severe kidney diseases associated with calculi, and to provide better and timely treatment. Owing to their importance, several diagnostic imaging modalities, including X-rays, ultrasound, and computed tomography (CT), have been used to detect urinary calculi in both veterinary and human medicine. Among these methods, CT is reported to be the most accurate for detecting calculi with high sensitivity and specificity (<xref ref-type="bibr" rid="B9">9</xref>). However, evaluating and quantifying renal calculi with these methods is time-consuming in clinical practice because the size and number of calculi are measured manually.</p>
<p>Recently, numerous studies in human medicine have shown that deep learning models can be successfully applied to medical imaging tasks such as classification, segmentation, and lesion detection (<xref ref-type="bibr" rid="B10">10</xref>&#x02013;<xref ref-type="bibr" rid="B13">13</xref>). Convolutional neural networks, a recent advancement in deep learning-based analysis methods, have shown promising performance in these tasks (<xref ref-type="bibr" rid="B14">14</xref>). To date, several novel architectures have been proposed for training using medical images. Attention U-Net (AttUNet), which integrates an attention gate into the U-Net model, consistently improves the prediction performance of U-Net on abdominal CT datasets for multiclass image segmentation (<xref ref-type="bibr" rid="B15">15</xref>). Recently, a hybrid transformer architecture called UTNet was proposed. UTNet integrates self-attention into a convolutional neural network, which allows transformer-based models to be trained without pre-trained weights, whereas conventional transformers require a large amount of data to learn vision inductive bias (<xref ref-type="bibr" rid="B16">16</xref>). In addition, TransUNet, an architecture that uses a transformer as an encoder in combination with U-Net, aims to enhance finer details, and has yielded promising performance on medical images for multi-organ segmentation and cardiac segmentation (<xref ref-type="bibr" rid="B17">17</xref>).</p>
<p>In human medicine, many studies have proposed various deep learning models for the segmentation and detection of kidneys (<xref ref-type="bibr" rid="B18">18</xref>&#x02013;<xref ref-type="bibr" rid="B20">20</xref>) and kidney tumors (<xref ref-type="bibr" rid="B21">21</xref>&#x02013;<xref ref-type="bibr" rid="B23">23</xref>). Many recent human medicine studies have proposed deep learning models with several architectures for automatic kidney stone detection on CT images (<xref ref-type="bibr" rid="B24">24</xref>&#x02013;<xref ref-type="bibr" rid="B27">27</xref>). In veterinary medicine, a recent study proposed a deep learning model based on the UNet Transformer to detect the kidney and automatically estimate its volume from the pre- and post-contrast CT images of dogs (<xref ref-type="bibr" rid="B28">28</xref>). However, no deep learning model has yet been proposed for the automated detection of kidney calculi from CT images in veterinary medicine.</p>
<p>In this study, we aimed to develop deep learning models for the automatic detection of kidney calculi and kidneys from non-contrast CT scans in dogs, and to evaluate the performance of these models.</p></sec>
<sec sec-type="materials and methods" id="s2">
<title>2. Materials and methods</title>
<sec>
<title>2.1. Dataset for CT scans</title>
<p>A total of 167 pre-contrast CT scans of 167 dogs were randomly collected from multiple centers. The following instruments were used: Alexion, TSX-034A (Canon Medical Systems Europe B.V., Zoetermeer, Netherlands); Revolution ACT (GE Healthcare, Milwaukee, WI, USA); and Brivo CT385 (GE Healthcare, Milwaukee, WI, USA). Among the 167 pre-contrast CT scans, 34,655 transverse images from 76 CT scans included kidney calculi, and were used for training and validation. The imaging protocols were as follows: 120 kVp, 150 mAs, 512 &#x000D7; 512 matrix, and a rotation time of 0.75 s (Alexion); 120 kVp, 84 mAs, 512 &#x000D7; 512 matrix, and a rotation time of 1 s (Revolution ACT); and 120 kVp, 69 mAs, 512 &#x000D7; 512 matrix, and a rotation time of 1 s (Brivo CT385). The slice thickness of the CT scans included in the study varied from 0.75 mm to 2.5 mm. Post-contrast CT scans were not included in the present study.</p>
<p>The pre-contrast CT images included in this study were divided into training and validation data at a ratio of 80 to 20. Accordingly, 61 CT scans were randomly chosen as the training data and 15 CT scans were used as the validation data. The CT scans of dogs without medical records were excluded from the study. In addition, scans with motion artifacts, without volume information, or with an axis smaller than a certain size were excluded.</p>
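<p>A scan-level split of this kind can be sketched as follows. This is only an illustration: the function name, the fixed seed, and the rounding rule are assumptions, since the exact randomization procedure is not described here. Splitting by scan rather than by slice keeps all images from one dog in the same subset.</p>
<preformat preformat-type="code">
```python
import random


def split_scans(scan_ids, train_fraction=0.8, seed=0):
    """Shuffle scan IDs and split them at the scan level.

    The seed and the rounding rule are illustrative assumptions.
    """
    ids = list(scan_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 76 scans with calculi, identified here by hypothetical integer IDs.
train_ids, val_ids = split_scans(range(76))
print(len(train_ids), len(val_ids))  # 61 15
```
</preformat>
<p>With 76 scans and a training fraction of 0.8, rounding 60.8 to the nearest integer gives the 61 and 15 scans reported above.</p>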
</sec>
<sec>
<title>2.2. Patient dataset</title>
<p>In this study, 76 CT scans from 76 dogs with kidney calculi were included. The medical records of the dogs, including data on age, sex, neutering status, body weight, and laboratory examination results, were collected.</p>
</sec>
<sec>
<title>2.3. Manual segmentation</title>
<p>The pre-contrast CT scans of the dogs included in the study were manually segmented by 10 clinicians (residents at the Veterinary Medical Imaging Department of the Teaching Hospital of Jeonbuk National University) using the Medilabel software (Ingradient, Inc., Seoul, South Korea). From the images, all kidneys were segmented into the following three classes: (1) renal parenchyma; (2) renal pelvis and surrounding fat; and (3) calculi (<xref ref-type="fig" rid="F1">Figure 1</xref>). The renal pelvis and the fat around it were labeled separately to prevent false training results wherein the model recognizes the fat around the pelvis as the kidney.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Examples of manual segmentation: original CT image at the level of the kidney <bold>(A)</bold> and the corresponding manual segmentation <bold>(B)</bold>. The kidneys in the pre-contrast images were manually segmented using a segmentation tool (Medilabel software). In the pre-contrast images, the kidneys were segmented into three classes: parenchyma (Class 1, light blue color in the labeled image), renal pelvis and surrounding fat (Class 2, orange color in the labeled image), and calculi (Class 3, light pink color in the labeled image).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fvets-10-1236579-g0001.tif"/>
</fig>
</sec>
<sec>
<title>2.4. Pre-processing</title>
<p>The training data were converted into NumPy arrays and pre-processed using the following steps: data resampling, intensity normalization, and non-zero region cropping, using Python and the PyTorch framework.</p>
<p>Non-zero cropping obtains only the actual region of interest (RoI) by cropping out the background. External structures, such as the frames used to fix animals on the CT table during scans, were masked, and voxels below &#x02212;1,000 Hounsfield units were considered background and cropped. Data were resized to 512 &#x000D7; 512 pixels.</p>
<p>Intensity normalization was performed to clip the minimum and maximum to Hounsfield Unit (HU) values of &#x02212;155 and 195, respectively.</p>
<p>To address variations in the spatial spacing and slice thickness of the CT scans used in this study, resampling was performed to adjust the various pixel dimensions and standardize the data to a uniform voxel spacing of x = 0.5, y = 0.5, and z = 1.4 (mm). To preprocess the training data, which only included pre-contrast CT scans, the window width and level were set to 350 and 30 HU, respectively, and the minimum and maximum HU values were clipped to &#x02212;155 and 195, respectively, before applying min-max normalization (Minimum = Window level &#x02212; <inline-formula><mml:math id="M1"><mml:mfrac><mml:mrow><mml:mtext>Window&#x000A0;width</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>, Maximum = Window level &#x0002B; <inline-formula><mml:math id="M2"><mml:mfrac><mml:mrow><mml:mtext>Window&#x000A0;width</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula>).</p>
<p>This technique ensured that the intensity values of the images were consistent and comparable across different scans.</p>
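<p>The clipping and min-max normalization step can be illustrated with a minimal sketch. It uses the stated clip bounds of &#x02212;155 and 195 HU directly; the function names are hypothetical, and plain Python lists are used only to keep the example self-contained (in practice the same operation would typically be applied to a NumPy array).</p>
<preformat preformat-type="code">
```python
def normalize_hu(value, hu_min=-155.0, hu_max=195.0):
    """Clip one HU value to [hu_min, hu_max] and rescale it to [0, 1]."""
    clipped = max(hu_min, min(hu_max, value))
    return (clipped - hu_min) / (hu_max - hu_min)


def normalize_slice(slice_hu):
    """Apply normalize_hu to every pixel of a 2D slice given as nested lists."""
    return [[normalize_hu(v) for v in row] for row in slice_hu]


# Air, lower bound, a mid-window soft-tissue value, and a dense value.
example = [[-1000.0, -155.0], [20.0, 400.0]]
print(normalize_slice(example))  # [[0.0, 0.0], [0.5, 1.0]]
```
</preformat>
<p>Values outside the window saturate at 0 or 1, so very dense structures such as calculi all map to the upper end of the normalized range.</p>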
</sec>
<sec>
<title>2.5. Model architecture</title>
<p>In this study, several model architectures previously employed for various image segmentation tasks were utilized, including Attention U-Net (AttUNet) (<xref ref-type="bibr" rid="B15">15</xref>), UTNet (<xref ref-type="bibr" rid="B16">16</xref>), TransUNet (<xref ref-type="bibr" rid="B17">17</xref>), SwinUNet (<xref ref-type="bibr" rid="B29">29</xref>), and RBCANet (<xref ref-type="bibr" rid="B21">21</xref>). The overall block diagram of the model architecture is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. These five models, based on transformers and convolutional neural networks (CNNs), were selected for their reliability and efficiency in previous studies, which showed high accuracy and stability on medical images such as CT and magnetic resonance imaging (MRI) in human medicine.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Schematic illustration of the model architectures used in this study. For Stage 1, AttUNet, a convolution-based model integrated with an attention gate, was used. For Stage 2, five different models were used and compared. UTNet <bold>(A)</bold>, TransUNet <bold>(B)</bold>, and SwinUNet <bold>(C)</bold> are architectures based on transformer models. AttUNet <bold>(D)</bold> and RBCANet <bold>(E)</bold> are based on convolution models.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fvets-10-1236579-g0002.tif"/>
</fig>
<p>The AttUNet model extends the original U-Net architecture by incorporating attention gates into the skip connections (<xref ref-type="bibr" rid="B15">15</xref>). The architecture maintains the encoder-decoder structure of the U-Net, with downsampling layers in the encoder and upsampling layers in the decoder. Attention gates are added to the skip connections, allowing the model to focus selectively on the most relevant features in the input image. The attention gates learn spatial dependencies and feature importance using attention mechanisms that are often implemented through additive or multiplicative approaches. This selective focus on relevant features leads to improved segmentation performance.</p>
<p>UTNet has a U-shaped architecture, similar to the original U-Net, with an encoder-decoder structure and skip connections (<xref ref-type="bibr" rid="B16">16</xref>). The main difference lies in the encoder part, which consists of transformer-based layers instead of standard convolutional layers. These layers capture both local and global features from the input image, while the decoder uses convolutional layers to upsample the feature maps and generate the final segmentation map. The combination of transformers and convolutions allows UTNet to effectively segment images with complex structures, such as ultrathin endoscope images.</p>
<p>TransUNet combines the U-Net architecture with a vision transformer to create a hybrid model (<xref ref-type="bibr" rid="B17">17</xref>). The vision transformer is used as an encoder, replacing the standard convolutional layers of the U-Net architecture. The vision transformer divides the input image into non-overlapping patches and processes them using self-attention and positional encoding, allowing it to effectively capture global contextual information. The decoder part of TransUNet remains similar to that of the original U-Net, using upsampling layers and skip connections to generate the final segmentation map. This combination of the vision transformer and U-Net architecture enables TransUNet to capture both local and global context information, resulting in improved segmentation performance.</p>
<p>SwinUNet incorporates the Swin transformer as its encoder, replacing the standard convolutional layers of the U-Net architecture (<xref ref-type="bibr" rid="B29">29</xref>). The Swin transformer is a hierarchical transformer-based architecture that uses shifted windows to process input images, capturing both local and global context information while maintaining relatively low computational complexity. The decoder part of SwinUNet retains the original U-Net design, and consists of upsampling layers and skip connections. By combining the strengths of both the Swin transformer and the U-Net architecture, SwinUNet achieves improved performance in various image segmentation tasks.</p>
<p>The RBCANet architecture utilizes a pre-trained DenseNet-161 encoder and U-Net&#x00027;s encoder&#x02013;decoder structure to effectively capture hierarchical features (<xref ref-type="bibr" rid="B21">21</xref>). The atrous spatial pyramid pooling module, integrated within the skip connections, processes input features at various scales, capturing both local and global contextual information that is crucial for accurate segmentation. Working in conjunction with the Reverse Boundary Attention and Channel Attention modules, RBCANet improves the segmentation performance by emphasizing accurate boundary predictions and focusing on the most informative channels.</p>
<p>A major challenge in this study was the variability of the data, owing to the relatively small size of calculi in the full CT images, the varying sizes of the dogs, and the collection of data from multiple centers. To overcome this problem, our approach involved two stages and employed five architectures. In Stage 1, we utilized an AttUNet-based model to obtain an approximate RoI of the kidney through coarse feature maps. This step involved automatic segmentation of the kidney from the input CT image, followed by extraction of the RoI by cropping to the non-zero region of the segmentation, which contains the kidney and excludes the background. The extracted RoI was then resized to 128 &#x000D7; 128 pixels and used as input for Stage 2.</p>
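<p>The RoI extraction in Stage 1 can be sketched as a bounding-box crop around the non-zero pixels of a predicted kidney mask. This is a minimal illustration under that assumption, not the implementation used in the study: the function names and the optional margin parameter are hypothetical, and the subsequent resizing to 128 &#x000D7; 128 pixels is omitted.</p>
<preformat preformat-type="code">
```python
def mask_bounding_box(mask):
    """Return (row_min, row_max, col_min, col_max) of non-zero pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_to_mask(image, mask, margin=0):
    """Crop a 2D image (nested lists) to the bounding box of a binary mask.

    The margin (in pixels) is an illustrative assumption; an empty mask
    returns the image unchanged.
    """
    box = mask_bounding_box(mask)
    if box is None:
        return image
    r0, r1, c0, c1 = box
    r0 = max(r0 - margin, 0)
    c0 = max(c0 - margin, 0)
    r1 = min(r1 + margin, len(image) - 1)
    c1 = min(c1 + margin, len(image[0]) - 1)
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
print(crop_to_mask(image, mask))  # [[6, 7], [10, 11]]
```
</preformat>
<p>Cropping before Stage 2 increases the fraction of the input occupied by the kidney, which matters when the calculi are small relative to the full CT image.</p>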
<p>In Stage 2, the RoI obtained from Stage 1 was used as input. We evaluated five models for kidney stone segmentation: UTNet (<xref ref-type="fig" rid="F2">Figure 2A</xref>), TransUNet (<xref ref-type="fig" rid="F2">Figure 2B</xref>), SwinUNet (<xref ref-type="fig" rid="F2">Figure 2C</xref>), AttUNet (<xref ref-type="fig" rid="F2">Figure 2D</xref>), and RBCANet (<xref ref-type="fig" rid="F2">Figure 2E</xref>). Their performance was compared to determine the most effective architecture for kidney stone segmentation in CT images.</p>
</sec>
<sec>
<title>2.6. Implementation details</title>
<p>The models used one input channel and three output channels and were constructed using the PyTorch framework. A combined loss function consisting of a weighted sum of the cross-entropy and dice loss functions is known to improve the performance of segmentation networks (<xref ref-type="bibr" rid="B30">30</xref>&#x02013;<xref ref-type="bibr" rid="B33">33</xref>). In this study, a combined loss function consisting of generalized dice loss and focal loss was used to improve model performance even with data imbalance. The loss function used in this study was as follows:</p>
<disp-formula id="E1"><mml:math id="M3"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mo>&#x000A0;</mml:mo><mml:mi>f</mml:mi><mml:mi>u</mml:mi><mml:mi>n</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mi>&#x003B1;</mml:mi><mml:mo>*</mml:mo><mml:mi>G</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi><mml:mi>d</mml:mi><mml:mo>&#x000A0;</mml:mo><mml:mi>D</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:mi>&#x003B2;</mml:mi><mml:mo>*</mml:mo><mml:mi>F</mml:mi><mml:mi>o</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>&#x003B1;</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x000A0;</mml:mo><mml:mi>&#x003B2;</mml:mi><mml:mo>=</mml:mo><mml:mn>0.5</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E2"><mml:math id="M4"><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:mi>G</mml:mi><mml:mi>e</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>z</mml:mi><mml:mi>e</mml:mi><mml:mi>d</mml:mi><mml:mo>&#x000A0;</mml:mo><mml:mi>D</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mo>&#x000A0;</mml:mo><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mn>2</mml:mn><mml:mfrac><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mn>2</mml:mn></mml:msubsup><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mi>n</mml:mi></mml:msub><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>ln</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>ln</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mn>2</mml:mn></mml:msubsup><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mi>n</mml:mi></mml:msub><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>ln</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>ln</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></disp-formula>
<p>(<italic>p</italic><sub><italic>ln</italic></sub>, predicted value for foreground label <italic>l</italic> at image element <italic>n</italic>; <italic>r</italic><sub><italic>ln</italic></sub>, ground truth of kidney and calculi for label <italic>l</italic> at image element <italic>n</italic>; <italic>l</italic>, foreground label; <italic>w</italic><sub><italic>l</italic></sub>, <inline-formula><mml:math id="M5"><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>, weight of label <italic>l</italic> based on its number of pixels)</p>
<disp-formula id="E3"><mml:math id="M6"><mml:mrow><mml:mtext>Focal&#x000A0;loss</mml:mtext><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:msup><mml:mi>log</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mi>p</mml:mi></mml:mtd><mml:mtd><mml:mtext>if&#x000A0;</mml:mtext><mml:mi>y</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mi>p</mml:mi></mml:mtd><mml:mtd><mml:mtext>otherwise</mml:mtext></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>,</mml:mo><mml:mo>&#x000A0;</mml:mo><mml:mi>&#x003B3;</mml:mi><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:math></disp-formula>
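<p>The combined loss defined above can be written out as a short, framework-free sketch. It follows the three equations with the stated weights of 1 for the generalized dice loss, 0.5 for the focal loss, and a focusing parameter of 2. Several details are assumptions made only for illustration: the function names, the small constant eps added for numerical stability, the treatment of each label channel as a binary problem, and the averaging of the focal loss over all elements.</p>
<preformat preformat-type="code">
```python
import math


def generalized_dice_loss(probs, targets, eps=1e-7):
    """probs, targets: one list per label, each with one value per pixel."""
    numerator = 0.0
    denominator = 0.0
    for p_l, r_l in zip(probs, targets):
        # Label weight: inverse squared number of ground-truth pixels.
        w_l = 1.0 / (sum(r_l) ** 2 + eps)
        numerator += w_l * sum(r * p for r, p in zip(r_l, p_l))
        denominator += w_l * sum(r + p for r, p in zip(r_l, p_l))
    return 1.0 - 2.0 * numerator / (denominator + eps)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean focal loss over all labels and pixels (mean reduction assumed)."""
    total = 0.0
    count = 0
    for p_l, r_l in zip(probs, targets):
        for p, y in zip(p_l, r_l):
            p_t = p if y == 1 else 1.0 - p
            total += -((1.0 - p_t) ** gamma) * math.log(max(p_t, eps))
            count += 1
    return total / count


def combined_loss(probs, targets, alpha=1.0, beta=0.5):
    """Weighted sum of generalized dice loss and focal loss."""
    return (alpha * generalized_dice_loss(probs, targets)
            + beta * focal_loss(probs, targets))


targets = [[1, 1, 0, 0], [0, 0, 1, 1]]
good = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.8]]
bad = [[0.4, 0.3, 0.6, 0.7], [0.6, 0.7, 0.4, 0.3]]
print(combined_loss(good, targets), combined_loss(bad, targets))
```
</preformat>
<p>Because the label weight is the inverse squared pixel count, a label with few pixels, such as calculi, contributes to the dice term on a scale comparable to that of larger structures, while the focal term down-weights pixels that are already classified confidently.</p>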
<p>To improve the model training performance, data augmentation was performed using ShiftScaleRotate, GridDistortion, OpticalDistortion, ElasticTransform, CoarseDropout, and GaussNoise from the albumentations library (<xref ref-type="bibr" rid="B34">34</xref>). The parameters for each transform were as follows: for ShiftScaleRotate, the scale limit was (&#x02212;0.2, 0.2) and the rotate limit was (&#x02212;180, 180); for GridDistortion, the number of grid cells was five on each side and the distort limit was (&#x02212;0.03, 0.03); for OpticalDistortion, the distort limit was (&#x02212;0.05, 0.05); for ElasticTransform, which displaces pixels using displacement fields, &#x003B1; and &#x003C3; were set to 1.1 and 0.5, respectively; for CoarseDropout, the maximum height and maximum width were set to 8; and for GaussNoise, the variance limit was (0, 0.001) and the mean was 0. Data augmentation was applied only during training and not during validation.</p>
<p>Deep learning model training was conducted for 100 epochs on an NVIDIA RTX 3090 graphics processing unit using Python and the PyTorch framework. An exponential learning rate scheduler was applied with an initial learning rate of 0.01, and each random data augmentation was applied with a probability of 0.5. An SGD optimizer was used for training with a batch size of 32, a momentum of 0.9, and a weight decay of 1e-4.</p>
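<p>The effect of an exponential learning rate schedule over the 100 training epochs can be illustrated as follows. The decay factor gamma is not reported in this study, so the value of 0.95 below is purely a placeholder, and the function name is hypothetical. The rule shown, in which the learning rate is multiplied by gamma once per epoch, matches the behavior of the ExponentialLR scheduler in PyTorch.</p>
<preformat preformat-type="code">
```python
def exponential_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` decay steps: initial_lr * gamma ** epoch."""
    return initial_lr * gamma ** epoch


# Initial learning rate from the text; gamma = 0.95 is an assumed placeholder.
schedule = [exponential_lr(0.01, 0.95, epoch) for epoch in range(100)]
print(schedule[0], schedule[-1])
```
</preformat>
<p>A smaller gamma shrinks the step size faster, which trades early progress for more stable updates late in training.</p>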
</sec>
<sec>
<title>2.7. Model metrics and statistical evaluation</title>
<p>Several evaluation metrics were used to evaluate model performance. The Dice similarity coefficient (DSC) measures the relative voxel overlap between the segmentation predicted by an automated model and the ground truth. A DSC close to 1 implies high similarity. The DSC was calculated using the following formula: <inline-formula><mml:math id="M7"><mml:mi>D</mml:mi><mml:mi>S</mml:mi><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02229;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mo>&#x0002B;</mml:mo><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:math></inline-formula> (<italic>S</italic><sub><italic>p</italic></sub>, pixels of the predicted segmentation; <italic>S</italic><sub><italic>g</italic></sub>, pixels of the ground truth segmentation).</p>
<p>Intersection over Union (IoU), which is similar to the DSC but penalizes under-segmentation and over-segmentation more than the DSC, was also used. The formula for IoU is as follows: <inline-formula><mml:math id="M8"><mml:mi>I</mml:mi><mml:mi>o</mml:mi><mml:mi>U</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02229;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0222A;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:math></inline-formula> (<italic>S</italic><sub><italic>p</italic></sub>, pixels of the predicted segmentation; <italic>S</italic><sub><italic>g</italic></sub>, pixels of the ground truth segmentation).</p>
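<p>The two overlap metrics can be sketched for a pair of binary masks as follows. This is an illustrative plain-Python example rather than the evaluation code of this study; the function names, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for the example.</p>

```python
def overlap_metrics(gt_mask, pred_mask):
    """Return (DSC, IoU) for two equally sized binary masks (flat lists of 0/1)."""
    if len(gt_mask) != len(pred_mask):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(1 for g, p in zip(gt_mask, pred_mask) if g == 1 and p == 1)
    gt_size = sum(gt_mask)
    pred_size = sum(pred_mask)
    union = gt_size + pred_size - intersection
    if union == 0:
        # Both masks empty: treated as perfect agreement (an assumption).
        return 1.0, 1.0
    dsc = 2.0 * intersection / (gt_size + pred_size)
    iou = intersection / union
    return dsc, iou


def iou_from_dsc(dsc):
    """For a single pair of masks, IoU = DSC / (2 - DSC)."""
    return dsc / (2.0 - dsc)


# Toy masks: 3 ground-truth pixels, 3 predicted pixels, 2 of them shared.
ground_truth = [1, 1, 1, 0, 0, 0]
prediction = [0, 1, 1, 1, 0, 0]
dsc, iou = overlap_metrics(ground_truth, prediction)
print(dsc, iou, iou_from_dsc(dsc))
```

<p>Because IoU = DSC / (2 &#x02212; DSC) for any single pair of masks, the IoU is never larger than the DSC, which is the sense in which it penalizes segmentation errors more strongly. As a consistency check, a DSC of 0.93446 corresponds to an IoU of approximately 0.877 under this relation.</p>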
<p>As sensitivity and specificity are recognized as standard metrics for performance evaluation in the medical field, both were used in this study (<xref ref-type="bibr" rid="B35">35</xref>, <xref ref-type="bibr" rid="B36">36</xref>). They were calculated as follows:</p>
<p>Sensitivity = <inline-formula><mml:math id="M9"><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> (TP, true positive; FN, false negative); Specificity = <inline-formula><mml:math id="M10"><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac><mml:mtext>&#x000A0;</mml:mtext></mml:math></inline-formula>(TN, true negative; FP, false positive).</p>
<p>Precision and Accuracy were also used to evaluate the models. The formulae for precision and accuracy are as follows: Precision = <inline-formula><mml:math id="M11"><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> (TP, true positive; FP, false positive); Accuracy = <inline-formula><mml:math id="M12"><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> (TP, true positive; TN, true negative; FP, false positive; FN, false negative).</p>
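<p>The four count-based metrics above can be computed from a confusion matrix as in the following plain-Python sketch. The function name and the example counts are hypothetical and were chosen only to illustrate a strongly imbalanced case, in which negative pixels greatly outnumber positive ones; they are not results of this study.</p>

```python
def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision, and accuracy from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts: 100 positive pixels among 10,000 pixels in total.
metrics = confusion_metrics(tp=80, tn=9890, fp=10, fn=20)
for name, value in metrics.items():
    print(name, round(value, 4))
```

<p>In such an imbalanced setting the accuracy is dominated by the true negatives and can remain close to 1 even when sensitivity and precision are noticeably lower, which is why the metrics are best read together.</p>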
<p>The receiver operating characteristic (ROC) curve is a line plot that depicts the diagnostic ability of a classifier based on its performance at different thresholds. The ROC curve is an established standard metric for comparing multiple models and was used to evaluate the models (<xref ref-type="bibr" rid="B35">35</xref>). The area under the curve (AUC) summarizes the performance of a model across different thresholds and provides an aggregate measure ranging from 0 to 1. A value near 1 implies higher performance.</p>
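<p>A minimal plain-Python sketch of how an ROC curve, its AUC, and a Youden point can be derived from predicted scores is given below. It assumes that the Youden point is the threshold that maximizes J = sensitivity + specificity &#x02212; 1; the function names, labels, and scores are hypothetical and are not data from this study.</p>

```python
def roc_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate, threshold)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0, None)]  # starting corner of the curve
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((fp / neg, tp / pos, t))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def youden_point(points):
    """Threshold maximizing J = TPR - FPR, with its sensitivity and specificity."""
    fpr, tpr, threshold = max(points[1:], key=lambda pt: pt[1] - pt[0])
    return threshold, tpr, 1.0 - fpr


# Hypothetical labels (1 = calculus present) and model scores.
labels = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
points = roc_points(labels, scores)
print(auc(points), youden_point(points))
```

<p>The AUC obtained this way equals the fraction of positive and negative pairs that the scores rank correctly, which is why it is independent of any single threshold, whereas the sensitivity and specificity at the Youden point describe one specific operating threshold.</p>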
<p>For the statistical evaluation of the characteristics of the dogs included in the study, the Kolmogorov&#x02013;Smirnov and Shapiro&#x02013;Wilk tests were performed to assess normality. Mann&#x02013;Whitney tests were used to evaluate differences in age and body weight between dogs with and without kidney calculi.</p>
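<p>For readers unfamiliar with the Mann&#x02013;Whitney test, the sketch below computes its U statistics by direct pair counting in plain Python. It is illustrative only: the function name and the ages are hypothetical, the p-value calculation is omitted, and it does not reproduce the statistical software used in this study.</p>

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): pair counts in which one sample exceeds the other.

    Each pair (a, b) adds 1 to U_a when a is larger and 0.5 when tied.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical ages in years for two small groups of dogs.
ages_with_calculi = [12, 10, 14, 9]
ages_without_calculi = [7, 8, 10, 5]
u_with, u_without = mann_whitney_u(ages_with_calculi, ages_without_calculi)
print(u_with, u_without)
```

<p>The test uses only the ordering of the observations, not their actual values, so it does not require the normality assumption examined by the Kolmogorov&#x02013;Smirnov and Shapiro&#x02013;Wilk tests.</p>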
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<sec>
<title>3.1. Evaluation of five models for kidney detection on pre-contrast CT scans</title>
<p>In Stage 1, SwinUNet showed the best DSC (0.943), followed by RBCANet (0.942), UTNet (0.935), and AttUNet and TransUNet (0.934). As the DSC measures the relative pixel overlap between the manual segmentation and the prediction of the models, a DSC close to 1 indicates higher similarity between the two segmentations.</p>
<p>The sensitivity and specificity at the Youden point of the models were the highest for RBCANet (sensitivity 0.96, specificity 0.95), followed by TransUNet (sensitivity 0.95, specificity 0.94), SwinUNet (sensitivity 0.95, specificity 0.96), UTNet (sensitivity 0.95, specificity 0.95), and AttUNet (sensitivity 0.94, specificity 0.95).</p>
<p>The ROC curves for kidney detection for the five models on the test set are shown in <xref ref-type="fig" rid="F3">Figure 3A</xref>. The AUC for detecting the kidney was 0.99 for TransUNet and SwinUNet, and 0.98 for UTNet, RBCANet, and AttUNet. An AUC close to 1 indicates better performance.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Receiver Operating Characteristic (ROC) curves of the five models. <bold>(A)</bold> ROC curves of the models for automatic segmentation of the kidneys. Area-under-the-curve (AUC) values for the models are as follows: 0.99 (TransUNet, SwinUNet), and 0.98 (UTNet, RBCANet, AttUNet). <bold>(B)</bold> ROC curves of the models for the automatic detection of kidney calculi. The AUC values of the models are as follows: 0.98 (TransUNet), 0.97 (SwinUNet), 0.92 (UTNet), 0.91 (RBCANet), and 0.90 (AttUNet). The sensitivity and specificity at the Youden point are shown in the graphs.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fvets-10-1236579-g0003.tif"/>
</fig>
<p>The quantitative metrics of the models, including IoU and precision, are summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Quantitative results of the models for detecting kidneys.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="center"><bold>UTNet</bold></th>
<th valign="top" align="center"><bold>TransUNet</bold></th>
<th valign="top" align="center"><bold>SwinUNet</bold></th>
<th valign="top" align="center"><bold>AttUNet</bold></th>
<th valign="top" align="center"><bold>RBCANet</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">DSC</td>
<td valign="top" align="center">0.93446</td>
<td valign="top" align="center">0.93393</td>
<td valign="top" align="center">0.94318</td>
<td valign="top" align="center">0.93418</td>
<td valign="top" align="center">0.94176</td>
</tr> <tr>
<td valign="top" align="left">IoU</td>
<td valign="top" align="center">0.87699</td>
<td valign="top" align="center">0.87605</td>
<td valign="top" align="center">0.89248</td>
<td valign="top" align="center">0.87649</td>
<td valign="top" align="center">0.88993</td>
</tr> <tr>
<td valign="top" align="left">Sensitivity</td>
<td valign="top" align="center">0.92480</td>
<td valign="top" align="center">0.94072</td>
<td valign="top" align="center">0.93632</td>
<td valign="top" align="center">0.92298</td>
<td valign="top" align="center">0.92660</td>
</tr> <tr>
<td valign="top" align="left">Specificity</td>
<td valign="top" align="center">0.96267</td>
<td valign="top" align="center">0.94944</td>
<td valign="top" align="center">0.96632</td>
<td valign="top" align="center">0.96368</td>
<td valign="top" align="center">0.97179</td>
</tr> <tr>
<td valign="top" align="left">Precision</td>
<td valign="top" align="center">0.93985</td>
<td valign="top" align="center">0.91745</td>
<td valign="top" align="center">0.93958</td>
<td valign="top" align="center">0.93656</td>
<td valign="top" align="center">0.95007</td>
</tr>
<tr>
<td valign="top" align="left">Accuracy</td>
<td valign="top" align="center">0.95432</td>
<td valign="top" align="center">0.95060</td>
<td valign="top" align="center">0.95889</td>
<td valign="top" align="center">0.95583</td>
<td valign="top" align="center">0.95979</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>DSC, Dice Similarity Coefficient; IoU, Intersection over Union; Sensitivity and Specificity refer to the values at the Youden point.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>3.2. Evaluation and comparative analysis of the models for kidney calculi detection on pre-contrast CT scans</title>
<p>The performance of several models in detecting kidney calculi was assessed in this study. AttUNet outperformed the other models, with the highest DSC of 0.741, suggesting its superior capability to accurately delineate the intricate structures of kidney calculi. The DSCs of the other models were as follows: SwinUNet, 0.736; RBCANet, 0.733; UTNet, 0.701; and TransUNet, 0.682.</p>
<p>In terms of sensitivity and specificity at the Youden point, TransUNet topped the list with values of 0.89 and 0.99, respectively. UTNet, SwinUNet, RBCANet, and AttUNet also exhibited competitive sensitivity and specificity values, demonstrating their ability to correctly identify true positive and true negative cases. The sensitivity and specificity of the models are summarized in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Quantitative results of the models for detecting kidney calculi.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="center"><bold>UTNet</bold></th>
<th valign="top" align="center"><bold>TransUNet</bold></th>
<th valign="top" align="center"><bold>SwinUNet</bold></th>
<th valign="top" align="center"><bold>AttUNet</bold></th>
<th valign="top" align="center"><bold>RBCANet</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">DSC</td>
<td valign="top" align="center">0.70051</td>
<td valign="top" align="center">0.68243</td>
<td valign="top" align="center">0.73590</td>
<td valign="top" align="center">0.74108</td>
<td valign="top" align="center">0.73277</td>
</tr> <tr>
<td valign="top" align="left">IoU</td>
<td valign="top" align="center">0.53907</td>
<td valign="top" align="center">0.51795</td>
<td valign="top" align="center">0.58215</td>
<td valign="top" align="center">0.58867</td>
<td valign="top" align="center">0.57825</td>
</tr> <tr>
<td valign="top" align="left">Sensitivity</td>
<td valign="top" align="center">0.86189</td>
<td valign="top" align="center">0.88586</td>
<td valign="top" align="center">0.86326</td>
<td valign="top" align="center">0.84275</td>
<td valign="top" align="center">0.83546</td>
</tr> <tr>
<td valign="top" align="left">Specificity</td>
<td valign="top" align="center">0.99078</td>
<td valign="top" align="center">0.98882</td>
<td valign="top" align="center">0.98086</td>
<td valign="top" align="center">0.99151</td>
<td valign="top" align="center">0.99015</td>
</tr> <tr>
<td valign="top" align="left">Precision</td>
<td valign="top" align="center">0.66803</td>
<td valign="top" align="center">0.61981</td>
<td valign="top" align="center">0.73345</td>
<td valign="top" align="center">0.72945</td>
<td valign="top" align="center">0.72854</td>
</tr>
<tr>
<td valign="top" align="left">Accuracy</td>
<td valign="top" align="center">0.99876</td>
<td valign="top" align="center">0.99861</td>
<td valign="top" align="center">0.99896</td>
<td valign="top" align="center">0.99894</td>
<td valign="top" align="center">0.99896</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>DSC, Dice Similarity Coefficient; IoU, Intersection over Union; Sensitivity and Specificity refer to the values at the Youden point.</p>
</table-wrap-foot>
</table-wrap>
<p>Further comparisons were made based on the ROC curves shown in <xref ref-type="fig" rid="F3">Figure 3B</xref>. TransUNet yielded the highest AUC of 0.98 for calculi detection, indicating its superior ability to differentiate between positive and negative cases of kidney calculi. The AUCs for the other models (in decreasing order) were as follows: SwinUNet, 0.97; UTNet, 0.92; RBCANet, 0.91; and AttUNet, 0.90. <xref ref-type="fig" rid="F4">Figure 4</xref> shows a visual comparison of the ground truth segmentations and the predictions generated by the models for the kidneys and kidney calculi.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Manual visual analysis of segmented kidneys and kidney calculi (Ground Truth, Red) and the predictions (Predictions, Blue) generated by each of the models.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fvets-10-1236579-g0004.tif"/>
</fig>
</sec>
<sec>
<title>3.3. Statistical evaluation of dogs included in the study</title>
<p>Among the dogs with kidney calculi included in this study, 32 were neutered males, 3 were intact males, 32 were neutered females, and 9 were intact females. The mean age of dogs with kidney calculi was 10.79 &#x000B1; 4.00 years (mean &#x000B1; SD), ranging from 7 months to 18 years. The mean body weight (BW) of the dogs was 5.25 &#x000B1; 2.49 kg (mean &#x000B1; SD), ranging from 1.7 kg to 14.85 kg. The distribution of breeds among these dogs was as follows: 21 Malteses, 10 Poodles, 7 Shih Tzus, 6 Yorkshire Terriers, 6 mixed breeds, 5 Pomeranians, 5 Schnauzers, 2 Cocker Spaniels, 2 Dachshunds, and 12 others.</p>
<p>Among the dogs without kidney calculi, 45 were neutered males, 6 were intact males, 32 were neutered females, and 8 were intact females. The mean age of the dogs without kidney calculi was 7.72 &#x000B1; 4.08 years (mean &#x000B1; SD), ranging from 4 months to 16 years. The mean BW of the dogs was 7.87 &#x000B1; 7.52 kg (mean &#x000B1; SD), ranging from 1.75 kg to 41.5 kg. The breed distribution of these dogs was as follows: 18 Malteses, 15 Poodles, 7 mixed breeds, 7 Shih Tzus, 6 Pomeranians, 4 Cocker Spaniels, 4 Dachshunds, 3 Bichon Frises, and 27 others.</p>
<p>Mann&#x02013;Whitney tests for age (<italic>p</italic> &#x0003C; 0.001) and BW (<italic>p</italic> = 0.005) showed statistically significant differences between dogs with and without kidney calculi. Dogs with kidney calculi were significantly older and smaller than those without calculi. The age and BW of each group are depicted as box plots in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Boxplots of age <bold>(A)</bold> and body weight <bold>(B)</bold> in dogs with and without calculi. Statistically significant differences were found between the dogs with and without calculi for both parameters. The dogs with calculi were significantly older (<italic>p</italic> &#x0003C; 0.001) and smaller (<italic>p</italic> = 0.005) than the dogs without calculi. The lower and upper edges of the box represent the 25th (lower quartile, Q1) and 75th (upper quartile, Q3) percentiles, respectively. The vertical lines (whiskers) between the lower and upper extremes on each box represent the distribution range of the data. The mild outliers (empty circles) are data points located outside the whiskers, below Q1 &#x02212; 1.5 &#x000D7; interquartile range (IQR) or above Q3 &#x0002B; 1.5 &#x000D7; IQR. The extreme outliers (asterisks) are data points below Q1 &#x02212; 3 &#x000D7; IQR or above Q3 &#x0002B; 3 &#x000D7; IQR.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fvets-10-1236579-g0005.tif"/>
</fig></sec></sec>
<sec sec-type="discussion" id="s4">
<title>4. Discussion</title>
<p>To our knowledge, this is the first study in veterinary medicine to propose deep learning models to detect kidney calculi on CT images of dogs, and to evaluate their performance. All five models developed in this study showed improved performance in detecting the kidneys on pre-contrast CT images compared to the previous study using UNETR (<xref ref-type="bibr" rid="B28">28</xref>). For kidney calculi detection, the models developed in this study showed promising performance comparable to that of previous models developed in human medicine.</p>
<p>In this study, detection of the kidney was considered essential for the proper detection of kidney calculi; it was expected that better performance in detecting the kidney would result in more accurate detection of kidney calculi. Therefore, we developed models that detect the kidney itself in the first stage of the analysis, in addition to kidney calculi. Recently, several studies have proposed deep learning models for the automatic segmentation of kidneys on CT images in human medicine. da Cruz et al. (<xref ref-type="bibr" rid="B19">19</xref>) reported a model with a DSC of 0.96. Another study reported DSCs of 0.95 and 0.93 for the left and right kidneys, respectively, using ConvNet-Coarse, and 0.94 and 0.93 for the left and right kidneys, respectively, using ConvNet-Fine (<xref ref-type="bibr" rid="B20">20</xref>). For automatic kidney detection, a previous study in veterinary medicine based on UNet Transformer showed a DSC of 0.912 and 0.915 before and after post-processing, respectively (<xref ref-type="bibr" rid="B28">28</xref>). All the models developed in this study showed improved performance for automatic kidney detection compared to this previous study, but showed a slightly lower DSC compared to the models developed for application in human medicine. SwinUNet exhibited the best DSC (0.943), followed by RBCANet (0.942), UTNet (0.935), and AttUNet and TransUNet (0.934). Further studies with more training data and novel architectures may help develop models with DSCs comparable to those of the models developed in human medicine.</p>
<p>In addition, several studies have proposed deep learning models for detecting kidney calculi on CT images in human medicine. Elton et al. (<xref ref-type="bibr" rid="B24">24</xref>) reported a sensitivity of 0.88 and specificity of 0.91 on a validation set; Parakh et al. (<xref ref-type="bibr" rid="B37">37</xref>) reported a sensitivity of 0.94 and specificity of 0.96 with GrayNet, and a sensitivity of 0.90 and specificity of 0.92 with ImageNet. Li et al. (<xref ref-type="bibr" rid="B21">21</xref>) evaluated the performance of five different models, and reported that Res U-Net showed a sensitivity of 0.79 and specificity of 0.99, and 3D U-Net showed a sensitivity of 0.80 and specificity of 0.99. The models developed in this study showed performance comparable to that of models developed in the field of human medical imaging, with the highest sensitivity of 0.89 and specificity of 0.99 for TransUNet. Elton et al. (<xref ref-type="bibr" rid="B24">24</xref>) reported an AUC of 0.95 on a validation set, while Parakh et al. (<xref ref-type="bibr" rid="B37">37</xref>) reported an AUC of 0.954 with GrayNet and 0.936 with ImageNet for urinary stone detection. In this study, the models using TransUNet, SwinUNet, UTNet, RBCANet, and AttUNet achieved AUCs of 0.98, 0.97, 0.92, 0.91, and 0.90, respectively. TransUNet and SwinUNet achieved higher AUCs than those reported for these previous models in human medicine. Therefore, the use of TransUNet- and SwinUNet-based models for the detection of kidney calculi is promising.</p>
<p>The evaluation of each model in this study reveals its own strengths and weaknesses. SwinUNet had a slightly lower DSC than AttUNet for the detection of calculi, but its high AUC underscores its overall commendable performance. In contrast, UTNet, despite its notable sensitivity and specificity, showed lower DSC and AUC values for calculi detection, suggesting possible difficulties in detecting intricate structures such as kidney stones. Interestingly, AttUNet, despite having the highest DSC (which is indicative of a strong ability to identify kidney calculi), had the lowest AUC, suggesting potential limitations in its overall prediction accuracy. TransUNet had the highest AUC but the lowest DSC for kidney calculi detection, suggesting possible challenges in accurate structure delineation. In addition, despite excelling in kidney detection, RBCANet showed a lower DSC and AUC for calculi detection than the best-performing models. This suggests that while RBCANet is proficient at handling larger structures (such as kidneys) owing to its effective hierarchical feature capture, it may not be able to effectively identify smaller structures such as kidney stones. Therefore, the selection of an appropriate model should be tailored to the specific task, and should consider the strengths and weaknesses of each model.</p>
<p>Compared with the results of models developed in the human medical field, the models developed in this study showed lower DSC values for detecting kidneys. Compared to the current study, a previous study in veterinary medicine using UNet Transformer reported a lower DSC value; a wide range of body sizes associated with various breeds was considered an obstacle in the training process and a factor leading to the lower DSC compared to other models developed for humans (<xref ref-type="bibr" rid="B19">19</xref>&#x02013;<xref ref-type="bibr" rid="B21">21</xref>, <xref ref-type="bibr" rid="B28">28</xref>). This is consistent with the observations in the current study.</p>
<p>The main reason for the lower DSC values of our models for detecting kidney calculi compared to those of models developed in human medicine was the smaller size of the kidneys and calculi. The size of the calculi is reportedly associated with model performance. In a previous study in human medicine, kidney calculi were classified into small (0&#x02013;6 mm), medium (6&#x02013;20 mm), and large (above 20 mm) sizes (<xref ref-type="bibr" rid="B26">26</xref>). The DSC was highest in the large size group, at 83.39 &#x000B1; 2.33; it was 76.08 &#x000B1; 3.46 in the medium size group, and lowest in the small size group, at 60.11 &#x000B1; 0.84. Among the architectures evaluated in that study, SegNet showed the largest difference in DSC between the large and small size groups (80.86 &#x000B1; 4.54 in the large group and 34.38 &#x000B1; 1.67 in the small group). DeepLabV3&#x0002B; could not detect calculi in the small size group. Another study classified kidney calculi into five different size groups and reported results that were consistent with the above. The AUC was highest in the largest calculi group (&#x0003E;125 mm<sup>3</sup>) and decreased as the size decreased. The size of the kidney calculi included in the current study varied, but the majority of the calculi were small (&#x0003C;3 mm), which meets the criteria of the small size group in human medicine. Compared to the results of previous models in human medicine, our present models showed promising performance in detecting small calculi. Further studies, including more data on larger kidney calculi, may help improve the performance of the models.</p>
<p>In this study, the age of dogs with kidney calculi was significantly higher than that of dogs without calculi, which is consistent with the results of previous studies. In a human medicine study, it was reported that the prevalence of kidney calculi increased with age, and increasing age was considered a risk factor (<xref ref-type="bibr" rid="B38">38</xref>). However, a limitation of the present study is that we were not able to investigate and compare the ages at which calculi first developed, as the relevant data were not available due to the retrospective nature of the study. For consistency, the age at which the CT scan was obtained was considered to be the age of the dog, even if the dog had visited multiple times. Therefore, the age used in this study may have been biased toward older age.</p>
<p>The BW of the dogs with calculi was significantly lower than that of the dogs without calculi. Similarly, a recent study that investigated the prevalence and predictors of upper urinary tract uroliths in dogs found that dogs with upper urinary tract uroliths were significantly older and smaller than those without urolithiasis, which is consistent with our results (<xref ref-type="bibr" rid="B1">1</xref>). In addition, a previous study showed that body height was inversely associated with the prevalence of kidney calculi disease in humans (<xref ref-type="bibr" rid="B39">39</xref>). Several factors have been considered as possible reasons for these results. If BW correlates directly with the ureteral diameter or length, the calculi are likely to spontaneously pass through the ureter more easily in those with a higher BW; however, studies on this topic are lacking. In addition, BW differs by breed, and genetic factors might affect the risk of urolithiasis (<xref ref-type="bibr" rid="B1">1</xref>).</p>
<p>One of the limitations of this study was the relatively small size of the kidneys and kidney calculi in dogs compared to those in humans, which impeded model development. Moreover, some calculi were smaller than the minimum pixel size of the labeling program. Therefore, the margins of several calculi in the labeled images did not match the actual margins of the calculi, which could have resulted in the lower DSC values for small calculi in this study and in some calculi being counted as false negatives. Another limitation of this study was the small number of CT scans included. Further studies with more CT scans may result in better model performance. In addition, the lack of external validation is a limitation of this study. External validation using independent datasets from clinical settings with different image conditions and qualities would help demonstrate the applicability of the models in practice. Another limitation is that the data used for model development were retrospectively collected from multiple centers with different CT scanners, which was a major impediment in this study. Further prospective studies with controlled data could result in the development of models with improved performance.</p>
<p>In conclusion, the deep learning models proposed in this study showed promising results for the detection of small kidney calculi and highly encouraging results for the automatic segmentation of kidneys from pre-contrast CT images in dogs. These models can potentially assist clinicians in the detection of kidney calculi. Further studies using models that can automatically provide accurate volumes of calculi may have considerable clinical utility.</p></sec>
<sec sec-type="data-availability" id="s5">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.</p></sec>
<sec sec-type="ethics-statement" id="s6">
<title>Ethics statement</title>
<p>The animal studies were approved by the Institutional Animal Care and Use Committee of Jeonbuk National University. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent was obtained from the owners for the participation of their animals in this study.</p></sec>
<sec sec-type="author-contributions" id="s7">
<title>Author contributions</title>
<p>YJ and HY: conception, design, drafting, and acquisition of data. YJ, GH, SL, KL, and HY: analysis, interpretation of data, revision for intellectual content, and final approval of the completed article. All authors contributed to the article and approved the submitted version.</p></sec>
</body>
<back>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>This work was supported by the National Research Foundation of Korea and funded by a grant from the Korean Government (No. 2021R1C1C1006794).</p>
</sec>
<ack><p>The authors would like to thank the clinicians of the Veterinary Medical Imaging Department of the Teaching Hospital of Jeonbuk National University for their assistance with the manual segmentation of the CT images used in this study.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<fn-group>
<title>Abbreviations</title>
<fn fn-type="abbr"><p>CT, computed tomography; HU, Hounsfield Unit; AttUNet, Attention U-Net; DSC, dice similarity coefficient; TP, true positive; TN, true negative; FP, false positive; FN, false negative; ROC, Receiver Operating Characteristic; AUC, area under the curve; BW, body weight.</p></fn></fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoelmer</surname> <given-names>AM</given-names></name> <name><surname>Lulich</surname> <given-names>JP</given-names></name> <name><surname>Rendahl</surname> <given-names>AK</given-names></name> <name><surname>Furrow</surname> <given-names>E</given-names></name></person-group>. <article-title>Prevalence and predictors of radiographically apparent upper urinary tract urolithiasis in eight dog breeds predisposed to calcium oxalate urolithiasis and mixed breed dogs</article-title>. <source>Vet Sci.</source> (<year>2022</year>) <volume>9</volume>:<fpage>283</fpage>. <pub-id pub-id-type="doi">10.3390/vetsci9060283</pub-id><pub-id pub-id-type="pmid">35737335</pub-id></citation></ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ling</surname> <given-names>GV</given-names></name> <name><surname>Ruby</surname> <given-names>AL</given-names></name> <name><surname>Johnson</surname> <given-names>DL</given-names></name> <name><surname>Thurmond</surname> <given-names>M</given-names></name> <name><surname>Franti</surname> <given-names>CE</given-names></name></person-group>. <article-title>Renal calculi in dogs and cats: prevalence, mineral type, breed, age, and gender interrelationships (1981&#x02013;1993)</article-title>. <source>J Vet Intern Med.</source> (<year>1998</year>) <volume>12</volume>:<fpage>11</fpage>&#x02013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1111/j.1939-1676.1998.tb00491.x</pub-id><pub-id pub-id-type="pmid">9503355</pub-id></citation></ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>SI</given-names></name></person-group>. <article-title>Chronic renal failure and its management and nephrolithiasis</article-title>. <source>Vet Clin North Am Small Anim Pract.</source> (<year>1997</year>) <volume>27</volume>:<fpage>1331</fpage>&#x02013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1016/S0195-5616(97)50129-X</pub-id><pub-id pub-id-type="pmid">9348633</pub-id></citation></ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Berent</surname> <given-names>A</given-names></name> <name><surname>Adams</surname> <given-names>LG</given-names></name></person-group>. <article-title>&#x0201C;Interventional management of complicated nephrolithiasis,&#x0201D;</article-title> in <source>Veterinary Image-Guided Interventions</source>. <publisher-loc>Ames, IA</publisher-loc>: <publisher-name>John Wiley &#x00026; Sons Inc</publisher-name> (<year>2015</year>). p. <fpage>289</fpage>&#x02013;<lpage>300</lpage>.</citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sigurjonsdottir</surname> <given-names>VK</given-names></name> <name><surname>Runolfsdottir</surname> <given-names>HL</given-names></name> <name><surname>Indridason</surname> <given-names>OS</given-names></name> <name><surname>Palsson</surname> <given-names>R</given-names></name> <name><surname>Edvardsson</surname> <given-names>VO</given-names></name></person-group>. <article-title>Impact of nephrolithiasis on kidney function</article-title>. <source>BMC Nephrol.</source> (<year>2015</year>) <volume>16</volume>:<fpage>1</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1186/s12882-015-0126-1</pub-id><pub-id pub-id-type="pmid">26316205</pub-id></citation></ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gambaro</surname> <given-names>G</given-names></name> <name><surname>Croppi</surname> <given-names>E</given-names></name> <name><surname>Bushinsky</surname> <given-names>D</given-names></name> <name><surname>Jaeger</surname> <given-names>P</given-names></name> <name><surname>Cupisti</surname> <given-names>A</given-names></name> <name><surname>Ticinesi</surname> <given-names>A</given-names></name> <etal/></person-group>. <article-title>The risk of chronic kidney disease associated with urolithiasis and its urological treatments: a review</article-title>. <source>J Urol.</source> (<year>2017</year>) <volume>198</volume>:<fpage>268</fpage>&#x02013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1016/j.juro.2016.12.135</pub-id><pub-id pub-id-type="pmid">28286070</pub-id></citation></ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhe</surname> <given-names>M</given-names></name> <name><surname>Hang</surname> <given-names>Z</given-names></name></person-group>. <article-title>Nephrolithiasis as a risk factor of chronic kidney disease: a meta-analysis of cohort studies with 4,770,691 participants</article-title>. <source>Urolithiasis.</source> (<year>2017</year>) <volume>45</volume>:<fpage>441</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1007/s00240-016-0938-x</pub-id><pub-id pub-id-type="pmid">27837248</pub-id></citation></ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khan</surname> <given-names>S</given-names></name></person-group>. <article-title>Stress oxidative: nephrolithiasis and chronic kidney diseases</article-title>. <source>Minerva Med.</source> (<year>2013</year>) <volume>104</volume>:<fpage>23</fpage>&#x02013;<lpage>30</lpage>.<pub-id pub-id-type="pmid">23392535</pub-id></citation></ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boulay</surname> <given-names>I</given-names></name> <name><surname>Holtz</surname> <given-names>P</given-names></name> <name><surname>Foley</surname> <given-names>WD</given-names></name> <name><surname>White</surname> <given-names>B</given-names></name> <name><surname>Begun</surname> <given-names>FP</given-names></name></person-group>. <article-title>Ureteral calculi: diagnostic efficacy of helical CT and implications for treatment of patients</article-title>. <source>AJR Am J Roentgenol.</source> (<year>1999</year>) <volume>172</volume>:<fpage>1485</fpage>&#x02013;<lpage>90</lpage>. <pub-id pub-id-type="doi">10.2214/ajr.172.6.10350277</pub-id><pub-id pub-id-type="pmid">10350277</pub-id></citation></ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hesamian</surname> <given-names>MH</given-names></name> <name><surname>Jia</surname> <given-names>W</given-names></name> <name><surname>He</surname> <given-names>X</given-names></name> <name><surname>Kennedy</surname> <given-names>P</given-names></name></person-group>. <article-title>Deep learning techniques for medical image segmentation: achievements and challenges</article-title>. <source>J Digit Imaging.</source> (<year>2019</year>) <volume>32</volume>:<fpage>582</fpage>&#x02013;<lpage>96</lpage>. <pub-id pub-id-type="doi">10.1007/s10278-019-00227-x</pub-id><pub-id pub-id-type="pmid">31144149</pub-id></citation></ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roth</surname> <given-names>HR</given-names></name> <name><surname>Shen</surname> <given-names>C</given-names></name> <name><surname>Oda</surname> <given-names>H</given-names></name> <name><surname>Oda</surname> <given-names>M</given-names></name> <name><surname>Hayashi</surname> <given-names>Y</given-names></name> <name><surname>Misawa</surname> <given-names>K</given-names></name> <etal/></person-group>. <article-title>Deep learning and its application to medical image segmentation</article-title>. <source>Med Imaging Technol.</source> (<year>2018</year>) <volume>36</volume>:<fpage>63</fpage>&#x02013;<lpage>71</lpage>. <pub-id pub-id-type="doi">10.11409/mit.36.63</pub-id><pub-id pub-id-type="pmid">36533842</pub-id></citation></ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>K</given-names></name> <name><surname>Wang</surname> <given-names>X</given-names></name> <name><surname>Lu</surname> <given-names>L</given-names></name> <name><surname>Summers</surname> <given-names>RM</given-names></name></person-group>. <article-title>DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning</article-title>. <source>J Med Imaging.</source> (<year>2018</year>) <volume>5</volume>:<fpage>036501</fpage>. <pub-id pub-id-type="doi">10.1117/1.JMI.5.3.036501</pub-id><pub-id pub-id-type="pmid">30035154</pub-id></citation></ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>H</given-names></name> <name><surname>Chen</surname> <given-names>Y</given-names></name> <name><surname>Song</surname> <given-names>Y</given-names></name> <name><surname>Xiong</surname> <given-names>Z</given-names></name> <name><surname>Yang</surname> <given-names>Y</given-names></name> <name><surname>Wu</surname> <given-names>QJ</given-names></name></person-group>. <article-title>Automatic kidney lesion detection for CT images using morphological cascade convolutional neural networks</article-title>. <source>IEEE Access</source>. (<year>2019</year>) <volume>7</volume>:<fpage>83001</fpage>&#x02013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2924207</pub-id></citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Q</given-names></name> <name><surname>Cai</surname> <given-names>W</given-names></name> <name><surname>Wang</surname> <given-names>X</given-names></name> <name><surname>Zhou</surname> <given-names>Y</given-names></name> <name><surname>Feng</surname> <given-names>DD</given-names></name> <name><surname>Chen</surname> <given-names>M</given-names></name></person-group>. <article-title>Medical image classification with convolutional neural network</article-title>. <source>ICARCV.</source> (<year>2014</year>) <volume>24</volume>:<fpage>414</fpage>. <pub-id pub-id-type="doi">10.1109/ICARCV.2014.7064414</pub-id><pub-id pub-id-type="pmid">35232390</pub-id></citation></ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Oktay</surname> <given-names>O</given-names></name> <name><surname>Schlemper</surname> <given-names>J</given-names></name> <name><surname>Folgoc</surname> <given-names>LL</given-names></name> <name><surname>Lee</surname> <given-names>M</given-names></name> <name><surname>Heinrich</surname> <given-names>M</given-names></name> <name><surname>Misawa</surname> <given-names>K</given-names></name></person-group>. <source>Attention U-Net: Learning Where to Look for the Pancreas</source>. (<year>2018</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1804.03999">https://arxiv.org/abs/1804.03999</ext-link> (accessed June 3, 2023).<pub-id pub-id-type="pmid">35474556</pub-id></citation></ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>Y</given-names></name> <name><surname>Zhou</surname> <given-names>M</given-names></name> <name><surname>Metaxas</surname> <given-names>DN</given-names></name></person-group>. <source>UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI).</source> <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2018</year>).</citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J</given-names></name> <name><surname>Lu</surname> <given-names>Y</given-names></name> <name><surname>Yu</surname> <given-names>Q</given-names></name> <name><surname>Luo</surname> <given-names>X</given-names></name> <name><surname>Adeli</surname> <given-names>E</given-names></name> <name><surname>Wang</surname> <given-names>Y</given-names></name></person-group>. <source>TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation</source>. (<year>2021</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2102.04306">https://arxiv.org/abs/2102.04306</ext-link> (accessed June 3, 2023).<pub-id pub-id-type="pmid">37109505</pub-id></citation></ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Daniel</surname> <given-names>AJ</given-names></name> <name><surname>Buchanan</surname> <given-names>CE</given-names></name> <name><surname>Allcock</surname> <given-names>T</given-names></name> <name><surname>Scerri</surname> <given-names>D</given-names></name> <name><surname>Cox</surname> <given-names>EF</given-names></name> <name><surname>Prestwich</surname> <given-names>BL</given-names></name> <etal/></person-group>. <article-title>Automated renal segmentation in healthy and chronic kidney disease subjects using a convolutional neural network</article-title>. <source>Magn Reson Med.</source> (<year>2021</year>) <volume>86</volume>:<fpage>1125</fpage>&#x02013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1002/mrm.28768</pub-id><pub-id pub-id-type="pmid">33755256</pub-id></citation></ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>da Cruz</surname> <given-names>LB</given-names></name> <name><surname>Araujo</surname> <given-names>JDL</given-names></name> <name><surname>Ferreira</surname> <given-names>JL</given-names></name> <name><surname>Diniz</surname> <given-names>JOB</given-names></name> <name><surname>Silva</surname> <given-names>AC</given-names></name> <name><surname>de Almeida</surname> <given-names>JDS</given-names></name> <etal/></person-group>. <article-title>Kidney segmentation from computed tomography images using deep neural network</article-title>. <source>Comput Biol Med.</source> (<year>2020</year>) <volume>123</volume>:<fpage>103906</fpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2020.103906</pub-id><pub-id pub-id-type="pmid">32768047</pub-id></citation></ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thong</surname> <given-names>W</given-names></name> <name><surname>Kadoury</surname> <given-names>S</given-names></name> <name><surname>Pich&#x000E9;</surname> <given-names>N</given-names></name> <name><surname>Pal</surname> <given-names>CJ</given-names></name></person-group>. <article-title>Convolutional networks for kidney segmentation in contrast-enhanced CT scans</article-title>. <source>Comput Methods Biomech Biomed Eng Imaging Vis.</source> (<year>2016</year>) <volume>6</volume>:<fpage>277</fpage>&#x02013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1080/21681163.2016.1148636</pub-id></citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hwang</surname> <given-names>G</given-names></name> <name><surname>Yoon</surname> <given-names>H</given-names></name> <name><surname>Ji</surname> <given-names>Y</given-names></name> <name><surname>Lee</surname> <given-names>SJ</given-names></name></person-group>. <article-title>RBCA-Net: Reverse boundary channel attention network for kidney tumor segmentation in CT images</article-title>. <source>ICTC</source>. (<year>2022</year>) <volume>14</volume>:<fpage>2114</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1109/ICTC55196.2022.9952992</pub-id></citation>
</ref>
<ref id="B22">
<label>22.</label>
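<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gharaibeh</surname> <given-names>M</given-names></name> <name><surname>Alzu&#x00027;bi</surname> <given-names>D</given-names></name> <name><surname>Abdullah</surname> <given-names>M</given-names></name> <name><surname>Hmeidi</surname> <given-names>I</given-names></name> <name><surname>Al Nasar</surname> <given-names>MR</given-names></name> <name><surname>Abualigah</surname> <given-names>L</given-names></name> <etal/></person-group>. <article-title>Radiology imaging scans for early diagnosis of kidney tumors: a review of data analytics-based machine learning and deep learning approaches</article-title>. <source>Big Data Cogn Comp.</source> (<year>2022</year>) <volume>6</volume>:<fpage>29</fpage>. <pub-id pub-id-type="doi">10.3390/bdcc6010029</pub-id></citation>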
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Han</surname> <given-names>S</given-names></name> <name><surname>Hwang</surname> <given-names>SI</given-names></name> <name><surname>Lee</surname> <given-names>HJ</given-names></name></person-group>. <article-title>The classification of renal cancer in 3-phase CT images using a deep learning method</article-title>. <source>J Digit Imaging.</source> (<year>2019</year>) <volume>32</volume>:<fpage>638</fpage>&#x02013;<lpage>43</lpage>. <pub-id pub-id-type="doi">10.1007/s10278-019-00230-2</pub-id><pub-id pub-id-type="pmid">31098732</pub-id></citation></ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elton</surname> <given-names>DC</given-names></name> <name><surname>Turkbey</surname> <given-names>EB</given-names></name> <name><surname>Pickhardt</surname> <given-names>PJ</given-names></name> <name><surname>Summers</surname> <given-names>RM</given-names></name></person-group>. <article-title>A deep learning system for automated kidney stone detection and volumetric segmentation on noncontrast CT scans</article-title>. <source>Med Phys.</source> (<year>2022</year>) <volume>49</volume>:<fpage>2545</fpage>&#x02013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1002/mp.15518</pub-id><pub-id pub-id-type="pmid">35156216</pub-id></citation></ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>D</given-names></name> <name><surname>Xiao</surname> <given-names>C</given-names></name> <name><surname>Liu</surname> <given-names>Y</given-names></name> <name><surname>Chen</surname> <given-names>Z</given-names></name> <name><surname>Hassan</surname> <given-names>H</given-names></name> <name><surname>Su</surname> <given-names>L</given-names></name> <etal/></person-group>. <article-title>Deep segmentation networks for segmenting kidneys and detecting kidney stones in unenhanced abdominal CT images</article-title>. <source>Diagnostics.</source> (<year>2022</year>) <volume>12</volume>:<fpage>1788</fpage>. <pub-id pub-id-type="doi">10.3390/diagnostics12081788</pub-id><pub-id pub-id-type="pmid">35892498</pub-id></citation></ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yildirim</surname> <given-names>K</given-names></name> <name><surname>Bozdag</surname> <given-names>PG</given-names></name> <name><surname>Talo</surname> <given-names>M</given-names></name> <name><surname>Yildirim</surname> <given-names>O</given-names></name> <name><surname>Karabatak</surname> <given-names>M</given-names></name> <name><surname>Acharya</surname> <given-names>UR</given-names></name></person-group>. <article-title>Deep learning model for automated kidney stone detection using coronal CT images</article-title>. <source>Comput Biol Med.</source> (<year>2021</year>) <volume>135</volume>:<fpage>104569</fpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2021.104569</pub-id><pub-id pub-id-type="pmid">34157470</pub-id></citation></ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Islam</surname> <given-names>MN</given-names></name> <name><surname>Hasan</surname> <given-names>M</given-names></name> <name><surname>Hossain</surname> <given-names>MK</given-names></name> <name><surname>Alam</surname> <given-names>MGR</given-names></name> <name><surname>Uddin</surname> <given-names>MZ</given-names></name> <name><surname>Soylu</surname> <given-names>A</given-names></name></person-group>. <article-title>Vision transformer and explainable transfer learning models for auto detection of kidney cyst, stone and tumor from CT-radiography</article-title>. <source>Sci Rep.</source> (<year>2022</year>) <volume>12</volume>:<fpage>11440</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-022-15634-4</pub-id><pub-id pub-id-type="pmid">35794172</pub-id></citation></ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ji</surname> <given-names>Y</given-names></name> <name><surname>Cho</surname> <given-names>H</given-names></name> <name><surname>Seon</surname> <given-names>S</given-names></name> <name><surname>Lee</surname> <given-names>K</given-names></name> <name><surname>Yoon</surname> <given-names>H</given-names></name></person-group>. <article-title>A deep learning model for CT-based kidney volume determination in dogs and normal reference definition</article-title>. <source>Front Vet Sci.</source> (<year>2022</year>) <volume>9</volume>:<fpage>1011804</fpage>. <pub-id pub-id-type="doi">10.3389/fvets.2022.1011804</pub-id><pub-id pub-id-type="pmid">36387402</pub-id></citation></ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cao</surname> <given-names>H</given-names></name> <name><surname>Wang</surname> <given-names>Y</given-names></name> <name><surname>Chen</surname> <given-names>J</given-names></name> <name><surname>Jiang</surname> <given-names>D</given-names></name> <name><surname>Zhang</surname> <given-names>X</given-names></name> <name><surname>Tian</surname> <given-names>Q</given-names></name> <etal/></person-group>. <article-title>&#x0201C;Swin-Unet: Unet-like pure transformer for medical image segmentation,&#x0201D;</article-title> in <source>Computer Vision&#x02013;ECCV 2022 Workshops: Tel Aviv, Israel, October 23&#x02013;27, 2022, Proceedings, Part III</source>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2023</year>).</citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Z</given-names></name> <name><surname>Sabuncu</surname> <given-names>M</given-names></name></person-group>. <article-title>Generalized cross entropy loss for training deep neural networks with noisy labels</article-title>. <source>32nd Conference on Neural Information Processing Systems</source>. (<year>2018</year>) <fpage>31</fpage>.</citation>
</ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yeung</surname> <given-names>M</given-names></name> <name><surname>Sala</surname> <given-names>E</given-names></name> <name><surname>Sch&#x000F6;nlieb</surname> <given-names>C-B</given-names></name> <name><surname>Rundo</surname> <given-names>L</given-names></name></person-group>. <article-title>Unified focal loss: Generalising dice and cross entropy-based losses to handle class imbalanced medical image segmentation</article-title>. <source>Comput Med Imaging Graph.</source> (<year>2022</year>) <volume>95</volume>:<fpage>102026</fpage>. <pub-id pub-id-type="doi">10.1016/j.compmedimag.2021.102026</pub-id><pub-id pub-id-type="pmid">34953431</pub-id></citation></ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Jadon</surname> <given-names>S</given-names></name></person-group>. <source>A Survey of Loss Functions for Semantic Segmentation. IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2020)</source>. <publisher-loc>Manhattan, NY</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2020</year>).</citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taghanaki</surname> <given-names>SA</given-names></name> <name><surname>Zheng</surname> <given-names>Y</given-names></name> <name><surname>Zhou</surname> <given-names>SK</given-names></name> <name><surname>Georgescu</surname> <given-names>B</given-names></name> <name><surname>Sharma</surname> <given-names>P</given-names></name> <name><surname>Xu</surname> <given-names>D</given-names></name> <etal/></person-group>. <article-title>Combo loss: Handling input and output imbalance in multi-organ segmentation</article-title>. <source>Comput Med Imaging Graph.</source> (<year>2019</year>) <volume>75</volume>:<fpage>24</fpage>&#x02013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1016/j.compmedimag.2019.04.005</pub-id><pub-id pub-id-type="pmid">31129477</pub-id></citation></ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Buslaev</surname> <given-names>A</given-names></name> <name><surname>Iglovikov</surname> <given-names>VI</given-names></name> <name><surname>Khvedchenya</surname> <given-names>E</given-names></name> <name><surname>Parinov</surname> <given-names>A</given-names></name> <name><surname>Druzhinin</surname> <given-names>M</given-names></name> <name><surname>Kalinin</surname> <given-names>AA</given-names></name></person-group>. <article-title>Albumentations: fast and flexible image augmentations</article-title>. <source>Information.</source> (<year>2020</year>) <volume>11</volume>:<fpage>125</fpage>. <pub-id pub-id-type="doi">10.3390/info11020125</pub-id></citation>
</ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taha</surname> <given-names>AA</given-names></name> <name><surname>Hanbury</surname> <given-names>A</given-names></name></person-group>. <article-title>Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool</article-title>. <source>BMC Med Imaging.</source> (<year>2015</year>) <volume>15</volume>:<fpage>1</fpage>&#x02013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1186/s12880-015-0068-x</pub-id><pub-id pub-id-type="pmid">26263899</pub-id></citation></ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Popovic</surname> <given-names>A</given-names></name> <name><surname>De la Fuente</surname> <given-names>M</given-names></name> <name><surname>Engelhardt</surname> <given-names>M</given-names></name> <name><surname>Radermacher</surname> <given-names>K</given-names></name></person-group>. <article-title>Statistical validation metric for accuracy assessment in medical image segmentation</article-title>. <source>Int J Comput Assist Radiol Surg.</source> (<year>2007</year>) <volume>2</volume>:<fpage>169</fpage>&#x02013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1007/s11548-007-0125-1</pub-id></citation>
</ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parakh</surname> <given-names>A</given-names></name> <name><surname>Lee</surname> <given-names>H</given-names></name> <name><surname>Lee</surname> <given-names>JH</given-names></name> <name><surname>Eisner</surname> <given-names>BH</given-names></name> <name><surname>Sahani</surname> <given-names>DV</given-names></name> <name><surname>Do</surname> <given-names>S</given-names></name></person-group>. <article-title>Urinary stone detection on CT images using deep convolutional neural networks: evaluation of model performance and generalization</article-title>. <source>Radiol Artif Intell.</source> (<year>2019</year>) <volume>1</volume>:<fpage>e180066</fpage>. <pub-id pub-id-type="doi">10.1148/ryai.2019180066</pub-id><pub-id pub-id-type="pmid">33937795</pub-id></citation></ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ramello</surname> <given-names>A</given-names></name> <name><surname>Vitale</surname> <given-names>C</given-names></name> <name><surname>Marangella</surname> <given-names>M</given-names></name></person-group>. <article-title>Epidemiology of nephrolithiasis</article-title>. <source>J Nephrol.</source> (<year>2001</year>) <volume>13</volume>:<fpage>S45</fpage>&#x02013;<lpage>50</lpage>.</citation>
</ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Curhan</surname> <given-names>GC</given-names></name> <name><surname>Willett</surname> <given-names>WC</given-names></name> <name><surname>Rimm</surname> <given-names>EB</given-names></name> <name><surname>Speizer</surname> <given-names>FE</given-names></name> <name><surname>Stampfer</surname> <given-names>MJ</given-names></name></person-group>. <article-title>Body size and risk of kidney stones</article-title>. <source>J Am Soc Nephrol.</source> (<year>1998</year>) <volume>9</volume>:<fpage>1645</fpage>&#x02013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.1681/ASN.V991645</pub-id><pub-id pub-id-type="pmid">9727373</pub-id></citation></ref>
</ref-list>
</back>
</article>