<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioeng. Biotechnol.</journal-id>
<journal-title>Frontiers in Bioengineering and Biotechnology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioeng. Biotechnol.</abbrev-journal-title>
<issn pub-type="epub">2296-4185</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">747217</article-id>
<article-id pub-id-type="doi">10.3389/fbioe.2021.747217</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioengineering and Biotechnology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A Multi-Task Deep Learning Method for Detection of Meniscal Tears in MRI Data from the Osteoarthritis Initiative Database</article-title>
<alt-title alt-title-type="left-running-head">Tack et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Meniscal Tear Detection in MRI</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Tack</surname>
<given-names>Alexander</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/527572/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Shestakov</surname>
<given-names>Alexey</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>L&#xfc;dke</surname>
<given-names>David</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1421476/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zachow</surname>
<given-names>Stefan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/8759/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Dept. for Visual and Data-Centric Computing, Zuse Institute Berlin, <addr-line>Berlin</addr-line>, <country>Germany</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Charit&#xe9;&#x2013;University Medicine, <addr-line>Berlin</addr-line>, <country>Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/60387/overview">Fabio Galbusera</ext-link>, Galeazzi Orthopedic Institute (IRCCS), Italy</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1432376/overview">Andrea Cina</ext-link>, Galeazzi Orthopedic Institute (IRCCS), Italy</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1026453/overview">Jonas Schwer</ext-link>, University of Ulm, Germany</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/234229/overview">Fuyuan Liao</ext-link>, Xi&#x2019;an Technological University, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Alexander Tack, <email>tack@zib.de</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Biomechanics, a section of the journal Frontiers in Bioengineering and Biotechnology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>12</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>9</volume>
<elocation-id>747217</elocation-id>
<history>
<date date-type="received">
<day>25</day>
<month>07</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>10</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Tack, Shestakov, L&#xfc;dke and Zachow.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Tack, Shestakov, L&#xfc;dke and Zachow</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>We present a novel and computationally efficient method for the detection of meniscal tears in Magnetic Resonance Imaging (MRI) data. Our method is based on a Convolutional Neural Network (CNN) that operates on complete 3D MRI scans. Our approach detects the presence of meniscal tears in three anatomical sub-regions (anterior horn, body, posterior horn) for both the Medial Meniscus (MM) and the Lateral Meniscus (LM) individually. For optimal performance of our method, we investigate how to preprocess the MRI data and how to train the CNN such that only relevant information within a Region of Interest (RoI) of the data volume is taken into account for meniscal tear detection. We propose meniscal tear detection combined with a bounding box regressor in a multi-task deep learning framework to let the CNN implicitly consider the corresponding RoIs of the menisci. We evaluate the accuracy of our CNN-based meniscal tear detection approach on 2,399 Double Echo Steady-State (DESS) MRI scans from the Osteoarthritis Initiative database. In addition, to show that our method is capable of generalizing to other MRI sequences, we also adapt our model to Intermediate-Weighted Turbo Spin-Echo (IW TSE) MRI scans. To judge the quality of our approaches, Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) values are evaluated for both MRI sequences. For the detection of tears in DESS MRI, our method reaches AUC values of 0.94, 0.93, 0.93 (anterior horn, body, posterior horn) in MM and 0.96, 0.94, 0.91 in LM. For the detection of tears in IW TSE MRI data, our method yields AUC values of 0.84, 0.88, 0.86 in MM and 0.95, 0.91, 0.90 in LM. In conclusion, the presented method achieves high accuracy for detecting meniscal tears in both DESS and IW TSE MRI data. Furthermore, our method can be easily trained and applied to other MRI sequences.</p>
</abstract>
<kwd-group>
<kwd>knee joint</kwd>
<kwd>meniscal lesions</kwd>
<kwd>convolutional neural networks&#x2013;CNN</kwd>
<kwd>residual learning</kwd>
<kwd>explainable AI (XAI)</kwd>
<kwd>multi-task deep learning</kwd>
<kwd>bounding box regression</kwd>
<kwd>object detection</kwd>
</kwd-group>
<contract-sponsor id="cn001">Deutsche Forschungsgemeinschaft<named-content content-type="fundref-id">10.13039/501100001659</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">Bundesministerium f&#xfc;r Bildung und Forschung<named-content content-type="fundref-id">10.13039/501100002347</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Menisci are hydrated fibrocartilaginous soft tissues within the knee joint that absorb shocks, provide lubrication, and allow for joint stability during movement (<xref ref-type="bibr" rid="B29">Markes et&#x20;al., 2020</xref>). In patients with symptomatic osteoarthritis, meniscal damage is also found very frequently with a prevalence of up to 91% (<xref ref-type="bibr" rid="B5">Bhattacharyya et&#x20;al., 2003</xref>). Meniscal tears are usually caused by trauma and degeneration (<xref ref-type="bibr" rid="B4">Beaufils and Pujol, 2017</xref>) and might lead to a loss of function, early osteoarthritis, tibiofemoral osteophytes, and cartilage loss (<xref ref-type="bibr" rid="B11">Ding et&#x20;al., 2007</xref>; <xref ref-type="bibr" rid="B42">Snoeker et&#x20;al., 2021</xref>). Magnetic Resonance Imaging (MRI) is commonly used for the noninvasive assessment of meniscal morphology since MRI provides a three-dimensional view of the knee joint with high contrast between soft tissues. Hence, MRI is the recognized screening tool for diagnostic assessment before performing therapeutic arthroscopy or any other treatment (<xref ref-type="bibr" rid="B10">Crawford et&#x20;al., 2007</xref>). Among other factors, a proper treatment concept for meniscal damage depends highly on the type of tear and its location (<xref ref-type="bibr" rid="B13">Englund et&#x20;al., 2001</xref>; <xref ref-type="bibr" rid="B4">Beaufils and Pujol, 2017</xref>). An appropriate medical intervention can delay further development of arthritic changes, improve quality of life, and reduce healthcare expenditures. However, in practice, the optimal treatment is not always apparent (<xref ref-type="bibr" rid="B24">Khan et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B26">Kise et&#x20;al., 2016</xref>), while an improper procedure might even lead to an acceleration of osteoarthritis progression (<xref ref-type="bibr" rid="B38">Roemer et&#x20;al., 2017</xref>). 
For this reason, an accurate and reliable diagnosis of meniscal tears in view of their location, type, and orientation is important.</p>
<p>The diagnosis of meniscal tears in MRI is a time-consuming and tedious procedure. These defects are often difficult to detect due to their small sizes and arbitrary orientations. It is frequently necessary to go back and forth in the MRI slices and switch view directions for a thorough assessment of occurrences and spatial extents of pathological changes. In addition, the meniscal representation in the image data depends on the chosen MRI sequence. What appears clearly visible in one sequence may be barely noticeable in another due to insufficient contrast. Computer-Aided Diagnosis (CAD) attempts to overcome some of these limitations. CAD tools can be employed to increase the sensitivity and specificity of physicians in detecting and classifying meniscal tears (<xref ref-type="bibr" rid="B6">Bien et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B30">Pedoia et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B27">Kunze et&#x20;al., 2020</xref>). Moreover, CAD could speed up the diagnosis, reduce the number of unintentionally missed defects, avoid unnecessary interventions (e.g., arthroscopy), and lead to fewer treatment delays. Several CAD approaches for the automated detection of meniscal tears in MRI data have been proposed in recent years. A distinction can be made between methods that evaluate the 2D contents of cross-sectional images, often coming from a set of curated slices (2D approaches), and those that evaluate 3D image information in the MRI data volume (3D approaches). In the context of image analysis by means of Convolutional Neural Networks (CNNs), we distinguish between 2D CNNs and 3D CNNs. In the case of the 2D approaches, there exists a pseudo-3D variant in which sets of (neighboring) sectional images are included in the evaluation. In these pseudo-3D variants, 2D CNNs are employed to encode 2D slices of a 3D MRI dataset. 
Afterwards, the respective 2D encodings are condensed (e.g., by global max- or average-pooling), concatenated, and passed to a classifier.</p>
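<p>The pseudo-3D scheme described above can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited methods: the function <monospace>encode_slice</monospace> is a hypothetical stand-in for a trained 2D CNN encoder, and the classifier weights are placeholders chosen only to make the example runnable.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def encode_slice(slice_2d):
    """Hypothetical stand-in for a 2D CNN encoder: returns a (channels, h, w) feature map."""
    downsampled = slice_2d[::2, ::2]
    return np.stack([downsampled, -downsampled])


def pseudo_3d_features(volume, pooling="max"):
    """Encode every slice, pool each feature map globally, and concatenate the results."""
    pool = np.max if pooling == "max" else np.mean
    # One (channels,) vector per slice after global spatial pooling.
    pooled = [pool(encode_slice(s), axis=(1, 2)) for s in volume]
    return np.concatenate(pooled)


def classify(features, weights, bias):
    """Logistic classifier on the concatenated features; returns a probability."""
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))


# Toy volume with 5 slices of 8 x 8 pixels (illustrative values only).
volume = np.arange(5 * 8 * 8, dtype=float).reshape(5, 8, 8)
features = pseudo_3d_features(volume)
print(features.shape)  # (10,): 5 slices times 2 channels
```
]]></preformat>
<p>Because each slice is encoded independently, such a model sees correlations across slices only in the final classifier, which is the limitation that motivates 3D approaches.</p>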
<p>
<xref ref-type="bibr" rid="B37">Roblot et&#x20;al. (2019)</xref> proposed a method to detect meniscal tears from a curated set of sagittal 2D MRI slices. Their approach is based on the 2D &#x201c;Faster R-CNN&#x201d; (<xref ref-type="bibr" rid="B34">Ren et&#x20;al., 2015</xref>) and comprises three steps: first, the positions of both meniscal horns are detected; second, the presence of a tear is classified; and third, the respective tear orientation is determined. The method yields an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) of 0.92 for the detection of the meniscal horns&#x2019; positions, an AUC of 0.94 for detecting the presence of meniscal tears, and an AUC of 0.83 for the determination of the tear orientations. <xref ref-type="bibr" rid="B9">Couteaux et&#x20;al. (2019)</xref> presented a similar method, also detecting meniscal tears from a curated set of sagittal 2D MRI slices. They employed a masked region-based 2D CNN (<xref ref-type="bibr" rid="B16">He et&#x20;al., 2017</xref>) to locate the anterior and the posterior horns of the Medial Meniscus (MM) as well as the Lateral Meniscus (LM). Their method yields on average an AUC of 0.906 for all three tasks, i.e.,&#x20;the localization of the respective region, the detection of meniscal tears, and the classification of the tear orientation.</p>
<p>Processing of all MRI slices instead of individually selected ones was performed by <xref ref-type="bibr" rid="B6">Bien et&#x20;al. (2018)</xref> who proposed a 2D CNN for the detection of meniscal tears. Their method achieves an AUC of 0.847. <xref ref-type="bibr" rid="B30">Pedoia et&#x20;al. (2019)</xref> adopted a method that combined a 2D CNN for meniscus segmentation with a 3D CNN for detection and severity assessment of meniscal tears. This approach was able to differentiate between tears and no tears with an AUC of 0.89. <xref ref-type="bibr" rid="B45">Tsai et&#x20;al. (2020)</xref> proposed a so-called &#x201c;Efficiently-Layered Network&#x201d; for detection of meniscal tears, reaching an AUC of 0.904 and 0.913 for two different datasets. <xref ref-type="bibr" rid="B3">Azcona et&#x20;al. (2020)</xref> demonstrated the use of a 2D CNN as a pseudo-3D variant for detection of torn menisci. Their method relies on transfer learning while using data augmentation and reaches an AUC of 0.934. <xref ref-type="bibr" rid="B14">Fritz et&#x20;al. (2020)</xref> presented a deep 3D CNN to detect tears in MRI data for MM and LM, respectively. Their method reaches AUC values of 0.882, 0.781, and 0.961 for the detection of medial, lateral, and overall meniscal tears. <xref ref-type="bibr" rid="B36">Rizk et&#x20;al. (2021)</xref> also proposed a 3D CNN for meniscal tear detection in MRI data for MM and LM individually. Their approach yields an AUC of 0.93 for MM and 0.84 for&#x20;LM.</p>
<p>A common limitation among many of the methods listed above is their strong reliance on segmentations of the menisci (or at least on bounding boxes), which can be challenging to obtain due to the inhomogeneous appearance of pathological menisci in MRI data as well as an insufficient contrast to adjacent tissues (<xref ref-type="bibr" rid="B32">Rahman et&#x20;al., 2020</xref>). Furthermore, some approaches merely operate on 2D slices. A major limitation of such methods is that the trained 2D CNNs cannot take whole MRI volumes into account, thus possibly missing important feature correlations in 3D space. Besides, an appropriate selection of curated slices requires expert knowledge. Therefore, the applicability of these methods to 3D volumes is unclear since they were not trained on 3D data. Finally, none of the presented methods is able to detect meniscal tears for all anatomical sub-regions of the menisci individually, i.e.,&#x20;the anterior horn, the meniscal body, and the posterior&#x20;horn.</p>
<p>Our motivation is to detect meniscal tears in MRI data more accurately than previous methods in terms of correctness and localization. For this purpose, we present a method that detects tears in anatomical sub-regions of both the MM and the LM. We design our study in a manner that allows for a comparison of different possible approaches. Moreover, the study shows our progression in addressing the task of meniscal tear detection in 3D MR images. We investigate how best to handle the input data such that the least pre-processing is required for inference and the best accuracy is achieved. Furthermore, we show that our proposed method generalizes well to different MRI sequences. We employ two ResNet architectures (<xref ref-type="bibr" rid="B17">He et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B47">Yu et&#x20;al., 2017</xref>) to classify meniscal tears in each sub-region of the MM and the LM, respectively, utilizing three different approaches.</p>
<p>In a first approach (i), we train a 3D CNN on the complete 3D MRI dataset as input. We call it the <italic>Full-scale</italic> approach in the remainder of this article.</p>
<p>Since large input data require a lot of GPU memory and longer times for training and inference, and contain image information not necessarily needed for an assessment of meniscal tears, we crop the data to the Regions of Interest (RoIs) of both menisci in an automated pre-processing step that requires segmentations of sufficient quality for training and testing (<xref ref-type="bibr" rid="B43">Tack et&#x20;al., 2018</xref>). Hence, in a second approach (ii), a 3D CNN is trained on these cropped MRIs, detecting meniscal tears more accurately than our first approach. We refer to the second approach as the <italic>BB-crop</italic> approach.</p>
<p>We enhanced the performance of our first approach by adding a bounding box regression task. Thus, our final approach (iii) trains a CNN to detect meniscal tears in complete 3D MRI, combined with an additional bounding box regression task leading to an auxiliary loss (the <italic>BB-loss</italic> approach). Framing the problem of meniscal tear detection in this multi-task learning setting &#x2013; simultaneously solving meniscal tear detection and meniscal bounding box regression &#x2013; allows our model to implicitly learn to focus on the meniscal regions. Furthermore, segmentation masks are only required during training. Hence, our final approach requires the least data pre-processing at inference time and achieves the best results.</p>
<p>This study presents a method that detects meniscal tears in 3D MRI data on a sub-region level, i.e.,&#x20;the anterior horn, the meniscal body, and the posterior horn for both MM and LM. By formulating the problem in a multi-task learning setting, in which the location of the menisci is added as an auxiliary loss to our 3D CNN, we achieve state-of-the-art results. In order to explain our CNN&#x2019;s decisions, SmoothGrad saliency maps (<xref ref-type="bibr" rid="B41">Smilkov et&#x20;al., 2017</xref>) are computed and visualized. In this way, visual guidance can be given to clinical domain experts for confirming the results of our approach.</p>
</sec>
<sec id="s2">
<title>2 Materials and Methods</title>
<p>In <xref ref-type="sec" rid="s2-1">section 2.1</xref>, the data used by our method are presented. Thereafter, in <xref ref-type="sec" rid="s2-2">section 2.2</xref>, we introduce our data pre-processing and bounding box generation. Section <xref ref-type="sec" rid="s2-3">2.3</xref> describes the model architectures utilized in our approach and their respective components. The particular configurations of our three approaches are illustrated in detail in <xref ref-type="sec" rid="s2-4">sections 2.4</xref>, <xref ref-type="sec" rid="s2-5">2.5</xref>, and <xref ref-type="sec" rid="s2-6">2.6</xref>, followed by an explanation of our experimental set-up and training in <xref ref-type="sec" rid="s2-7">section 2.7</xref>. Finally, the statistical evaluation is summarized in <xref ref-type="sec" rid="s2-8">section 2.8</xref> and the computation of saliency maps is described in&#x20;<xref ref-type="sec" rid="s2-9">section 2.9</xref>.</p>
<sec id="s2-1">
<title>2.1 Data from the OAI Database</title>
<p>The publicly available database of the Osteoarthritis Initiative (OAI)<xref ref-type="fn" rid="fn1">
<sup>1</sup>
</xref> was established to provide researchers with resources to promote the prevention and treatment of knee osteoarthritis. We use 2,399 sagittal Double Echo Steady-State (DESS) 3D MRI scans from the OAI database acquired using Siemens Trio 3.0 Tesla scanners (<xref ref-type="bibr" rid="B31">Peterfy et&#x20;al., 2006</xref>). Additionally, 2,396 sagittal Intermediate-Weighted Turbo Spin-Echo (IW TSE) MRI scans are investigated for the same patients. The demographics of our study are shown in <xref ref-type="table" rid="T1">Table&#x20;1</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Demographics: In this study, 2,399 DESS and 2,396 IW TSE MRI scans from the OAI database are analyzed. These data contain more normal than diseased medial menisci (MM) and lateral menisci (LM). Here, normal is defined as no conspicuous features with respect to the MOAKS scoring system in any sub-region.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">DESS</th>
<th align="center">IW TSE</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Number of MR images</td>
<td align="center">2,399</td>
<td align="center">2,396</td>
</tr>
<tr>
<td align="left">In-plane resolution</td>
<td align="center">0.36&#xa0;mm &#xd7; 0.36&#xa0;mm</td>
<td align="center">0.36&#xa0;mm &#xd7; 0.36&#xa0;mm</td>
</tr>
<tr>
<td align="left">Usual slice dimension</td>
<td align="center">384&#x20;&#xd7; 384</td>
<td align="center">442&#x20;&#xd7; 448</td>
</tr>
<tr>
<td align="left">Slice thickness</td>
<td align="center">0.7&#xa0;mm</td>
<td align="center">3&#xa0;mm</td>
</tr>
<tr>
<td align="left">Number of slices</td>
<td align="center">160</td>
<td align="center">35 to 43</td>
</tr>
<tr>
<td align="left">Side (left; right)</td>
<td align="center">1,104; 1,295</td>
<td align="center">1,104; 1,292</td>
</tr>
<tr>
<td align="left">Sex (female; male)</td>
<td align="center">1,489; 910</td>
<td align="center">1,487; 909</td>
</tr>
<tr>
<td align="left">Age [years]</td>
<td align="center">61.88&#x20;&#xb1; 8.87</td>
<td align="center">61.89&#x20;&#xb1; 8.86</td>
</tr>
<tr>
<td align="left">BMI [kg/m<sup>2</sup>]</td>
<td align="center">29.01&#x20;&#xb1; 4.79</td>
<td align="center">29.08&#x20;&#xb1; 4.79</td>
</tr>
<tr>
<td align="left">MM (% normal)</td>
<td align="center">60.0%</td>
<td align="center">59.9%</td>
</tr>
<tr>
<td align="left">LM (% normal)</td>
<td align="center">80.0%</td>
<td align="center">79.9%</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The OAI database includes multiple reading studies of respective osteoarthritis characteristics, which can be assessed in medical image data. As a gold standard, we utilize labels from MOAKS (<xref ref-type="bibr" rid="B19">Hunter et&#x20;al., 2011</xref>) image reading studies performed by clinical experts. In the MOAKS scoring system, the menisci are divided into three anatomical sub-regions: anterior horn, body, and posterior horn. We consider a sub-region as not containing a tear if the MOAKS score is &#x201c;normal&#x201d; or indicates a signal abnormality (which does not extend through the meniscal surface and, hence, is not a tear). We consider any other type of abnormality (radial, horizontal, vertical, etc.) as a meniscal tear (cf. <xref ref-type="sec" rid="s12">Supplementary Table S1</xref>). Examples of the MRI sequences, signal abnormalities, and meniscal tears are shown in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>.</p>
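<p>The labelling rule above amounts to a simple binarization per sub-region, sketched below in Python. The reading strings are illustrative placeholders and do not correspond to the actual field names or codes of the OAI database.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Readings that count as "no tear" according to the rule described in the text.
# The strings are illustrative placeholders, not OAI database values.
NO_TEAR_READINGS = {"normal", "signal abnormality"}

SUB_REGIONS = ("anterior horn", "body", "posterior horn")


def tear_labels(readings):
    """Return one binary label per sub-region (1 = tear, 0 = no tear)."""
    return [0 if readings[region] in NO_TEAR_READINGS else 1
            for region in SUB_REGIONS]


example = {
    "anterior horn": "normal",
    "body": "signal abnormality",
    "posterior horn": "horizontal tear",
}
print(tear_labels(example))  # [0, 0, 1]
```
]]></preformat>
<p>Applying this rule to the MM and the LM separately yields six binary targets per scan, one for each sub-region.</p>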
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Examples of normal menisci, signal abnormalities, and subjects with meniscal tears shown for DESS as well as IW TSE MRI data. For a summary of different types of meniscal tears per sub-region the reader is referred to <xref ref-type="sec" rid="s12">Supplementary Table S1</xref>.</p>
</caption>
<graphic xlink:href="fbioe-09-747217-g001.tif"/>
</fig>
</sec>
<sec id="s2-2">
<title>2.2 Data Pre-processing and Localization of Menisci</title>
<p>In the first step of our pre-processing, the intensities of all MR images are scaled to the range [0, 1] using min-max normalization. Following that, a standardization is applied to each MR image <inline-formula id="inf1">
<mml:math id="m1">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> according to:<disp-formula id="e1">
<mml:math id="m2">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">I</mml:mi>
</mml:mrow>
<mml:mo>&#x303;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3bc;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<label>(1)</label>
</disp-formula>where <italic>&#x3bc;</italic> is the mean intensity and <italic>&#x3c3;</italic> is the standard deviation of the training population of normalized scans. Using meniscal segmentations generated by the method of <xref ref-type="bibr" rid="B43">Tack et&#x20;al. (2018)</xref>, we create RoIs spanning the MM and LM for the DESS MRI data (see <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). The RoIs are computed by querying the minimum and maximum positions of the menisci along each dimension of the binary segmentation masks: <italic>x</italic>
<sub>min</sub>, <italic>x</italic>
<sub>max</sub>, <italic>y</italic>
<sub>min</sub>, <italic>y</italic>
<sub>max</sub>, <italic>z</italic>
<sub>min</sub>, <italic>z</italic>
<sub>max</sub>. The bounding boxes are uniquely defined as the 3D center coordinate<disp-formula id="e2">
<mml:math id="m3">
<mml:mi>B</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mtable class="matrix">
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:mspace width="0.3333em"/>
<mml:mo>,</mml:mo>
</mml:math>
<label>(2)</label>
</disp-formula>and with the respective height (<italic>x</italic>
<sub>max</sub> &#x2212; <italic>x</italic>
<sub>min</sub>), width (<italic>y</italic>
<sub>max</sub> &#x2212; <italic>y</italic>
<sub>min</sub>), and depth (<italic>z</italic>
<sub>max</sub> &#x2212; <italic>z</italic>
<sub>min</sub>). These values are represented as relative image coordinates. Hence, a bounding box is defined by six floating-point values: <inline-formula id="inf2">
<mml:math id="m4">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:msubsup>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
<mml:msubsup>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
<mml:msubsup>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>z</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>h</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>t</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>w</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
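<p>The pre-processing and bounding box computation described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than our implementation: the function names are hypothetical, and &#x201c;relative image coordinates&#x201d; is interpreted here as division by the image extent along each axis.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def min_max_normalize(image):
    """Scale the intensities of one MR image to the range [0, 1]."""
    lo, hi = float(image.min()), float(image.max())
    return (image - lo) / (hi - lo)  # assumes a non-constant image


def standardize(image, mu, sigma):
    """Apply Eq. 1 with the mean and standard deviation of the training population."""
    return (image - mu) / sigma


def bounding_box(mask):
    """Return [center_x, center_y, center_z, height, width, depth] of a binary mask.

    All six values are divided by the image extent along the respective axis
    (an assumed reading of "relative image coordinates").
    """
    shape = np.array(mask.shape, dtype=float)
    coords = np.argwhere(mask)            # indices of all foreground voxels
    mins = coords.min(axis=0)
    maxs = coords.max(axis=0)
    center = (maxs - mins) / 2.0 + mins   # Eq. 2
    size = maxs - mins                    # height, width, depth
    return np.concatenate([center / shape, size / shape])


# Toy example: a box-shaped "meniscus" inside a 10 x 20 x 40 volume.
mask = np.zeros((10, 20, 40), dtype=bool)
mask[2:5, 4:9, 10:31] = True
print(bounding_box(mask))
```
]]></preformat>
<p>Expressing the six values relative to the image extent keeps the regression targets in a fixed range regardless of the scan dimensions.</p>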
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>CNN pipeline for detection of meniscal tears in six sub-regions. Approach <italic>Full-scale</italic> uses a ResNet50 encoder followed by a classifier head with <inline-formula id="inf3">
<mml:math id="m5">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> for classification of meniscal tears in 3D MRI data <bold>(A)</bold>. Approach <italic>BB-crop</italic> reduces the 3D MRI input to the meniscal RoI and uses a DRN-C-26 encoder followed by a classifier head with <inline-formula id="inf4">
<mml:math id="m6">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> to detect meniscal tears <bold>(B)</bold>. Approach <italic>BB-loss</italic> uses a ResNet50 encoder followed by a classifier head with <inline-formula id="inf5">
<mml:math id="m7">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> as well as another bounding box regression head with <inline-formula id="inf6">
<mml:math id="m8">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> and <inline-formula id="inf7">
<mml:math id="m9">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> in order to predict bounding boxes of the menisci in the 3D MRI data <bold>(C)</bold>. The ResNet50 consists of an initial convolutional layer followed by max-pooling and a stack of 16 ResNet bottleneck blocks with residual connections. The DRN-C-26 starts with the same convolutional layer, which is immediately followed by ten residual building blocks and, lastly, two building blocks without a residual connection. After average pooling, the encoders generate 2048 and 512 features, respectively. Finally, SmoothGrad saliency maps are overlaid as heatmaps on the respective MR image to highlight the regions that most influenced the detection of tears (bottom right corner).</p>
</caption>
<graphic xlink:href="fbioe-09-747217-g002.tif"/>
</fig>
<p>For the IW TSE data 600 segmentations are generated in a semi-automated fashion using Amira ZIB Edition<xref ref-type="fn" rid="fn2">
<sup>2</sup>
</xref> (<xref ref-type="bibr" rid="B33">Reddy, 2017</xref>). These masks are defined as voxel-wise annotations of the tissue belonging to the respective meniscus. The method of <xref ref-type="bibr" rid="B43">Tack et&#x20;al. (2018)</xref> was originally developed and evaluated on DESS MRI data. Since the DESS and IW TSE MRI sequences differ significantly in image resolution (number of slices), which could pose an issue for this method, we decided to train the self-adapting nnU-net framework (<xref ref-type="bibr" rid="B22">Isensee et&#x20;al., 2021</xref>) on these 600 training datasets. The nnU-net offers 2D and 3D architectures, with 3D architectures usually yielding better results (<xref ref-type="bibr" rid="B22">Isensee et&#x20;al., 2021</xref>). For this reason, we use a 3D variant of the nnU-net that employs 3D convolutions in an encoder-decoder framework with skip-connections. For the IW TSE data, the nnU-net is automatically configured to have an input size of 24&#x20;&#xd7; 256&#x20;&#xd7; 256 pixels and seven layers of 3D convolutions (<xref ref-type="bibr" rid="B22">Isensee et&#x20;al., 2021</xref>). We train the nnU-net with data augmentation such as random rotations and random cropping using a dice similarity coefficient loss (<xref ref-type="bibr" rid="B22">Isensee et&#x20;al., 2021</xref>) until convergence is reached. Here, the dice similarity coefficient is computed between the output of the nnU-net and the respective hand-labelled target segmentation masks. Afterwards, the nnU-net is employed to segment all 2,396 IW TSE MRI scans to yield the respective meniscal RoIs. To this end, multiple patches of the MRI with a size of 24&#x20;&#xd7; 256&#x20;&#xd7; 256 pixels are processed by the nnU-net. These patches overlap by half of the patch size in each dimension. Finally, the nnU-net framework merges all patches into a final 3D segmentation mask by majority voting for every&#x20;pixel.</p>
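<p>The half-overlapping patch layout described above can be sketched as follows. This is a minimal illustration and not the nnU-net implementation: the function name patch_starts, the example volume size, and the rule of shifting the last patch back so that it ends at the volume border are assumptions made here for illustration only. The per-pixel majority voting is not shown.</p>
<preformat preformat-type="code"><![CDATA[
```python
def patch_starts(length, patch, overlap=0.5):
    """Start indices of patches of size `patch` along an axis of `length` voxels.

    Neighbouring patches overlap by `overlap` times the patch size.
    """
    if length <= patch:
        return [0]
    step = max(1, int(patch * (1 - overlap)))
    starts = list(range(0, length - patch + 1, step))
    # Assumption: shift one extra patch back so the end of the axis is covered.
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


# Hypothetical volume of 44 x 448 x 448 voxels with 24 x 256 x 256 patches.
volume_shape = (44, 448, 448)
patch_shape = (24, 256, 256)
grid = [patch_starts(n, p) for n, p in zip(volume_shape, patch_shape)]
print(grid)
```
]]></preformat>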
</sec>
<sec id="s2-3">
<title>2.3 Model Architecture</title>
<p>Two distinct models, both based on 3D counterparts of ResNet architectures (<xref ref-type="bibr" rid="B17">He et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B47">Yu et&#x20;al., 2017</xref>), are introduced. ResNets have been widely applied in the medical domain and provide good properties due to the employed skip connections. In theory, the residual connections allow the design of very deep ResNets without exhibiting problems of vanishing gradients (<xref ref-type="bibr" rid="B20">Ide and Kurita, 2017</xref>). We have chosen 3D counterparts of 2D ResNets since 3D convolutions inherently capture three-dimensional context. It has previously been shown in the context of musculoskeletal MRI analysis that 3D convolutions are more powerful than concatenating 2D slices or providing multiple 2D slices as input to a CNN that employs 2D convolutions (<xref ref-type="bibr" rid="B2">Ambellan et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B44">Tack and Zachow, 2019</xref>). We adapt these 3D ResNet architectures to the three different approaches and their associated input volume sizes. Each model consists of a ResNet encoder followed by one or two Multi-Layer Perceptron (MLP) heads. The <italic>BB-crop</italic> approach has a Dilation ResNet-C-26 (DRN-C-26) architecture with an MLP head for the multi-label classification. The <italic>Full-scale</italic> approach has a ResNet50 encoder with a classifier MLP head, and the <italic>BB-loss</italic> approach consists of a ResNet50 encoder with two MLP heads. In the <italic>BB-loss</italic> approach, the performance of the classification task is improved by simultaneously solving a second task, namely a bounding box regression. Again, the first MLP head is employed for multi-label classification, while the second MLP head is responsible for the bounding box regression task. All ResNets consist of a series of convolutional layers, each followed by batch normalization (<xref ref-type="bibr" rid="B21">Ioffe and Szegedy, 2015</xref>) and a Rectified Linear Unit (ReLU) activation function (<xref ref-type="bibr" rid="B1">Agarap, 2018</xref>).</p>
<p>Our approaches that will be presented in the following sections are designed based on (a selection of) encoders and MLP heads:</p>
<sec id="s2-3-1">
<title>ResNet50 Encoder</title>
<p>
<xref ref-type="bibr" rid="B17">He et&#x20;al. (2016)</xref> proposed a residual layer connection as a way to train deep neural networks without suffering from vanishing gradients. One of their proposed architectures is the ResNet50, with a total of 50 convolutional layers (see <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). The network comprises an initial convolutional layer with kernel size 7 &#xd7; 7&#x20;&#xd7; 7 followed by a max-pooling layer with kernel size 3 &#xd7; 3&#x20;&#xd7; 3 and stride 2. The following residual layers are grouped in so-called &#x201c;bottleneck blocks&#x201d; (see <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>), which are constructed of three convolutional layers. The first and the last are convolutional layers, with kernel size 1 &#xd7; 1&#x20;&#xd7; 1, where the first one downsamples the number of volume features, and the last one applies feature upsampling. Between these layers, there is a convolutional layer with kernel size 3 &#xd7; 3&#x20;&#xd7; 3. The bottleneck blocks are arranged in four groups of sizes 3, 4, 6, and 3, where each group starts with a stride of 2 in the first convolutional layer to downsample the feature volumes&#x2019; spatial dimensions. Finally, the residual blocks are terminated with a global average pooling (<xref ref-type="bibr" rid="B28">Lin et&#x20;al., 2013</xref>) over the 2048 individual 3D feature volumes coming from the last layer of the ResNet encoder. Computing the average value of each feature map via global average pooling results in a 1D tensor with 2048 features.</p>
</sec>
<sec id="s2-3-2">
<title>Dilation ResNet-C-26 Encoder</title>
<p>The DRN-C-26 is a dilated residual CNN architecture with 26 layers introduced by <xref ref-type="bibr" rid="B47">Yu et&#x20;al. (2017)</xref>. The original ResNet downsamples the input images by a factor of 32. Downsampling our cropped and uneven sized image volumes by such an amount would result in a loss of information about small and salient parts caused by less expressive feature maps. However, simply reducing the convolutional stride restricts the receptive field of subsequent layers. For this reason, <xref ref-type="bibr" rid="B47">Yu et&#x20;al. (2017)</xref> presented an approach with which downsampling could be reduced while sustaining a sufficiently large receptive field and improving classification results. To construct the DRN-C-26&#x20;<xref ref-type="bibr" rid="B47">Yu et&#x20;al. (2017)</xref> applied the following changes to the ResNet18 (<xref ref-type="bibr" rid="B17">He et&#x20;al., 2016</xref>) made of so-called ResNet &#x201c;building blocks&#x201d; with two convolutional layers with kernel size 3 &#xd7; 3&#x20;&#xd7; 3 (see <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). First, the convolutional stride in the last two groups is replaced by dilation. Second, the initial max-pooling layer is replaced by two residual building blocks. Lastly, to reduce aliasing artefacts, a decrease in dilation is added with two final building blocks without residual connections. Again, the residual blocks of the DRN-C-26 are followed by a global average pooling over the 512 feature maps of the last ResNet layer, resulting in a 1D tensor with 512 features.</p>
</sec>
<sec id="s2-3-3">
<title>MLP Heads</title>
<p>The features obtained by the respective ResNet encoders are passed through a simple three-layered feed-forward network, also known as MLP, to achieve the respective classifications and regressions. As shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>, the MLP input dimension matches the feature dimensions of the CNN (i.e.,&#x20;2048 neurons in case of ResNet50 and 512 neurons for a DRN-C-26). The hidden layers of all MLPs consist of 2048 neurons. The classifier head has six output nodes. In the <italic>BB-loss</italic> setting, an additional three-layered MLP with twelve output nodes is added to perform a bounding box regression.</p>
</sec>
</sec>
<sec id="s2-4">
<title>2.4&#x20;<italic>Full-Scale</italic> Approach: Detection of Meniscal Tears in Complete MRI Scans</title>
<p>In our first and most straightforward approach, the complete 3D MRI is provided as input to the CNN. The CNN consists of a ResNet50 encoder followed by an MLP head. The outputs of the MLP after a sigmoid activation represent the probabilities for the six meniscal sub-regions to contain a tear.</p>
<p>The CNN is trained by minimizing the binary cross-entropy loss <inline-formula id="inf8">
<mml:math id="m10">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> for a given batch of <italic>N</italic> samples. With a target matrix <inline-formula id="inf9">
<mml:math id="m11">
<mml:mi mathvariant="bold">Y</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and an output matrix <inline-formula id="inf10">
<mml:math id="m12">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> for all <italic>C</italic> meniscal sub-region labels the definition of <inline-formula id="inf11">
<mml:math id="m13">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> is:<disp-formula id="e3">
<mml:math id="m14">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>l</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>g</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>l</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>g</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:math>
<label>(3)</label>
</disp-formula>where <italic>w</italic>
<sub>
<italic>c</italic>
</sub> is an inverse weighting of label frequencies and <italic>&#x3c3;</italic>(&#x22c5;) is a sigmoid activation function. The <italic>Full-scale</italic> approach is visualized under A) in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>.</p>
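<p>As an illustration of the weighted binary cross-entropy, the following NumPy sketch evaluates the loss for a batch of raw network outputs. It is not the training code of this work. It uses the conventional negative sign of the cross-entropy, and both the clipping constant eps and the helper inverse_frequency_weights (one possible way to obtain an inverse weighting of label frequencies) are assumptions made for this example.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def weighted_bce(logits, targets, class_weights, eps=1e-12):
    """Weighted multi-label BCE, summed over labels and samples, divided by N.

    logits, targets: arrays of shape (N, C); class_weights: shape (C,).
    """
    p = np.clip(sigmoid(logits), eps, 1.0 - eps)  # avoid log(0)
    per_entry = targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)
    return float(-np.sum(class_weights * per_entry) / logits.shape[0])


def inverse_frequency_weights(targets):
    """Assumed weighting: inverse of the positive-label frequency per label."""
    freq = np.clip(targets.mean(axis=0), 1e-6, None)
    return 1.0 / freq


# Two samples, six meniscal sub-regions, uninformative outputs (logit 0).
logits = np.zeros((2, 6))
targets = np.ones((2, 6))
print(weighted_bce(logits, targets, np.ones(6)))  # equals 6 * ln(2)
```
]]></preformat>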
</sec>
<sec id="s2-5">
<title>2.5&#x20;<italic>BB-Crop</italic> Approach: Detection of Meniscal Tears in Cropped MRI Datasets</title>
<p>Cropping 3D MRI data to the meniscal RoI is expected to provide two desirable properties. First, it yields smaller volumes, reducing the required GPU memory as well as the run time. Second, the <italic>Full-scale</italic> 3D MR images can be considered noisy as they provide additional and unnecessary information about surrounding anatomical structures. By cropping the data to the RoI of the menisci, this unnecessary information is suppressed. Leveraging the RoI generated as described in <xref ref-type="sec" rid="s2-2">section 2.2</xref>, the 3D MR images are cropped with a 5% margin around the menisci. Each cropped image is then resampled with trilinear interpolation to the closest multiples of 16, given the biggest bounding box in the training set. <xref ref-type="fig" rid="F2">Figure&#x20;2</xref> visualizes the cropping and resampling process. Consequently, the cropped and resampled images have a size of (64, 64, 176) for the DESS data and (16, 64, 176) for the IW TSE data. <italic>BB-crop</italic> utilizes a Dilation ResNet-C-26 encoder followed by an MLP classifier head. The CNN is trained by minimizing the <inline-formula id="inf12">
<mml:math id="m15">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> as given in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref>. The framework is visualized under B) in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>.</p>
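<p>The cropping step can be sketched in a few lines of NumPy. The example below is an assumption-laden illustration, not the code used in this work: the function names are hypothetical, the margin is interpreted as 5% of the box extent per axis (rounded up), and the resampling itself (trilinear interpolation) is omitted. The second helper only shows how a size can be snapped to the closest multiple of 16.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def crop_box_with_margin(mask, margin=0.05):
    """Bounding box of a binary mask, enlarged by `margin` of its extent per axis.

    Returns (lo, hi) with inclusive lower and exclusive upper voxel indices,
    clipped to the volume.
    """
    coords = np.argwhere(mask)
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1
    pad = np.ceil(margin * (hi - lo)).astype(int)
    lo = np.maximum(lo - pad, 0)
    hi = np.minimum(hi + pad, mask.shape)
    return lo, hi


def nearest_multiple(size, base=16):
    """Closest positive multiple of `base` to `size`."""
    return max(base, int(round(size / base)) * base)


# Hypothetical segmentation mask inside a 40 x 100 x 200 volume.
mask = np.zeros((40, 100, 200), dtype=bool)
mask[8:32, 20:70, 15:185] = True
lo, hi = crop_box_with_margin(mask)
cropped = mask[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
print(lo, hi, cropped.shape)
```
]]></preformat>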
</sec>
<sec id="s2-6">
<title>2.6&#x20;<italic>BB-Loss</italic> Approach: Detection of Meniscal Tears in Complete MRI Scans Enhanced by Regression of Meniscal Bounding Boxes</title>
<p>The <italic>BB-crop</italic> approach requires segmentation of both menisci (or at least the determination of a meniscal region) in training and testing. Since generating segmentations is time-consuming (the method of <xref ref-type="bibr" rid="B43">Tack et&#x20;al. (2018)</xref> requires approximately 5&#xa0;min of run time), it is beneficial to avoid this step. Moreover, this approach heavily relies on high-quality bounding boxes in training and inference, which are difficult to obtain and strongly influence the performance. Thus, the motivation for our final <italic>BB-loss</italic> approach is to detect meniscal tears in 3D MRI data without extensive pre-processing requirements such as segmenting the menisci or computing bounding boxes for meniscal regions. Instead, the location of the menisci is added as an additional loss term for the training. The encoder is kept identical to the <italic>Full-scale</italic> approach, namely a ResNet50 encoder. Furthermore, an identical MLP head is utilized for the meniscal tear detection. Additionally, we show that the meniscal position information helps the CNN to focus on these regions in the image, yielding better results. A second MLP head is employed in the <italic>BB-loss</italic> approach to regress the coordinates of the meniscal RoI. By incorporating this knowledge as a loss in the training process, the locations of the menisci need not be explicitly provided at test time. The total loss in the <italic>BB-loss</italic> setting is computed considering the multi-label classification and the bounding box regression task. For detection of meniscal tears <inline-formula id="inf13">
<mml:math id="m16">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> is employed (<xref ref-type="disp-formula" rid="e3">Eq. 3</xref>). In the bounding box regression, the outputs of the MLP head are <italic>d</italic>&#x20;&#x3d;&#x20;6 coordinates each for the MM and the LM. Utilizing a sigmoid activation function, these values are given as relative positions within the image in a range of [0, 1] of the respective dimension. For a detailed description of the bounding box generation procedure, we refer the reader to <xref ref-type="sec" rid="s2-2">section 2.2</xref>. The first component of the bounding box loss is an L1-term <inline-formula id="inf14">
<mml:math id="m17">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> defined as<disp-formula id="e4">
<mml:math id="m18">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mo stretchy="false">&#x2016;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">&#x2016;</mml:mo>
<mml:mo>,</mml:mo>
</mml:math>
<label>(4)</label>
</disp-formula>with a predicted bounding box <inline-formula id="inf15">
<mml:math id="m19">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> and a target bounding box <italic>B</italic> that is derived from the automated segmentation masks. These <italic>N</italic>&#x20;&#xd7; 2<italic>d</italic> matrices contain <italic>N</italic> rows with medial and lateral bounding box values. Here, <italic>b</italic>
<sub>
<italic>n</italic>,<italic>i</italic>
</sub> and <inline-formula id="inf16">
<mml:math id="m20">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> describe the <italic>n</italic>th element of the batch and the <italic>i</italic>th value of the concatenated bounding boxes. With this formulation the loss is given as<disp-formula id="e5">
<mml:math id="m21">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>.</mml:mo>
</mml:math>
<label>(5)</label>
</disp-formula>
</p>
<p>The second component of the bounding box loss is a modified Intersection over Union (IoU) term, more specifically the Generalized-IoU (GIoU) <inline-formula id="inf17">
<mml:math id="m22">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> (<xref ref-type="bibr" rid="B35">Rezatofighi et&#x20;al., 2019</xref>) defined as:<disp-formula id="e6">
<mml:math id="m23">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo>&#x5c;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mo>&#x222a;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<label>(6)</label>
</disp-formula>where <italic>C</italic> is the convex hull enclosing the predicted and the target box. The operator &#x7c;&#x22c5;&#x7c; computes the box volume. The convex hull is the smallest possible region that encloses both the output and the target bounding boxes. It can be defined as a bounding box, fully characterised by the 6 coordinates elaborated above. It is computed by taking the minimum and maximum extent of both the target bounding box and the predicted bounding box coordinates along the x-, y- and z-axis. The numerator of the third term of the <inline-formula id="inf18">
<mml:math id="m24">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> is the convex hull volume minus the volume of the union of <italic>B</italic> and <inline-formula id="inf19">
<mml:math id="m25">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>, and the denominator is the volume of the convex hull. Hence, the third term of the <inline-formula id="inf20">
<mml:math id="m26">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> can be considered as the relative volume of the convex hull not covered by the union of predicted and target bounding box. The IoU is defined as <inline-formula id="inf21">
<mml:math id="m27">
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mo>&#x2229;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mo>&#x222a;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>, that is, the ratio of the intersecting voxels of <italic>B</italic> and <inline-formula id="inf22">
<mml:math id="m28">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> to their union. The <inline-formula id="inf23">
<mml:math id="m29">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> is computed for each meniscal RoI and averaged for the given batch. The overall loss <inline-formula id="inf24">
<mml:math id="m30">
<mml:mi mathvariant="script">L</mml:mi>
</mml:math>
</inline-formula> for the <italic>BB-loss</italic> approach is given as<disp-formula id="e7">
<mml:math id="m31">
<mml:mi mathvariant="script">L</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:math>
<label>(7)</label>
</disp-formula>
</p>
<p>The <italic>BB-loss</italic> approach is visualized under C) in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>.</p>
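<p>To make the two bounding box terms concrete, the following NumPy sketch evaluates the GIoU loss for a single pair of axis-aligned 3D boxes and the L1 term for a batch. It is an illustration under stated assumptions rather than the implementation used in this work: boxes are assumed to be stored as (x0, y0, z0, x1, y1, z1) with non-zero volume, and the function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def box_volume(box):
    """Volume of an axis-aligned box (x0, y0, z0, x1, y1, z1); 0 if empty."""
    extent = np.clip(box[3:] - box[:3], 0.0, None)
    return float(np.prod(extent))


def giou_loss(target, pred):
    """1 - IoU + (|C| - |union|) / |C| for two axis-aligned 3D boxes."""
    target = np.asarray(target, dtype=float)
    pred = np.asarray(pred, dtype=float)
    # Intersection: largest lower corner, smallest upper corner.
    inter = np.concatenate([np.maximum(target[:3], pred[:3]),
                            np.minimum(target[3:], pred[3:])])
    # Enclosing box C: smallest lower corner, largest upper corner.
    hull = np.concatenate([np.minimum(target[:3], pred[:3]),
                           np.maximum(target[3:], pred[3:])])
    v_inter = box_volume(inter)
    v_union = box_volume(target) + box_volume(pred) - v_inter
    v_hull = box_volume(hull)
    return 1.0 - v_inter / v_union + (v_hull - v_union) / v_hull


def l1_loss(targets, preds):
    """Sum of absolute coordinate differences, averaged over the batch."""
    targets = np.asarray(targets, dtype=float)
    preds = np.asarray(preds, dtype=float)
    return float(np.abs(targets - preds).sum() / targets.shape[0])


print(giou_loss([0, 0, 0, 2, 1, 1], [1, 0, 0, 3, 1, 1]))  # half-overlapping boxes
```
]]></preformat>
<p>In this sketch, identical boxes give a loss of 0, and disjoint boxes give a value between 1 and 2 that grows with their distance, so a gradient signal remains even when the boxes do not overlap.</p>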
</sec>
<sec id="s2-7">
<title>2.7 Experimental Setup and Training of CNNs</title>
<p>The given MRI data of the OAI are randomly split into 50% training data, 15% validation data and 35% testing data. Hence, our two experiments have 1200/359/840 and 1197/359/840 training/validation/testing scans for the DESS data and the IW TSE data, respectively. We implemented the CNNs of all approaches in PyTorch 1.9. Convolutional weights are initialized using a normal distribution as in <xref ref-type="bibr" rid="B18">He et&#x20;al. (2015)</xref>, which is tailored towards deep neural networks with asymmetric ReLU activation functions, while batch normalization weights and biases are initialized with the constants 1 and 0, respectively. We train our CNNs on an Nvidia A100 GPU with 40&#xa0;GB memory. For training our three ResNets, separate learning rates and dropout probabilities are introduced for the ResNet encoders and the MLP heads. Suitable learning rate, dropout and batch size hyper-parameters are found using the validation data of the DESS scans. The learning rate values for all parts (ResNet encoder, classifier head and bounding box head) are evaluated in an interval of [<inline-formula id="inf31">
<mml:math id="m39">
<mml:mn>1</mml:mn>
<mml:mi>e</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>5</mml:mn>
</mml:math>
</inline-formula>, 0.01]. Dropout probabilities are varied in an interval of [0.1, 0.9]. Further, the training batch size, which is limited by the input size, is varied from 2 to 64 for the <italic>BB-crop</italic> approach. Due to the larger input volumes in the <italic>Full-scale</italic> and <italic>BB-loss</italic> approaches, their batch size is kept constant at a value of 4. For a complete summary of our hyper-parameter values, please refer to <xref ref-type="sec" rid="s12">Supplementary Table S2</xref>. Training is performed using the Adam optimizer (<xref ref-type="bibr" rid="B25">Kingma and Ba, 2014</xref>) with <italic>&#x3b2;</italic>
<sub>1</sub> &#x3d; 0.9, <italic>&#x3b2;</italic>
<sub>2</sub> &#x3d; 0.999 and <italic>&#x3f5;</italic> &#x3d;<inline-formula id="inf32">
<mml:math id="m40">
<mml:mn>1</mml:mn>
<mml:mi>e</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>08</mml:mn>
</mml:math>
</inline-formula> with a learning rate decay of 0.5 every 50 epochs. Training on the IW TSE sequence is not performed from scratch; instead, both ResNet encoder and MLP weights are fine-tuned. In both the DRN-C-26 and the ResNet50 case, we start from the CNNs that achieve the lowest validation loss on the DESS sequence.</p>
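<p>The stated decay schedule corresponds to a step function of the epoch index, which the following short sketch makes explicit. The function name and the base learning rate of 0.01 are chosen for illustration only. In PyTorch, the same behaviour is provided by torch.optim.lr_scheduler.StepLR with step_size=50 and gamma=0.5, although the text does not state which mechanism was used here.</p>
<preformat preformat-type="code"><![CDATA[
```python
def step_decay_lr(base_lr, epoch, gamma=0.5, step=50):
    """Learning rate at `epoch` when it is multiplied by `gamma` every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


# Illustrative base learning rate; the values actually used are hyper-parameters.
for epoch in (0, 49, 50, 100):
    print(epoch, step_decay_lr(0.01, epoch))
```
]]></preformat>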
<p>On-the-fly data augmentation is performed during training. Specifically, random cropping around the RoI, horizontal flips, rotations, Gaussian noise, and intensity scaling are applied with 50% probability. For the <italic>Full-scale</italic> approach, we perform random cropping of up to 10% along the coronal, 20% along the sagittal and 20% along the axial direction. In the <italic>BB-crop</italic> approach, random crops are performed by uniformly cropping within a 20% margin around the menisci. The <italic>BB-loss</italic> approach uniformly samples possible crops around the menisci. All cropped images are resampled with trilinear interpolation to attain consistent sizes per approach and dataset. Input images for the <italic>Full-scale</italic> and <italic>BB-loss</italic> approaches are resampled to (160, 384, 384) for the DESS sequence data and to (44, 448, 448) for the IW TSE images. The <italic>BB-crop</italic> approach resamples to (64, 64, 176) and (16, 64, 176), respectively. The added Gaussian noise is pixel-wise sampled as <inline-formula id="inf25">
<mml:math id="m32">
<mml:mi>&#x3f5;</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0.1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.5</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. The random rotation is uniformly sampled from <inline-formula id="inf26">
<mml:math id="m33">
<mml:mi mathvariant="script">U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>&#xb0;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>&#xb0;</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and image intensity is scaled by a uniformly sampled multiplication factor <inline-formula id="inf27">
<mml:math id="m34">
<mml:mi>b</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0.9</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1.1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
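<p>The sampling scheme described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function names are hypothetical, the second parameter of the Gaussian is assumed to be a standard deviation, noise is assumed to be added before intensity scaling, and the geometric transforms (flip, rotation, cropping) are only sampled here, not applied, because applying them requires an image library.</p>
<preformat><![CDATA[
```python
import random


def sample_augmentation(rng, p=0.5):
    """Draw one set of augmentation parameters; each transform is kept with probability p."""
    params = {}
    if rng.random() < p:
        params["flip"] = True  # horizontal flip
    if rng.random() < p:
        params["rotation_deg"] = rng.uniform(-5.0, 5.0)  # U(-5 deg, +5 deg)
    if rng.random() < p:
        params["noise"] = (0.1, 0.5)  # N(0.1, 0.5); 0.5 is ASSUMED to be the std
    if rng.random() < p:
        params["scale"] = rng.uniform(0.9, 1.1)  # multiplication factor b
    return params


def apply_intensity_ops(voxels, params, rng):
    """Apply only the intensity transforms to a flat list of voxel values."""
    out = list(voxels)
    if "noise" in params:
        mean, std = params["noise"]
        out = [v + rng.gauss(mean, std) for v in out]  # voxel-wise noise
    if "scale" in params:
        out = [v * params["scale"] for v in out]
    return out


rng = random.Random(0)
params = sample_augmentation(rng)
augmented = apply_intensity_ops([0.0, 0.5, 1.0], params, rng)
```
]]></preformat>
<p>Passing an explicit random generator keeps the draws reproducible, which helps when comparing runs.</p>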
</sec>
<sec id="s2-8">
<title>2.8 Statistical Assessment of Detection Quality</title>
<p>For all experiments, we plot the true positive rate (TPR&#x20;&#x3d;&#x20;sensitivity) against the false positive rate (FPR&#x20;&#x3d;&#x20;1&#x20;&#x2212;&#x20;specificity) at various decision thresholds to create ROC curves (<xref ref-type="bibr" rid="B7">Brown and Davis, 2006</xref>). Additionally, we compute the ROC AUC to assess the quality of our classifiers. The quality of our predicted bounding boxes is assessed by computing the IoU with the target bounding boxes. We consider an IoU above 0.5 a successful localization of the menisci, since this is a common threshold in object detection tasks (<xref ref-type="bibr" rid="B15">Girshick et&#x20;al., 2014</xref>).</p>
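<p>Both quality measures can be illustrated with a short, self-contained sketch. It is not the evaluation code of this study: the box layout (z0, y0, x0, z1, y1, x1) and the function names are assumptions, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case) instead of integrating the ROC curve.</p>
<preformat><![CDATA[
```python
def box_volume(box):
    """Volume of an axis-aligned 3D box (z0, y0, x0, z1, y1, x1)."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])


def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for k in range(3):
        lo = max(a[k], b[k])
        hi = min(a[k + 3], b[k + 3])
        if hi <= lo:  # no overlap along this axis
            return 0.0
        inter *= hi - lo
    return inter / (box_volume(a) + box_volume(b) - inter)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


iou = iou_3d((0, 0, 0, 2, 2, 2), (1, 0, 0, 3, 2, 2))
localized = iou > 0.5  # threshold used in the text
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```
]]></preformat>
<p>The pairwise AUC is quadratic in the number of cases, which is acceptable for a sketch; a sort-based variant would be preferable for large test sets.</p>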
</sec>
<sec id="s2-9">
<title>2.9 SmoothGrad Saliency Map Visualizations for Areas Addressed by the CNN</title>
<p>Gradient saliency maps (<xref ref-type="bibr" rid="B40">Simonyan et&#x20;al., 2013</xref>), also called pixel attribution maps or sensitivity maps, highlight the pixel regions of the input image that most strongly influenced a neural network&#x2019;s decision. To obtain such pixel attributions, one computes the derivative of the output of the final linear layer of the network with respect to the input via back-propagation. More formally, a gradient saliency map <italic>S</italic>
<sub>
<italic>c</italic>
</sub> for a sub-region <italic>c</italic> for which our neural network <italic>f</italic> yields a detection of meniscal tears is calculated as:<disp-formula id="e8">
<mml:math id="m35">
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">I</mml:mi>
</mml:mrow>
<mml:mo>&#x303;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>&#x2202;</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">I</mml:mi>
</mml:mrow>
<mml:mo>&#x303;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2202;</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">I</mml:mi>
</mml:mrow>
<mml:mo>&#x303;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(8)</label>
</disp-formula>
</p>
<p>For our two most promising approaches, <italic>BB-crop</italic> and <italic>BB-loss</italic>, these maps are computed with a slight enhancement of the original mechanism, the SmoothGrad method (<xref ref-type="bibr" rid="B41">Smilkov et&#x20;al., 2017</xref>). Similar to the SmoothGrad approach of <xref ref-type="bibr" rid="B41">Smilkov et&#x20;al. (2017)</xref>, we augment the input image slightly by introducing noise, so that averaging smooths out the saliency maps obtained at different noise levels. We apply Gaussian distributed noise <inline-formula id="inf28">
<mml:math id="m36">
<mml:mi>&#x3f5;</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0.1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.5</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>, random horizontal flips, uniformly sampled rotations <inline-formula id="inf29">
<mml:math id="m37">
<mml:mi>r</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>&#xb0;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>&#xb0;</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and a uniformly sampled pixel intensity scaling with a multiplication factor <inline-formula id="inf30">
<mml:math id="m38">
<mml:mi>b</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0.9</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1.1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. Each image is augmented 20&#x20;times with a probability of 50<italic>%</italic> per augmentation, and the resulting maps are averaged.</p>
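<p>The averaging step can be sketched on a toy model as follows. This is an illustration under stated assumptions, not the code used for the figures: the model <italic>f</italic>(<italic>x</italic>) = tanh(<italic>w</italic>&#xb7;<italic>x</italic>) and its analytic gradient stand in for the CNN and back-propagation, all function names are hypothetical, the second Gaussian parameter is assumed to be a standard deviation, and only the intensity augmentations (noise and scaling) are included because flips and rotations would additionally require mapping each gradient back to the original image grid.</p>
<preformat><![CDATA[
```python
import math
import random


def toy_model_grad(x, w):
    """Gradient of the toy score f(x) = tanh(w . x) with respect to the input x."""
    s = math.tanh(sum(wi * xi for wi, xi in zip(w, x)))
    return [(1.0 - s * s) * wi for wi in w]


def smoothgrad(x, grad_fn, n_samples=20, p=0.5, rng=None):
    """Average the gradient over n_samples randomly augmented copies of x."""
    rng = rng or random.Random()
    total = [0.0] * len(x)
    for _ in range(n_samples):
        xa = list(x)
        if rng.random() < p:  # additive Gaussian noise, N(0.1, 0.5), std ASSUMED
            xa = [v + rng.gauss(0.1, 0.5) for v in xa]
        if rng.random() < p:  # intensity scaling, b ~ U(0.9, 1.1)
            b = rng.uniform(0.9, 1.1)
            xa = [v * b for v in xa]
        grad = grad_fn(xa)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]


weights = [1.0, -2.0, 0.0]
image = [0.2, 0.1, 0.3]
saliency = smoothgrad(image, lambda v: toy_model_grad(v, weights), rng=random.Random(0))
```
]]></preformat>
<p>With <italic>p</italic> set to zero the function reduces to the plain gradient saliency map of <xref ref-type="disp-formula" rid="e8">Eq. 8</xref>, which is a convenient sanity check.</p>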
</sec>
</sec>
<sec id="s3">
<title>3 Results</title>
<p>We applied all approaches to DESS as well as IW TSE data from the OAI database. Each of our approaches detects meniscal tears for the MM and the LM. In particular, tears are detected in the three anatomical sub-regions: anterior horn, meniscal body, and posterior horn. All results are presented in this section.</p>
<sec id="s3-1">
<title>3.1 Detection of Meniscal Tears in DESS MRI Data</title>
<p>Employing the <italic>Full-scale</italic> approach, the AUC values are 0.74, 0.84, 0.85 for the anterior horn, body, and posterior horn of the MM. For the LM, the AUC values are 0.94, 0.92, 0.91. The <italic>BB-crop</italic> approach usually yields higher AUC values, namely 0.87, 0.89, 0.89 for the MM and 0.95, 0.93, 0.91 for the LM. The <italic>BB-loss</italic> approach gives the highest AUC values, namely 0.94, 0.93, 0.93 for the MM and 0.96, 0.94, 0.91 for the LM. The ROC curves of all three approaches are shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>. In addition, all ROC AUC results are summarized in <xref ref-type="table" rid="T2">Table&#x20;2</xref>.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>ROC curves for detection of meniscal tears in DESS MRI&#x20;data.</p>
</caption>
<graphic xlink:href="fbioe-09-747217-g003.tif"/>
</fig>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>ROC AUC results for medial menisci (MM) and lateral menisci (LM) in DESS MRI data.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th colspan="3" align="center">MM</th>
<th colspan="3" align="center">LM</th>
</tr>
<tr>
<td align="left">Method</td>
<td align="center">Anterior</td>
<td align="center">Body</td>
<td align="center">Posterior</td>
<td align="center">Anterior</td>
<td align="center">Body</td>
<td align="center">Posterior</td>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">
<italic>Full-scale</italic>
</td>
<td align="center">0.74</td>
<td align="center">0.84</td>
<td align="center">0.85</td>
<td align="center">0.94</td>
<td align="center">0.92</td>
<td align="center">0.91</td>
</tr>
<tr>
<td align="left">
<italic>BB-crop</italic>
</td>
<td align="center">0.87</td>
<td align="center">0.89</td>
<td align="center">0.89</td>
<td align="center">0.95</td>
<td align="center">0.93</td>
<td align="center">
<bold>0.91</bold>
</td>
</tr>
<tr>
<td align="left">
<italic>BB-loss</italic>
</td>
<td align="center">
<bold>0.94</bold>
</td>
<td align="center">
<bold>0.93</bold>
</td>
<td align="center">
<bold>0.93</bold>
</td>
<td align="center">
<bold>0.96</bold>
</td>
<td align="center">
<bold>0.94</bold>
</td>
<td align="center">
<bold>0.91</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The best results for each anatomical sub-region are highlighted in bold.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s3-2">
<title>3.2 Detection of Meniscal Tears in IW TSE MRI Data</title>
<p>Employing the <italic>Full-scale</italic> approach, the AUC values are 0.82, 0.87, 0.82 for the anterior horn, body, and posterior horn of the MM. For the LM, the AUC values are 0.88, 0.85, 0.85. The <italic>BB-crop</italic> approach usually yields higher AUC values, namely 0.84, 0.89, 0.86 for the MM and 0.92, 0.90, 0.90 for the LM. The <italic>BB-loss</italic> approach gives similar AUC values, namely 0.84, 0.88, 0.86 for the MM and 0.95, 0.91, 0.90 for the LM. The ROC curves of all approaches are shown in <xref ref-type="fig" rid="F4">Figure&#x20;4</xref>. Further, all AUC values are summarized in <xref ref-type="table" rid="T3">Table&#x20;3</xref>.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>ROC curves for detection of meniscal tears in IW TSE MRI&#x20;data.</p>
</caption>
<graphic xlink:href="fbioe-09-747217-g004.tif"/>
</fig>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>ROC AUC results for medial menisci (MM) and lateral menisci (LM) in IW TSE data.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th colspan="3" align="center">MM</th>
<th colspan="3" align="center">LM</th>
</tr>
<tr>
<td align="left">Method</td>
<td align="center">Anterior</td>
<td align="center">Body</td>
<td align="center">Posterior</td>
<td align="center">Anterior</td>
<td align="center">Body</td>
<td align="center">Posterior</td>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">
<italic>Full-scale</italic>
</td>
<td align="center">0.82</td>
<td align="center">0.87</td>
<td align="center">0.82</td>
<td align="center">0.88</td>
<td align="center">0.85</td>
<td align="center">0.85</td>
</tr>
<tr>
<td align="left">
<italic>BB-crop</italic>
</td>
<td align="center">
<bold>0.84</bold>
</td>
<td align="center">
<bold>0.89</bold>
</td>
<td align="center">
<bold>0.86</bold>
</td>
<td align="center">0.92</td>
<td align="center">0.90</td>
<td align="center">
<bold>0.90</bold>
</td>
</tr>
<tr>
<td align="left">
<italic>BB-loss</italic>
</td>
<td align="center">
<bold>0.84</bold>
</td>
<td align="center">0.88</td>
<td align="center">
<bold>0.86</bold>
</td>
<td align="center">
<bold>0.95</bold>
</td>
<td align="center">
<bold>0.91</bold>
</td>
<td align="center">
<bold>0.90</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The best results for each anatomical sub-region are highlighted in bold.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s3-3">
<title>3.3 Localization of Menisci via the <italic>BB-Loss</italic> Approach</title>
<p>To investigate the bounding box regression quality of the proposed method, we evaluate the distribution of the IoU values for the predicted bounding boxes (<xref ref-type="fig" rid="F5">Figure&#x20;5</xref>). For the DESS dataset (our primary benchmark), we observed a very high quality of MM and LM bounding box predictions, with IoU values close to normally distributed around a mean of 0.71 (95% confidence interval (CI): 0.71&#x2013;0.72) with a standard deviation of 0.13. With an IoU threshold of 0.5, we conclude that 95% of the resulting bounding boxes are identified correctly. For the IW TSE dataset, we observed a clear decrease in object detection performance, with a mean IoU of 0.58 (95% CI: 0.57&#x2013;0.59) and a standard deviation of 0.14. Applying the same detection threshold as above, we find that only around 76% of the menisci were detected correctly, with the overall quality of the bounding boxes being more widely spread.</p>
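<p>As a plausibility check on the DESS figures, the share of boxes above the IoU threshold can be estimated from the reported mean and standard deviation alone. The sketch below assumes an exactly normal IoU distribution, which the text only describes as approximate for DESS and does not claim for IW TSE; the function name is hypothetical.</p>
<preformat><![CDATA[
```python
import math


def fraction_above(threshold, mean, std):
    """P(X > threshold) for X ~ Normal(mean, std**2), via the complementary error function."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Reported summary statistics of the IoU distributions.
dess = fraction_above(0.5, mean=0.71, std=0.13)
iw_tse = fraction_above(0.5, mean=0.58, std=0.14)
```
]]></preformat>
<p>Under this assumption the DESS estimate comes out close to the reported 95%. The IW TSE estimate need not reproduce the reported 76%, which is an empirical count over a distribution that may deviate from normality.</p>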
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>The distribution of the IoU values for the bounding boxes of MM and LM in DESS and IW TSE MRI&#x20;data.</p>
</caption>
<graphic xlink:href="fbioe-09-747217-g005.tif"/>
</fig>
</sec>
<sec id="s3-4">
<title>3.4 Visualization of Areas Addressed by the CNN</title>
<p>
<xref ref-type="fig" rid="F6">Figure&#x20;6</xref> shows SmoothGrad saliency maps for the <italic>BB-crop</italic> and <italic>BB-loss</italic> approach overlaid to MR images. Examples are shown for randomly selected test cases, displaying different kinds of meniscal tears for DESS and IW TSE data. The RoIs for the <italic>BB-loss</italic> approach were extracted using predicted bounding boxes and the respective close-ups are shown. Red arrows point at the location of meniscal defects. Most saliency maps obtained this way display a plausible localization of the meniscal tears. The plausibility of these maps was qualitatively evaluated by their correspondence to the target labels of the regions in which the tears could also be confirmed with the help of visual inspection of the image data. SmoothGrad saliency maps are capable of highlighting more than just one affected sub-region, i.e.,&#x20;in the presence of defects in multiple sub-regions of one meniscus, one similarly observes these being correctly highlighted. With the Dilation ResNet-C-26 employed in the <italic>BB-crop</italic> approach, we observed that this CNN yields smoother and less noisy SmoothGrad saliency maps. However, in many cases, ResNet-50 saliency maps targeted the affected region better, but did not outline this region sharply.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>SmoothGrad saliency maps overlaid on DESS MRI data <bold>(A)</bold> and IW TSE MRI data <bold>(B)</bold>.</p>
</caption>
<graphic xlink:href="fbioe-09-747217-g006.tif"/>
</fig>
</sec>
<sec id="s3-5">
<title>3.5 Detection Performance&#x2014;Different Sub-regions and Defect Types</title>
<p>Even though the occurrence of defects varies between meniscal sub-regions (see <xref ref-type="sec" rid="s12">Supplementary Figure S2</xref>), we observe only minimal differences between the AUC values of the sub-regions in DESS MRI data (cf. <xref ref-type="table" rid="T2">Table&#x20;2</xref>). However, we analyzed the false positive classifications and found that for all sub-regions, signal abnormalities were misclassified more often than normal menisci (see <xref ref-type="sec" rid="s12">Supplementary Figure S1</xref>). The misclassification rate of signal abnormalities is highest for the posterior horn of the lateral meniscus, the region with the lowest AUC for the DESS data. Conversely, the lowest signal abnormality misclassification rate occurs in the posterior horn of the medial meniscus, the sub-region with the highest number of signal abnormalities (<xref ref-type="sec" rid="s12">Supplementary Table&#x20;S1</xref>).</p>
<p>The least common types of tears in the data are radial and vertical tears, with 72 and 69 cases, respectively. Vertical tears were the most challenging for our method to detect in DESS data and led to the most false negative results (see <xref ref-type="sec" rid="s12">Supplementary Figure S2</xref>). Radial meniscal tears yielded the second highest rate of misclassifications.</p>
</sec>
</sec>
<sec id="s4">
<title>4 Discussion</title>
<p>The primary goal of our work was to develop a method that provides an efficient, robust and automated way to detect and better locate meniscal tears in MRI data, that is, the detection of tears with respect to the anatomical regions in which they occur. We devised a procedure that utilizes a 3D CNN to process arbitrary 3D MRI data without the need for any extensive pre-processing.</p>
<p>Many previously proposed methods already yield a high accuracy in the detection of meniscal tears. To compare our results to the related work, we focus our assessment on the results of our <italic>BB-loss</italic> approach on the DESS MRI data. Our method detects meniscal tears in anatomical sub-regions of the MM and LM; it has not been explicitly trained to detect tears for the entire knee or for each of the two menisci. Therefore, to obtain the respective values, we performed max operations on our CNNs&#x2019; outputs. A comparison of the different approaches with their respective detection AUC is summarized in <xref ref-type="table" rid="T4">Table&#x20;4</xref>. Our <italic>BB-loss</italic> approach achieved state-of-the-art results in detecting meniscal tears in the medial and lateral meniscus, with AUCs of 0.94 and 0.93, respectively. For the task of meniscal tear detection in the entire knee, our <italic>BB-loss</italic> approach achieved an AUC of 0.94 and is second to the approach of <xref ref-type="bibr" rid="B14">Fritz et&#x20;al. (2020)</xref>. However, the methods proposed in the related work still leave a desire for a more precise spatial assignment of the findings, for instance, localizing tears per meniscus or in anatomical sub-regions thereof. For tear detection per meniscus, our method performs better than related work (<xref ref-type="bibr" rid="B14">Fritz et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B36">Rizk et&#x20;al., 2021</xref>). The main novelty of our method, however, is the detection of tears for each anatomical sub-region of the menisci in 3D MRI data, providing an anatomically more detailed localization.</p>
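<p>The max operations mentioned above can be sketched as follows. This is an assumed form of the aggregation, not the code of this study: the function name and the probability values are hypothetical, and the keys follow the row labels of the comparison table (AH, B, PH for anterior horn, body, posterior horn).</p>
<preformat><![CDATA[
```python
def aggregate_max(sub_region_probs):
    """Collapse six sub-region tear probabilities into per-meniscus and per-knee scores."""
    mm = max(sub_region_probs[k] for k in ("MM-AH", "MM-B", "MM-PH"))
    lm = max(sub_region_probs[k] for k in ("LM-AH", "LM-B", "LM-PH"))
    return {"Any MM": mm, "Any LM": lm, "Anywhere": max(mm, lm)}


# Hypothetical network outputs for one knee (illustrative values only).
example = {
    "MM-AH": 0.05, "MM-B": 0.10, "MM-PH": 0.80,
    "LM-AH": 0.02, "LM-B": 0.30, "LM-PH": 0.20,
}
scores = aggregate_max(example)
```
]]></preformat>
<p>Taking the maximum means a meniscus is scored as torn as soon as any of its sub-regions is, which matches the "any tear" labels used by the methods in the comparison.</p>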
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Comparison of our results on DESS MRI data to the related work. The &#x201c;3D data&#x201d; column indicates whether the method is trained on and applied to complete 3D MR images. The explainable AI &#x201c;XAI&#x201d; column indicates if concepts of saliency maps are employed in order to highlight the areas responsible for the CNNs&#x2019; decisions.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">
<xref ref-type="bibr" rid="B37">Roblot et&#x20;al. (2019)</xref>&#x2a;</th>
<th align="center">
<xref ref-type="bibr" rid="B9">Couteaux et&#x20;al. (2019)</xref>&#x2a;</th>
<th align="center">
<xref ref-type="bibr" rid="B6">Bien et&#x20;al. (2018)</xref>
</th>
<th align="center">
<xref ref-type="bibr" rid="B30">Pedoia et&#x20;al. (2019)</xref>
</th>
<th align="center">
<xref ref-type="bibr" rid="B45">Tsai et&#x20;al. (2020)</xref>
</th>
<th align="center">
<xref ref-type="bibr" rid="B3">Azcona et&#x20;al. (2020)</xref>
</th>
<th align="center">
<xref ref-type="bibr" rid="B14">Fritz et&#x20;al. (2020)</xref>
</th>
<th align="center">
<xref ref-type="bibr" rid="B36">Rizk et&#x20;al. (2021)</xref>
</th>
<th align="center">Ours: <italic>Full-scale</italic>
</th>
<th align="center">Ours: <italic>BB-crop</italic>
</th>
<th align="center">Ours: <italic>BB-loss</italic>
</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">3D data</td>
<td align="center">&#xd7;</td>
<td align="center">&#xd7;</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
</tr>
<tr>
<td align="left">XAI</td>
<td align="center">&#xd7;</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">&#xd7;</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">&#xd7;</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
</tr>
<tr>
<td align="left">Anywhere</td>
<td align="center">0.94</td>
<td align="center">0.906</td>
<td align="center">0.847</td>
<td align="center">0.89</td>
<td align="center">0.904 and 0.913</td>
<td align="center">0.934</td>
<td align="center">
<bold>0.961</bold>
</td>
<td align="center">&#x2014;</td>
<td align="center">0.81</td>
<td align="center">0.89</td>
<td align="center">0.94</td>
</tr>
<tr>
<td align="left">Any MM</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.882</td>
<td align="center">0.93</td>
<td align="center">0.79</td>
<td align="center">0.89</td>
<td align="center">
<bold>0.94</bold>
</td>
</tr>
<tr>
<td align="left">Any LM</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.781</td>
<td align="center">0.84</td>
<td align="center">0.87</td>
<td align="center">0.92</td>
<td align="center">
<bold>0.93</bold>
</td>
</tr>
<tr>
<td align="left">MM-AH</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.85</td>
<td align="center">0.84</td>
<td align="center">
<bold>0.94</bold>
</td>
</tr>
<tr>
<td align="left">MM-B</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.82</td>
<td align="center">0.89</td>
<td align="center">
<bold>0.93</bold>
</td>
</tr>
<tr>
<td align="left">MM-PH</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.78</td>
<td align="center">0.89</td>
<td align="center">
<bold>0.93</bold>
</td>
</tr>
<tr>
<td align="left">LM-AH</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.90</td>
<td align="center">0.95</td>
<td align="center">
<bold>0.96</bold>
</td>
</tr>
<tr>
<td align="left">LM-B</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.86</td>
<td align="center">0.92</td>
<td align="center">
<bold>0.94</bold>
</td>
</tr>
<tr>
<td align="left">LM-PH</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">
<italic>&#x2713;</italic>
</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">0.88</td>
<td align="center">
<bold>0.91</bold>
</td>
<td align="center">
<bold>0.91</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>&#x2a;<xref ref-type="bibr" rid="B37">Roblot et&#x20;al. (2019)</xref> and <xref ref-type="bibr" rid="B9">Couteaux et&#x20;al. (2019)</xref> detected meniscal tears in 2D slices for AH and PH, but reported overall results&#x20;only.</p>
<p>The best methods for the respective task are highlighted in bold.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>With AUC values consistently higher than 0.90 for DESS MRI data, our approach achieves excellent detection quality for all meniscal sub-regions using uncropped 3D MRI volumes. We also show that our method generalizes well to other MRI sequences, that is, from DESS to IW TSE data. IW TSE data provides a more challenging setting with a higher slice thickness in the mediolateral direction. Moreover, for certain meniscal defects, such as horizontal tears in the meniscal body, a lower resolution along one direction of the acquired MR image significantly reduces the visibility of the features required for an accurate classification. The result could be improved by using an input image with an isotropic resolution. Such an image can be obtained either by upsampling an existing image or, preferably, by acquiring a new image at a higher resolution.</p>
<p>Signal abnormalities are still a challenge. In cases where menisci with tears are to be distinguished from menisci without tears, signal abnormalities are currently regarded as the latter. A fine-grained differentiation between tears and signal abnormalities is likewise a challenge for our method, primarily because of their ambiguous image appearance. Potentially, more training data, as present for the region with the most signal abnormalities, the MM posterior horn, would allow our CNN to better learn to distinguish signal abnormalities from&#x20;tears.</p>
<p>We expected our model to generalize to all meniscal pathologies but observed problems detecting vertical and radial tears. However, these tears were less common in the available training data, and we believe that more data on such cases would enable our method to detect vertical and radial tears with higher accuracy. Furthermore, coronal and axial imaging sequence orientations could provide additional insights (<xref ref-type="bibr" rid="B6">Bien et&#x20;al., 2018</xref>), possibly improving the detection of otherwise barely visible&#x20;tears.</p>
<p>One major limitation is that our method still requires a localization of the menisci during training. However, other segmentation approaches or non-automatic approaches could be applied to obtain bounding boxes, possibly improving results by providing more accurate bounding boxes for training.</p>
</sec>
<sec id="s5">
<title>5 Conclusions and Future Work</title>
<p>We present an efficient and fully automated multi-task learning method that accurately detects meniscal tears on a sub-region level in the MM and LM. Our method yields the best results on sagittal DESS MRI data and generalizes well to sagittal IW TSE data. Further, visual support for clinical detection of meniscal tears is provided by SmoothGrad saliency maps highlighting the regions that mainly contributed to the decision.</p>
<p>Future work could comprise an analysis of anomaly detection (normal vs. signal abnormality vs. torn menisci) or a classification of different types of tears (horizontal, radial, complex, etc.). Since some of these types occur only rarely for specific sub-regions, deep learning-based methods probably require a lot more image data or data generated with generative models. Also, new issues of class imbalances will arise for the classification of tear&#x20;types.</p>
<p>From the method perspective, the choice of an encoder provides opportunities for improvement. For instance, recent self-attention mechanisms, so-called &#x201c;transformer&#x201d; architectures (<xref ref-type="bibr" rid="B46">Vaswani et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B12">Dosovitskiy et&#x20;al., 2020</xref>), are worth investigating. Since transformers typically require a vast amount of training data, they might not necessarily lead to better accuracy, but the self-attention maps (<xref ref-type="bibr" rid="B8">Caron et&#x20;al., 2021</xref>) may offer more meaningful explanatory power than classical methods of saliency mapping. Also, generative adversarial networks have recently been employed for explaining the decisions of CNNs (<xref ref-type="bibr" rid="B23">Katzmann et&#x20;al., 2021</xref>; <xref ref-type="bibr" rid="B39">Shih et&#x20;al., 2021</xref>). As deep learning methods become more precise in localizing meniscal tears, coupled with further sophisticated concepts of explainability, CAD tools will become practical for clinical decision support. In future work, we plan to investigate whether our method better assists physicians in their diagnostic&#x20;tasks.</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: <ext-link ext-link-type="uri" xlink:href="https://nda.nih.gov/oai/">https://nda.nih.gov/oai/</ext-link> for the MR images analyzed in this study as well as the medical image annotations from the NIH OAI archive. The employed segmentation masks of all MM and LM will be made publicly available at <ext-link ext-link-type="uri" xlink:href="https://pubdata.zib.de">https://pubdata.zib.de</ext-link> upon publication of this paper.</p>
</sec>
<sec id="s7">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by the OAI coordinating center and by each OAI clinical site. The patients/participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>AT, AS, DL, and SZ designed the study. AT, AS, and DL implemented the proposed methods. AT, AS, and DL collected the data, performed the statistical evaluation and executed the experiments. SZ obtained the funding resources for this project. AT, AS, DL, and SZ drafted and wrote the manuscript.</p>
</sec>
<sec id="s9">
<title>Funding</title>
<p>The authors gratefully acknowledge the financial support of the German Federal Ministry of Education and Research (BMBF) within the research network on musculoskeletal diseases, grant no. 01EC1408B (Overload/PrevOP). Furthermore, the authors are funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) within research project ZA 592/4-1.</p>
</sec>
<sec sec-type="COI-statement" id="s10">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>The Osteoarthritis Initiative is a public-private partnership comprised of five contracts (N01-AR-2-2258; N01-AR-2-2259; N01-AR-2-2260; N01-AR-2-2261; N01-AR-2-2262) funded by the National Institutes of Health, a branch of the Department of Health and Human Services, and conducted by the OAI Study Investigators. Private funding partners include Merck Research Laboratories; Novartis Pharmaceuticals Corporation, GlaxoSmithKline; and Pfizer, Inc. Private sector funding for the OAI is managed by the Foundation for the National Institutes of Health. This manuscript was prepared using an OAI public use data set and does not necessarily reflect the opinions or views of the OAI investigators, the NIH, or the private funding partners.</p>
</ack>
<sec id="s12">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fbioe.2021.747217/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fbioe.2021.747217/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet2.PDF" id="SM1" mimetype="application/PDF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image1.tiff" id="SM2" mimetype="application/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet1.PDF" id="SM3" mimetype="application/PDF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image2.tiff" id="SM4" mimetype="application/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<fn-group>
<fn id="fn1">
<label>1</label>
<p>
<ext-link ext-link-type="uri" xlink:href="https://nda.nih.gov/oai/">https://nda.nih.gov/oai/</ext-link>
</p>
</fn>
<fn id="fn2">
<label>2</label>
<p>
<ext-link ext-link-type="uri" xlink:href="https://amira.zib.de">https://amira.zib.de</ext-link>
</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Agarap</surname>
<given-names>A. F.</given-names>
</name>
</person-group> (<year>2018</year>). <source>Deep Learning Using Rectified Linear Units (ReLU)</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:1803.08375</publisher-name>. </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ambellan</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Tack</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ehlke</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zachow</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Automated Segmentation of Knee Bone and Cartilage Combining Statistical Shape Knowledge and Convolutional Neural Networks: Data from the Osteoarthritis Initiative</article-title>. <source>Med. Image Anal.</source> <volume>52</volume>, <fpage>109</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1016/j.media.2018.11.009</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Azcona</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>McGuinness</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Smeaton</surname>
<given-names>A. F.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>A Comparative Study of Existing and New Deep Learning Methods for Detecting Knee Injuries Using the MRNet Dataset</article-title>,&#x201d; in <conf-name>Proceedings of the 2020 International Conference on Intelligent Data Science Technologies and Applications (IDSTA)</conf-name>, <conf-loc>Valencia, Spain</conf-loc>, <conf-date>October 2020</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>149</fpage>&#x2013;<lpage>155</lpage>. </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Beaufils</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pujol</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Management of Traumatic Meniscal Tear and Degenerative Meniscal Lesions. Save the Meniscus</article-title>. <source>Orthopaedics Traumatol. Surg. Res.</source> <volume>103</volume>, <fpage>S237</fpage>&#x2013;<lpage>S244</lpage>. <pub-id pub-id-type="doi">10.1016/j.otsr.2017.08.003</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bhattacharyya</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Gale</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Dewire</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Totterman</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Gale</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>McLaughlin</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2003</year>). <article-title>The Clinical Importance of Meniscal Tears Demonstrated by Magnetic Resonance Imaging in Osteoarthritis of the Knee</article-title>. <source>JBJS</source> <volume>85</volume>, <fpage>4</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.2106/00004623-200301000-00002</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bien</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Rajpurkar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Ball</surname>
<given-names>R. L.</given-names>
</name>
<name>
<surname>Irvin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>E.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Deep-learning-assisted Diagnosis for Knee Magnetic Resonance Imaging: Development and Retrospective Validation of MRNet</article-title>. <source>PLoS Med.</source> <volume>15</volume>, <fpage>e1002699</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pmed.1002699</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>C. D.</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>H. T.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Receiver Operating Characteristics Curves and Related Decision Measures: A Tutorial</article-title>. <source>Chemometrics Intell. Lab. Syst.</source> <volume>80</volume>, <fpage>24</fpage>&#x2013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.1016/j.chemolab.2005.05.004</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Caron</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Touvron</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Misra</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>J&#xe9;gou</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Mairal</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bojanowski</surname>
<given-names>P.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <source>Emerging Properties in Self-Supervised Vision Transformers</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:2104.14294</publisher-name>. </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Couteaux</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Si-Mohamed</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Nempont</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Lefevre</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Popoff</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Pizaine</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Automatic Knee Meniscus Tear Detection and Orientation Classification with Mask-RCNN</article-title>. <source>Diagn. Interv. Imaging</source> <volume>100</volume>, <fpage>235</fpage>&#x2013;<lpage>242</lpage>. <pub-id pub-id-type="doi">10.1016/j.diii.2019.03.002</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crawford</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Walley</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Bridgman</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Maffulli</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Magnetic Resonance Imaging versus Arthroscopy in the Diagnosis of Knee Pathology, Concentrating on Meniscal Lesions and ACL Tears: a Systematic Review</article-title>. <source>Br. Med. Bull.</source> <volume>84</volume>, <fpage>5</fpage>&#x2013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1093/bmb/ldm022</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Martel-Pelletier</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pelletier</surname>
<given-names>J.-P.</given-names>
</name>
<name>
<surname>Abram</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Raynauld</surname>
<given-names>J.-P.</given-names>
</name>
<name>
<surname>Cicuttini</surname>
<given-names>F.</given-names>
</name>
<etal/>
</person-group> (<year>2007</year>). <article-title>Meniscal Tear as an Osteoarthritis Risk Factor in a Largely Non-osteoarthritic Cohort: a Cross-Sectional Study</article-title>. <source>J.&#x20;Rheumatol.</source> <volume>34</volume>, <fpage>776</fpage>&#x2013;<lpage>784</lpage>. </citation>
</ref>
<ref id="B12">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Dosovitskiy</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Beyer</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Kolesnikov</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Weissenborn</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zhai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Unterthiner</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <source>An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:2010.11929</publisher-name>. </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Englund</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Roos</surname>
<given-names>E. M.</given-names>
</name>
<name>
<surname>Roos</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Lohmander</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Patient-relevant Outcomes Fourteen Years after Meniscectomy: Influence of Type of Meniscal Tear and Size of Resection</article-title>. <source>Rheumatology</source> <volume>40</volume>, <fpage>631</fpage>&#x2013;<lpage>639</lpage>. <pub-id pub-id-type="doi">10.1093/rheumatology/40.6.631</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fritz</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Marbach</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Civardi</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Fucentese</surname>
<given-names>S. F.</given-names>
</name>
<name>
<surname>Pfirrmann</surname>
<given-names>C. W.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Deep Convolutional Neural Network-Based Detection of Meniscus Tears: Comparison with Radiologists and Surgery as Standard of Reference</article-title>. <source>Skeletal Radiol.</source> <volume>49</volume>, <fpage>1207</fpage>&#x2013;<lpage>1217</lpage>. <pub-id pub-id-type="doi">10.1007/s00256-020-03410-2</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Girshick</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Donahue</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Darrell</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Malik</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). &#x201c;<article-title>Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation</article-title>,&#x201d; in <conf-name>Proceedings of the IEEE conference on computer vision and pattern recognition</conf-name>, <conf-loc>Columbus, OH, USA</conf-loc>, <conf-date>June 2014</conf-date>, <fpage>580</fpage>&#x2013;<lpage>587</lpage>. </citation>
</ref>
<ref id="B16">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Gkioxari</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Doll&#xe1;r</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Girshick</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Mask R-CNN</article-title>,&#x201d; in <conf-name>Proceedings of the IEEE international conference on computer vision</conf-name>, <conf-date>October 2017</conf-date>, <fpage>2961</fpage>&#x2013;<lpage>2969</lpage>. </citation>
</ref>
<ref id="B17">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>Deep Residual Learning for Image Recognition</article-title>,&#x201d; in <conf-name>Proceedings of the IEEE conference on computer vision and pattern recognition</conf-name>, <conf-loc>Las Vegas, NV, USA</conf-loc>, <conf-date>June 2016</conf-date>, <fpage>770</fpage>&#x2013;<lpage>778</lpage>. </citation>
</ref>
<ref id="B18">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification</article-title>,&#x201d; in <conf-name>Proceedings of the IEEE international conference on computer vision</conf-name>, <conf-loc>Santiago, Chile</conf-loc>, <conf-date>December 2015</conf-date>, <fpage>1026</fpage>&#x2013;<lpage>1034</lpage>. </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hunter</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Guermazi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lo</surname>
<given-names>G. H.</given-names>
</name>
<name>
<surname>Grainger</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Conaghan</surname>
<given-names>P. G.</given-names>
</name>
<name>
<surname>Boudreau</surname>
<given-names>R. M.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Evolution of Semi-quantitative Whole Joint Assessment of Knee OA: MOAKS (MRI Osteoarthritis Knee Score)</article-title>. <source>Osteoarthritis and Cartilage</source> <volume>19</volume>, <fpage>990</fpage>&#x2013;<lpage>1002</lpage>. <pub-id pub-id-type="doi">10.1016/j.joca.2011.05.004</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ide</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Kurita</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Improvement of Learning for CNN with ReLU Activation by Sparse Regularization</article-title>,&#x201d; in <conf-name>Proceedings of the International Joint Conference on Neural Networks (IJCNN)</conf-name>, <conf-loc>Anchorage, AK, USA</conf-loc>, <conf-date>May 2017</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>2684</fpage>&#x2013;<lpage>2691</lpage>. <pub-id pub-id-type="doi">10.1109/ijcnn.2017.7966185</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Ioffe</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Szegedy</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift</article-title>,&#x201d; in <conf-name>Proceedings of the International conference on machine learning</conf-name>, <conf-loc>Lille, France</conf-loc>, <conf-date>July 2015</conf-date> (<publisher-name>PMLR</publisher-name>), <fpage>448</fpage>&#x2013;<lpage>456</lpage>. </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Isensee</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Jaeger</surname>
<given-names>P. F.</given-names>
</name>
<name>
<surname>Kohl</surname>
<given-names>S. A.</given-names>
</name>
<name>
<surname>Petersen</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Maier-Hein</surname>
<given-names>K. H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>nnU-Net: a Self-Configuring Method for Deep Learning-Based Biomedical Image Segmentation</article-title>. <source>Nat. Methods</source> <volume>18</volume>, <fpage>203</fpage>&#x2013;<lpage>211</lpage>. <pub-id pub-id-type="doi">10.1038/s41592-020-01008-z</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katzmann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Taubmann</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Ahmad</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>M&#xfc;hlberg</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>S&#xfc;hling</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Gro&#xdf;</surname>
<given-names>H.-M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Explaining Clinical Decision Support Systems in Medical Imaging Using Cycle-Consistent Activation Maximization</article-title>. <source>Neurocomputing</source> <volume>458</volume>, <fpage>141</fpage>&#x2013;<lpage>156</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2021.05.081</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khan</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Evaniew</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Bedi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ayeni</surname>
<given-names>O. R.</given-names>
</name>
<name>
<surname>Bhandari</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Arthroscopic Surgery for Degenerative Tears of the Meniscus: a Systematic Review and Meta-Analysis</article-title>. <source>CMAJ</source> <volume>186</volume>, <fpage>1057</fpage>&#x2013;<lpage>1064</lpage>. <pub-id pub-id-type="doi">10.1503/cmaj.140433</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname>
<given-names>D. P.</given-names>
</name>
<name>
<surname>Ba</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <source>Adam: A Method for Stochastic Optimization</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:1412.6980</publisher-name>. </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kise</surname>
<given-names>N. J.</given-names>
</name>
<name>
<surname>Risberg</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Stensrud</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Ranstam</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Engebretsen</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Roos</surname>
<given-names>E. M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Exercise Therapy versus Arthroscopic Partial Meniscectomy for Degenerative Meniscal Tear in Middle Aged Patients: Randomised Controlled Trial with Two Year Follow-Up</article-title>. <source>BMJ</source> <volume>354</volume>. <pub-id pub-id-type="doi">10.1136/bjsports-2016-i3740rep</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kunze</surname>
<given-names>K. N.</given-names>
</name>
<name>
<surname>Rossi</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>White</surname>
<given-names>G. M.</given-names>
</name>
<name>
<surname>Karhade</surname>
<given-names>A. V.</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Williams</surname>
<given-names>B. T.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Diagnostic Performance of Artificial Intelligence for Detection of Anterior Cruciate Ligament and Meniscus Tears: A Systematic Review</article-title>. <source>Arthrosc. J.&#x20;Arthroscopic Relat. Surg.</source> <volume>37</volume> (<issue>2</issue>), <fpage>771</fpage>&#x2013;<lpage>781</lpage>. <pub-id pub-id-type="doi">10.1016/j.arthro.2020.09.012</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2013</year>). <source>Network in Network</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:1312.4400</publisher-name>. </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Markes</surname>
<given-names>A. R.</given-names>
</name>
<name>
<surname>Hodax</surname>
<given-names>J.&#x20;D.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>C. B.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Meniscus Form and Function</article-title>. <source>Clin. Sports Med.</source> <volume>39</volume>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1016/j.csm.2019.08.007</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedoia</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Norman</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Mehany</surname>
<given-names>S. N.</given-names>
</name>
<name>
<surname>Bucknor</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Link</surname>
<given-names>T. M.</given-names>
</name>
<name>
<surname>Majumdar</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>3D Convolutional Neural Networks for Detection and Severity Staging of Meniscus and PFJ Cartilage Morphological Degenerative Changes in Osteoarthritis and Anterior Cruciate Ligament Subjects</article-title>. <source>J.&#x20;Magn. Reson. Imaging</source> <volume>49</volume>, <fpage>400</fpage>&#x2013;<lpage>410</lpage>. <pub-id pub-id-type="doi">10.1002/jmri.26246</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peterfy</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Gold</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Eckstein</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Cicuttini</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Dardzinski</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>MRI Protocols for Whole-Organ Assessment of the Knee in Osteoarthritis</article-title>. <source>Osteoarthritis and Cartilage</source> <volume>14</volume>, <fpage>95</fpage>&#x2013;<lpage>111</lpage>. <pub-id pub-id-type="doi">10.1016/j.joca.2006.02.029</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rahman</surname>
<given-names>M. M.</given-names>
</name>
<name>
<surname>D&#xfc;rselen</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Seitz</surname>
<given-names>A. M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Automatic Segmentation of Knee Menisci&#x2013;A Systematic Review</article-title>. <source>Artif. Intelligence Med.</source> <volume>105</volume>, <fpage>101849</fpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2020.101849</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Reddy</surname>
<given-names>G. V.</given-names>
</name>
</person-group> (<year>2017</year>). <source>Automatic Classification of 3D MRI Data Using Deep Convolutional Neural Networks</source>. <comment>Master&#x2019;s thesis</comment> (<publisher-loc>Germany</publisher-loc>: <publisher-name>Otto-von-Guericke-Universit&#xe4;t Magdeburg</publisher-name>). </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ren</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Girshick</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks</article-title>. <source>Adv. Neural Inf. Process. Syst.</source> <volume>28</volume>, <fpage>91</fpage>&#x2013;<lpage>99</lpage>. </citation>
</ref>
<ref id="B35">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Rezatofighi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tsoi</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Gwak</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sadeghian</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Savarese</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression</article-title>,&#x201d; in <conf-name>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</conf-name>, <conf-loc>Long Beach, CA, USA</conf-loc>, <conf-date>June 2019</conf-date>, <fpage>658</fpage>&#x2013;<lpage>666</lpage>. </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rizk</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Brat</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zille</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Guillin</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Pouchy</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Adam</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Meniscal Lesion Detection and Characterization in Adult Knee MRI: A Deep Learning Model Approach with External Validation</article-title>. <source>Physica Med.</source> <volume>83</volume>, <fpage>64</fpage>&#x2013;<lpage>71</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejmp.2021.02.010</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roblot</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Giret</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Antoun</surname>
<given-names>M. B.</given-names>
</name>
<name>
<surname>Morillot</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Chassin</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Cotten</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Artificial Intelligence to Diagnose Meniscus Tears on MRI</article-title>. <source>Diagn. Interv. Imaging</source> <volume>100</volume>, <fpage>243</fpage>&#x2013;<lpage>249</lpage>. <pub-id pub-id-type="doi">10.1016/j.diii.2019.02.007</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roemer</surname>
<given-names>F. W.</given-names>
</name>
<name>
<surname>Kwoh</surname>
<given-names>C. K.</given-names>
</name>
<name>
<surname>Hannon</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Hunter</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Eckstein</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Grago</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>Partial Meniscectomy Is Associated with Increased Risk of Incident Radiographic Osteoarthritis and Worsening Cartilage Damage in the Following Year</article-title>. <source>Eur. Radiol.</source> <volume>27</volume>, <fpage>404</fpage>&#x2013;<lpage>413</lpage>. <pub-id pub-id-type="doi">10.1007/s00330-016-4361-z</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Shih</surname>
<given-names>S.-M.</given-names>
</name>
<name>
<surname>Tien</surname>
<given-names>P.-J.</given-names>
</name>
<name>
<surname>Karnin</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2021</year>). &#x201c;<article-title>GANMEX: One-vs-One Attributions Using GAN-based Model Explainability</article-title>,&#x201d; in <conf-name>Proceedings of the International Conference on Machine Learning</conf-name>, <conf-loc>Vienna, Austria on Virtual</conf-loc>, <conf-date>July 2021</conf-date> (<publisher-name>PMLR</publisher-name>), <fpage>9592</fpage>&#x2013;<lpage>9602</lpage>. </citation>
</ref>
<ref id="B40">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Simonyan</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Vedaldi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Zisserman</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2013</year>). <source>Deep inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:1312.6034</publisher-name>. </citation>
</ref>
<ref id="B41">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Smilkov</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Thorat</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Vi&#xe9;gas</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Wattenberg</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2017</year>). <source>SmoothGrad: Removing Noise by Adding Noise</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>arXiv preprint arXiv:1706.03825</publisher-name>. </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Snoeker</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Ishijima</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Kumm</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Turkiewicz</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Englund</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Are Structural Abnormalities on Knee MRI Associated with Osteophyte Development? Data from the Osteoarthritis Initiative</article-title>. <source>Osteoarthritis and Cartilage</source> <volume>S1063-4584</volume> (<issue>21</issue>), <fpage>00841</fpage>&#x2013;<lpage>00844</lpage>. </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tack</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mukhopadhyay</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Zachow</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Knee Menisci Segmentation Using Convolutional Neural Networks: Data from the Osteoarthritis Initiative</article-title>. <source>Osteoarthritis and Cartilage</source> <volume>26</volume>, <fpage>680</fpage>&#x2013;<lpage>688</lpage>. <pub-id pub-id-type="doi">10.1016/j.joca.2018.02.907</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Tack</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Zachow</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Accurate Automated Volumetry of Cartilage of the Knee Using Convolutional Neural Networks: Data from the Osteoarthritis Initiative</article-title>,&#x201d; in <conf-name>Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019)</conf-name>, <conf-loc>Venice, Italy</conf-loc>, <conf-date>April 2019</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>40</fpage>&#x2013;<lpage>43</lpage>. </citation>
</ref>
<ref id="B45">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Tsai</surname>
<given-names>C.-H.</given-names>
</name>
<name>
<surname>Kiryati</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Konen</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Eshed</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>Knee Injury Detection Using MRI with Efficiently-Layered Network (ELNet)</article-title>,&#x201d; in <conf-name>Proceedings of the Medical Imaging with Deep Learning</conf-name>, <conf-date>July, 2020</conf-date>, <conf-loc>Montreal, Canada</conf-loc> (<publisher-name>PMLR</publisher-name>), <fpage>784</fpage>&#x2013;<lpage>794</lpage>. </citation>
</ref>
<ref id="B46">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Vaswani</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Shazeer</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Parmar</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Uszkoreit</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gomez</surname>
<given-names>A. N.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). &#x201c;<article-title>Attention Is All You Need</article-title>,&#x201d; in <conf-name>Advances in Neural Information Processing Systems (NIPS 2017)</conf-name>, <conf-loc>Long Beach, CA, USA</conf-loc>, <conf-date>December 2017</conf-date>, <fpage>5998</fpage>&#x2013;<lpage>6008</lpage>. </citation>
</ref>
<ref id="B47">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Koltun</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Funkhouser</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Dilated Residual Networks</article-title>,&#x201d; in <conf-name>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</conf-name>, <conf-loc>Honolulu, HI, USA</conf-loc>, <conf-date>July 2017</conf-date>, <fpage>472</fpage>&#x2013;<lpage>480</lpage>. </citation>
</ref>
</ref-list>
</back>
</article>