<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Big Data</journal-id>
<journal-title>Frontiers in Big Data</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Big Data</abbrev-journal-title>
<issn pub-type="epub">2624-909X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fdata.2023.1108659</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Big Data</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>NuSegDA: Domain adaptation for nuclei segmentation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Haq</surname> <given-names>Mohammad Minhazul</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2113954/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Ma</surname> <given-names>Hehuan</given-names></name>
</contrib>
<contrib contrib-type="author">
<name><surname>Huang</surname> <given-names>Junzhou</given-names></name>
</contrib>
</contrib-group>
<aff><institution>Department of Computer Science and Engineering, University of Texas at Arlington</institution>, <addr-line>Arlington, TX</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Hua Wang, Colorado School of Mines, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Haoteng Tang, University of Pittsburgh, United States; Iman Yi Liao, University of Nottingham Malaysia Campus, Malaysia</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Mohammad Minhazul Haq <email>mohammadminhazu.haq&#x00040;mavs.uta.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Medicine and Public Health, a section of the journal Frontiers in Big Data</p></fn></author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>03</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>6</volume>
<elocation-id>1108659</elocation-id>
<history>
<date date-type="received">
<day>26</day>
<month>11</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>02</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Haq, Ma and Huang.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Haq, Ma and Huang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract>
<p>The accurate segmentation of nuclei is crucial for cancer diagnosis and further clinical treatments. To successfully train a nuclei segmentation network in a fully-supervised manner for a particular type of organ or cancer, we need a dataset with ground-truth annotations. However, such well-annotated nuclei segmentation datasets are very rare, and manually labeling an unannotated dataset is an expensive, time-consuming, and tedious process. Consequently, we need a way to train the nuclei segmentation network with an unlabeled dataset. In this paper, we propose a model named NuSegUDA for nuclei segmentation on an unlabeled dataset (target domain). This is achieved by applying an Unsupervised Domain Adaptation (UDA) technique with the help of another labeled dataset (source domain) that may come from a different type of organ, cancer, or source. We apply the UDA technique in both the feature space and the output space. We additionally utilize a reconstruction network and incorporate adversarial learning into it so that the source-domain images can be accurately translated to the target domain for further training of the segmentation network. We validate our proposed NuSegUDA on two public nuclei segmentation datasets, and obtain significant improvement compared with the baseline methods. Extensive experiments also verify the contribution of the newly proposed image reconstruction adversarial loss and target-translated source supervised loss to the performance boost of NuSegUDA. Finally, considering the scenario in which a small number of annotations are available from the target domain, we extend our work and propose NuSegSSDA, a Semi-Supervised Domain Adaptation (SSDA) based approach.</p></abstract>
<kwd-group>
<kwd>nuclei segmentation</kwd>
<kwd>domain adaptation</kwd>
<kwd>Unsupervised Domain Adaptation</kwd>
<kwd>Semi-Supervised Domain Adaptation</kwd>
<kwd>adversarial learning</kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="4"/>
<equation-count count="17"/>
<ref-count count="44"/>
<page-count count="12"/>
<word-count count="8503"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1. Introduction</title>
<p>Nuclei are the fundamental organizational unit of life (Sharma et al., <xref ref-type="bibr" rid="B30">2022</xref>). Nuclei segmentation, a subclass of biomedical image segmentation, is considered an essential task of digital histopathology image analysis (Yang S. et al., <xref ref-type="bibr" rid="B39">2021</xref>; Haq and Huang, <xref ref-type="bibr" rid="B10">2022</xref>). However, accurate nuclei segmentation is quite challenging due to the significant variations in the shape and appearance of nuclei, clustered and overlapped nuclei, blurred nuclei boundaries, inconsistent staining methods, scanning artifacts, etc. (see <xref ref-type="fig" rid="F1">Figure 1</xref>). Also, histopathology images of different organs or cancer types may exhibit different textures, color distributions, morphology, and scales (Xu et al., <xref ref-type="bibr" rid="B37">2017</xref>; Mahmood et al., <xref ref-type="bibr" rid="B23">2019</xref>).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The semantic segmentation of nuclei. In this figure, the input image comes from Triple Negative Breast Cancer (TNBC).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0001.tif"/>
</fig>
<p>The nuclei segmentation problem can be seen as a semantic segmentation problem in which we want to segment the nuclei from their background. <xref ref-type="fig" rid="F1">Figure 1</xref> shows an input image and the corresponding output of semantic segmentation of nuclei. Convolutional Neural Network (CNN) based approaches such as Fully Convolutional Network (FCN) (Long et al., <xref ref-type="bibr" rid="B22">2015</xref>), U-Net (Ronneberger et al., <xref ref-type="bibr" rid="B29">2015</xref>), and UNet&#x0002B;&#x0002B; (Zhou et al., <xref ref-type="bibr" rid="B43">2018</xref>) give very promising results in biomedical image segmentation tasks as well as in nuclei segmentation problems (Sirinukunwattana et al., <xref ref-type="bibr" rid="B31">2016</xref>; Haq and Huang, <xref ref-type="bibr" rid="B10">2022</xref>; Sharma et al., <xref ref-type="bibr" rid="B30">2022</xref>). However, to successfully train these fully-supervised methods, we need at least a small amount of annotated data (i.e., images with their corresponding pixel-level ground-truth labels) (Zeiler and Fergus, <xref ref-type="bibr" rid="B40">2014</xref>; Kumar et al., <xref ref-type="bibr" rid="B19">2017</xref>; Sharma et al., <xref ref-type="bibr" rid="B30">2022</xref>). Unfortunately, such well-annotated datasets, even very small ones, are very rare in the biomedical domain. Moreover, due to the heterogeneity of nuclei, it is even harder to learn good models when annotations and samples are lacking. In addition, the commonly used strategy of first collecting an unannotated histopathology dataset and then performing manual pixel-level labeling with the help of experts is an expensive, time-consuming, and tedious process (Xu et al., <xref ref-type="bibr" rid="B37">2017</xref>; Chen C. et al., <xref ref-type="bibr" rid="B3">2019</xref>; Yang S. et al., <xref ref-type="bibr" rid="B39">2021</xref>).
For example, annotating even a small nuclei segmentation dataset consisting of 50 image patches takes 120&#x02013;130 h of an expert pathologist&#x00027;s time (Hou et al., <xref ref-type="bibr" rid="B12">2019</xref>). Therefore, an urgent question arises: how can we robustly train a deep CNN model for nuclei segmentation without any further need for annotations?</p>
<p>For the nuclei segmentation problem, simply applying Transfer Learning (i.e., training a model on one organ or cancer type and then evaluating it on different organ or cancer types) unfortunately leads to poor performance due to the domain shift problem (Sharma et al., <xref ref-type="bibr" rid="B30">2022</xref>). This domain shift arises from differences in scanners, scanning protocols, tissue types, etc. (Sharma et al., <xref ref-type="bibr" rid="B30">2022</xref>). In this paper, we propose a framework based on Domain Adaptation, a subclass of Transfer Learning, to solve the domain shift problem for nuclei segmentation. We consider the unannotated dataset (i.e., the one for which we want to predict the labels) as the target domain. Then, with the help of another related but different annotated dataset, referred to as the source domain, we apply an adversarial learning (Goodfellow et al., <xref ref-type="bibr" rid="B8">2014</xref>) based domain adaptation technique to the nuclei segmentation problem. Thus, our proposed framework learns from the labeled source domain and adapts to the unlabeled target domain.</p>
<p>In this work, we first propose an Unsupervised Domain Adaptation (UDA) model for nuclei segmentation to close the gap between the annotated source domain and the unlabeled target domain. Unsupervised Domain Adaptation methods are capable of minimizing the labeling cost by utilizing cross-domain data and aligning the distribution shift between labeled source domain data and unlabeled target domain data. We empirically observed that images from different nuclei datasets, collected from different organ or cancer types, look dissimilar, although their corresponding segmentation ground-truth labels are quite similar (see <xref ref-type="fig" rid="F2">Figure 2</xref>). In other words, ground-truth labels for nuclei segmentation are domain-invariant. Based on this observation, we apply domain adaptation in the output space. Thus, with the help of adversarial learning, we train a robust nuclei segmentation network to generate source-domain look-alike outputs for target images. Adversarial learning attempts to align target-domain predictions with source-domain ground truths via discriminator training. In addition to image-level domain adaptation in the output space, we apply domain-invariant class-conditional feature-level domain adaptation in the feature space. However, simply forcing the target-domain distribution toward the source-domain distribution can destroy the latent structural patterns of the target domain, leading to a drop in the model&#x00027;s accuracy. Consequently, we also use a reconstruction network to maximize the correlation between target images and target predictions. Again, a reconstruction network alone cannot perfectly reconstruct original images (i.e., the reconstructed images lack the original texture, style, color distribution, etc.). We therefore incorporate adversarial learning into the reconstruction network, which in turn helps us translate source-domain images to the target domain.
We additionally train our UDA model with these target-translated source images, and observe a significant performance boost. Finally, we extend our UDA framework to a Semi-Supervised Domain Adaptation (SSDA) model for the case in which some annotations are available from the target domain.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Images from different domains look dissimilar while their pixel-level segmentation outputs are similar. In this figure, source domain and target domain images come from Kidney Renal Clear cell carcinoma (KIRC) and Triple Negative Breast Cancer (TNBC), respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0002.tif"/>
</fig>
<p>Through extensive experiments on two nuclei segmentation datasets, we show that our proposed UDA method, NuSegUDA, outperforms a fully-supervised model trained on the source domain and evaluated on the target domain, as well as baseline generic and biomedical UDA segmentation models. The experimental results (see Section 4) also show the impact of training NuSegUDA with the proposed image reconstruction adversarial loss, target-translated source images, and feature-level clustering loss. Furthermore, the accuracy of our SSDA model, NuSegSSDA, is highly competitive with the upper bound of a fully-supervised model trained in the target domain.</p>
<p>The main contributions of this paper are as follows: (1) We propose an adversarial learning based Unsupervised Domain Adaptation (UDA) approach, applied in both the feature space and the output space, to solve the nuclei segmentation problem for unannotated datasets. (2) We incorporate adversarial learning into a reconstruction network to translate source-domain images to the target domain, and train the proposed model with these target-translated source images. (3) Compared to many of the baselines, our proposed method is simple, as it does not depend on any data synthesis or data augmentation. (4) Our proposed UDA framework can be easily extended to Semi-Supervised Domain Adaptation (SSDA) in the scenario where a small portion of the target domain is labeled. (5) Extensive experiments on two datasets demonstrate the superiority of the proposed methods.</p></sec>
<sec id="s2">
<title>2. Related works</title>
<p>In the literature, several domain adaptation models have been proposed for generic image segmentation. Isola et al. (<xref ref-type="bibr" rid="B15">2017</xref>) applied conditional GAN (Mirza and Osindero, <xref ref-type="bibr" rid="B24">2014</xref>) to image-to-image translation problems. CyCADA proposed an Unsupervised Domain Adaptation (UDA) model utilizing both input space and feature space adaptation (Hoffman et al., <xref ref-type="bibr" rid="B11">2017</xref>). A multi-level adversarial network based domain adaptation approach for semantic segmentation was proposed in AdaptSegNet (Tsai et al., <xref ref-type="bibr" rid="B34">2018</xref>). Zhang et al. (<xref ref-type="bibr" rid="B41">2018</xref>) proposed a fully convolutional adaptation network for semantic segmentation. CrDoCo proposed a pixel-wise adversarial domain adaptation algorithm based on a cross-domain consistency loss (Chen Y.-C. et al., <xref ref-type="bibr" rid="B5">2019</xref>). Yang J. et al. (<xref ref-type="bibr" rid="B38">2021</xref>) proposed an adversarial self-supervision UDA model which maximizes agreement between clean samples and their adversarial examples. Toldo et al. (<xref ref-type="bibr" rid="B33">2021</xref>) proposed a feature-clustering based UDA framework that groups features of the same class into tight and well-separated clusters.</p>
<p>Domain adaptation has also been employed in different biomedical image segmentation tasks. A multi-connected domain discriminator based UDA model for brain lesion segmentation was proposed by Kamnitsas et al. (<xref ref-type="bibr" rid="B17">2017</xref>). Dong et al. (<xref ref-type="bibr" rid="B6">2018</xref>) introduced another UDA framework for cardiothoracic ratio estimation through chest organ segmentation. Huo et al. (<xref ref-type="bibr" rid="B13">2018</xref>) proposed an end-to-end CycleGAN (Zhu et al., <xref ref-type="bibr" rid="B44">2017</xref>) based network for whole abdomen MRI to CT image synthesis and CT splenomegaly segmentation. Mahmood et al. (<xref ref-type="bibr" rid="B23">2019</xref>) proposed a nuclei segmentation approach in which a large dataset is generated by data synthesis. Gholami et al. (<xref ref-type="bibr" rid="B7">2019</xref>) proposed a biophysics-based medical image segmentation framework which enriches the training dataset by generating synthetic tumor-bearing MR images. Hou et al. (<xref ref-type="bibr" rid="B12">2019</xref>) also synthesized annotated training data for histopathology image segmentation. Haq and Huang (<xref ref-type="bibr" rid="B9">2020</xref>) utilized adversarial learning in the output space along with a reconstruction network for nuclei segmentation. Xia et al. (<xref ref-type="bibr" rid="B36">2020</xref>) proposed the Uncertainty-aware Multi-view Co-Training (UMCT) framework, which is capable of utilizing large-scale unlabeled data to improve volumetric medical image segmentation. Raju et al. (<xref ref-type="bibr" rid="B28">2020</xref>) proposed a user-guided domain adaptation framework for liver segmentation which uses prediction-based adversarial domain adaptation to model the combined distribution of user interactions and mask predictions.
EndoUDA proposed another UDA-based segmentation model for gastrointestinal endoscopy imaging which comprises a shared encoder and a joint loss function for improved generalization to unseen target domains (Celik et al., <xref ref-type="bibr" rid="B2">2021</xref>). Li et al. (<xref ref-type="bibr" rid="B20">2021</xref>) proposed another GAN (Mirza and Osindero, <xref ref-type="bibr" rid="B24">2014</xref>) based framework for Unsupervised Domain Adaptation of nuclei segmentation which also utilized self-ensembling and a conditional random field (Boykov and Kolmogorov, <xref ref-type="bibr" rid="B1">2004</xref>). Sharma et al. (<xref ref-type="bibr" rid="B30">2022</xref>) proposed a mutual information based UDA method for cross-domain nuclei segmentation.</p>
<p>Several previous approaches (Dong et al., <xref ref-type="bibr" rid="B6">2018</xref>; Tsai et al., <xref ref-type="bibr" rid="B34">2018</xref>; Haq and Huang, <xref ref-type="bibr" rid="B9">2020</xref>; Toldo et al., <xref ref-type="bibr" rid="B33">2021</xref>) employed an Unsupervised Domain Adaptation technique in either the output space or the feature space. In contrast to these approaches, we apply domain adaptation in both the output space and the feature space. Additionally, unlike previous works, we utilize a reconstruction network to ensure that the target domain predictions spatially correspond to the target domain images. Also, several recent works (Huo et al., <xref ref-type="bibr" rid="B13">2018</xref>; Gholami et al., <xref ref-type="bibr" rid="B7">2019</xref>; Hou et al., <xref ref-type="bibr" rid="B12">2019</xref>; Mahmood et al., <xref ref-type="bibr" rid="B23">2019</xref>) applied complicated data synthesis techniques to generate a large training dataset. In contrast, we simply incorporate adversarial learning so that the source domain images can be translated to the target domain for further training.</p></sec>
<sec id="s3">
<title>3. Methodology</title>
<p>In this section, we first describe the problem that we aim to solve. Then, we introduce the details of our proposed Unsupervised Domain Adaptation (UDA) and Semi-Supervised Domain Adaptation (SSDA) frameworks. Finally, we discuss the implementation of the proposed models.</p>
<sec>
<title>3.1. Problem definition</title>
<p>In our nuclei segmentation problem, the input is a nuclei histopathology image patch <italic>X</italic> of size <italic>H</italic>&#x000D7;<italic>W</italic>&#x000D7;3. The input <italic>X</italic> comes from either the source domain or the target domain. Depending on the problem (i.e., unsupervised or semi-supervised) and the domain (i.e., source or target), we may also have the corresponding pixel-wise ground-truth label <italic>Y</italic> of size <italic>H</italic>&#x000D7;<italic>W</italic>&#x000D7;1, which is a binary mask. Then, using the segmentation network, we want to predict the segmentation output &#x00176; of size <italic>H</italic>&#x000D7;<italic>W</italic>&#x000D7;1.</p>
<p>Formally, in the Unsupervised Domain Adaptation (UDA) problem, the source domain consists of <italic>N</italic><sub><italic>s</italic></sub> annotated images {(<italic>X</italic><sub><italic>s</italic></sub>, <italic>Y</italic><sub><italic>s</italic></sub>)}, and the target domain has <italic>N</italic><sub><italic>t</italic></sub> unannotated images {(<italic>X</italic><sub><italic>t</italic></sub>)}. In the Semi-Supervised Domain Adaptation (SSDA) problem, the source domain is the same as in the UDA problem, and we assume that the target domain has <inline-formula><mml:math id="M1"><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> images with annotations <inline-formula><mml:math id="M2"><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M3"><mml:msubsup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> unannotated images <inline-formula><mml:math id="M4"><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>.
In both the UDA and SSDA problems, the source domain data and the target domain data are related, but they come from different distributions (i.e., different organ or cancer types). For both Unsupervised and Semi-Supervised Domain Adaptation, our ultimate goal is to learn nuclei segmentation models that accurately produce the segmentation outputs in the target domain.</p></sec>
<sec>
<title>3.2. Unsupervised Domain Adaptation</title>
<p>We refer to our nuclei segmentation Unsupervised Domain Adaptation (UDA) model as NuSegUDA; the framework is shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. NuSegUDA consists of four modules: Segmentation network (<italic>S</italic>), Reconstruction network (<italic>R</italic>), Prediction Discriminator (<italic>D</italic><sub><italic>P</italic></sub>), and Image Discriminator (<italic>D</italic><sub><italic>I</italic></sub>).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Complete architecture of NuSegUDA. Segmentation network generates segmentation outputs, from which reconstruction network reconstructs input images. Prediction discriminator distinguishes between source domain outputs and target domain outputs. Image discriminator distinguishes between original images and reconstructed images.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0003.tif"/>
</fig>
<sec>
<title>3.2.1. Segmentation network</title>
<p>The segmentation network <italic>S</italic> takes image <italic>X</italic> as the input and produces the segmentation prediction &#x00176; of the same size as the input. Here, <italic>X</italic> can be either the source domain image <italic>X</italic><sub><italic>s</italic></sub>, or the target domain image <italic>X</italic><sub><italic>t</italic></sub>. Hence, the source domain prediction is <inline-formula><mml:math id="M5"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, and the target domain prediction is <inline-formula><mml:math id="M6"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. From the perspective of the GAN (Goodfellow et al., <xref ref-type="bibr" rid="B8">2014</xref>) framework, the segmentation network <italic>S</italic> can be thought of as the generator module.</p>
<p>We train <italic>S</italic> so that the source domain segmentation predictions <inline-formula><mml:math id="M7"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> are similar to the source domain ground-truth labels <italic>Y</italic><sub><italic>s</italic></sub>. Since in Unsupervised Domain Adaptation (UDA) the ground-truth labels are not available for target images, we cannot compute any supervised pixel-level loss for target predictions. In practice, we found that combining a dice-coefficient loss and an entropy minimization loss is more effective than simply using a binary cross-entropy loss for nuclei segmentation tasks. Therefore, we define the segmentation loss <italic>L</italic><sub><italic>seg</italic></sub> as:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M8"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x022C5;</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x022C5;</mml:mo><mml:mover accent="true"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:mover accent="true"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E2"><label>(2)</label><mml:math id="M9"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>W</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo class="qopname">^</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E3"><label>(3)</label><mml:math id="M10"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M11"><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M12"><mml:mover accent="true"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> are the flattened <italic>Y</italic><sub><italic>s</italic></sub> and <inline-formula><mml:math id="M13"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula>, respectively.</p>
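<p>As a minimal illustrative sketch (not the authors&#x00027; implementation), Equations (1)&#x02013;(3) can be written in NumPy as follows. The function names, the smoothing constant eps, and the clipping of predictions before the logarithm are assumptions introduced here for numerical stability; they do not appear in the equations above.</p>
<preformat>
```python
import numpy as np

EPS = 1e-7  # assumed smoothing constant; not part of Equations (1)-(3)


def dice_loss(y_true, y_pred, eps=EPS):
    """Equation (1): one minus the dice coefficient of the flattened maps."""
    y_true = np.asarray(y_true, dtype=np.float64).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=np.float64).reshape(-1)
    overlap = np.sum(y_true * y_pred)
    total = np.sum(y_true) + np.sum(y_pred)
    # eps keeps the ratio defined when both maps are empty (assumption).
    return float(1.0 - (2.0 * overlap + eps) / (total + eps))


def entropy_loss(y_pred, eps=EPS):
    """Equation (2): mean of -p * log(p) over the H x W prediction map."""
    # Clipping avoids log(0); this is an assumption, not part of Equation (2).
    p = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0)
    return float(-np.mean(p * np.log(p)))


def segmentation_loss(y_true, y_pred):
    """Equation (3): sum of the dice and entropy minimization terms."""
    return dice_loss(y_true, y_pred) + entropy_loss(y_pred)


# Toy 4 x 4 x 1 example: a binary mask and a uniform 0.5 prediction.
mask = np.zeros((4, 4, 1))
mask[1:3, 1:3, 0] = 1.0
uncertain = np.full((4, 4, 1), 0.5)
print(segmentation_loss(mask, mask))       # close to zero for a perfect match
print(segmentation_loss(mask, uncertain))  # larger for an uncertain prediction
```
</preformat>
<p>In this sketch the dice term compares the flattened label and prediction, whereas the entropy term depends only on the prediction, so evaluating Equation (2) requires no labels.</p>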
<p>A natural question is why we use a single segmentation network <italic>S</italic> in NuSegUDA although we have two different domains. Since we are looking for nuclei in images from both domains, using multiple segmentation networks would be unusual. Additionally, using two segmentation networks would increase the number of learnable parameters, which in turn would slow down the training process. Therefore, a single segmentation network helps to avoid memory issues and training latency in NuSegUDA.</p>
<p>Training the segmentation network <italic>S</italic> with only the annotated source data teaches <italic>S</italic> to make accurate predictions for source images. However, this segmentation network may generate incorrect outputs for target images as there are visual discrepancies between source images and target images (see <xref ref-type="fig" rid="F2">Figure 2</xref>). This visual gap between domains causes the domain shift problem. According to our aforementioned observation that nuclei segmentation outputs are domain-invariant, we require <italic>S</italic> to produce target domain predictions as much as close to the source domain predictions. In other words, we want to make the distribution of target predictions <inline-formula><mml:math id="M14"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> closer to the distribution of source predictions <inline-formula><mml:math id="M15"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula>. For this reason, we utilize Prediction Discriminator <italic>D</italic><sub><italic>P</italic></sub> in NuSegUDA, and we define the prediction adversarial loss as:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>h</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x00176;</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x00176;<sub><italic>t</italic></sub> &#x0003D; <italic>S</italic>(<italic>X</italic><sub><italic>t</italic></sub>), and <italic>H</italic><sub><italic>p</italic></sub> and <italic>W</italic><sub><italic>p</italic></sub> are the height and width of the prediction discriminator output <italic>D</italic><sub><italic>P</italic></sub>(&#x00176;<sub><italic>t</italic></sub>). The details of the Prediction Discriminator <italic>D</italic><sub><italic>P</italic></sub> are discussed in Section 3.2.3.</p>
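<p>A minimal sketch of Equation (4) follows, assuming that the discriminator output is a two-dimensional map of probabilities in (0, 1) in which larger values mean that the input looks more like a source domain prediction. The function name, the toy values, and the constant EPS are hypothetical and serve only as illustration.</p>
<preformat>
```python
import math

EPS = 1e-7  # assumed constant to keep log() finite


def adversarial_loss(disc_out):
    """-1/(H_p * W_p) * sum of log D_P(.) over the discriminator output map."""
    values = [v for row in disc_out for v in row]
    return -sum(math.log(v + EPS) for v in values) / len(values)


# Toy 2 x 2 discriminator outputs for a target prediction; each entry is
# read here as the probability that the input looks like a source prediction.
fooled = adversarial_loss([[0.9, 0.8], [0.95, 0.9]])
not_fooled = adversarial_loss([[0.1, 0.2], [0.05, 0.1]])
```
</preformat>
<p>Under these assumptions, the loss is small when the discriminator is fooled and large when it is not, so minimizing it with respect to <italic>S</italic> pushes target predictions toward outputs that the discriminator accepts as source-like.</p>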
<p>The prediction adversarial loss in Equation (4) trains <italic>S</italic> to fool the prediction discriminator so that it treats <inline-formula><mml:math id="M17"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> as source domain segmentation outputs. The segmentation loss and the prediction adversarial loss jointly guide <italic>S</italic> to generate target domain predictions &#x00176;<sub><italic>t</italic></sub> that look similar to source domain ground truths.</p></sec>
<sec>
<title>3.2.2. Reconstruction network</title>
<p>As mentioned earlier, we want the segmentation network <italic>S</italic> to produce domain-invariant predictions for both domains; that is, the target domain predictions should become similar to the source domain predictions. However, it is highly probable that the target predictions obtained in this way are not well correlated with the corresponding target input images. Being able to reconstruct, from the predictions, images whose visual appearance matches the input images ensures that the segmentation output remains correlated with the input image.</p>
<p>To ensure that our target domain predictions spatially correspond to the target domain images, NuSegUDA uses a reconstruction network <italic>R</italic>. In a similar way to Xia and Kulis (<xref ref-type="bibr" rid="B35">2017</xref>), we consider the segmentation network <italic>S</italic> and the reconstruction network <italic>R</italic> as an encoder and a decoder, respectively. <italic>R</italic> reconstructs target images from the corresponding predictions. Thus, <italic>S</italic> and <italic>R</italic> together work as an autoencoder.</p>
<p>Using our reconstruction network <italic>R</italic>, we first reconstruct target input images <italic>X</italic><sub><italic>t</italic></sub> from &#x00176;<sub><italic>t</italic></sub>. Then, we calculate the reconstruction loss as:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M19"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>W</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x00176;</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>R</italic>(&#x00176;<sub><italic>t</italic></sub>) is the output of the reconstruction network for &#x00176;<sub><italic>t</italic></sub>, and <italic>C</italic> is the number of channels of the input image <italic>X</italic><sub><italic>t</italic></sub>.</p>
<p>Although the reconstruction loss above lets us reconstruct the target domain images from their predictions, the reconstructed images may have very different textures and styles (for both nuclei and background) from the original images (see <xref ref-type="fig" rid="F4">Figure 4</xref>). The reason is that the pixel-wise reconstruction loss <italic>L</italic><sub><italic>recons</italic></sub> (in Equation 5) cannot capture the overall pixel distribution of target domain images. To solve this issue, in addition to <italic>L</italic><sub><italic>recons</italic></sub>, we also utilize an Image Discriminator <italic>D</italic><sub><italic>I</italic></sub> to distinguish the original images from the reconstructed images. To train <italic>R</italic> and <italic>S</italic> to generate reconstructed images that resemble the originals, we define the image reconstruction adversarial loss as:</p>
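<p>The reconstruction loss in Equation (5) is a mean squared error over all pixels and channels. The following sketch, which assumes images stored as nested lists indexed as [h][w][c] and uses a hypothetical function name, computes it for a toy example.</p>
<preformat>
```python
def reconstruction_loss(image, recon):
    """Mean squared error over all H x W x C entries of nested lists [h][w][c]."""
    total, count = 0.0, 0
    for row_x, row_r in zip(image, recon):
        for pixel_x, pixel_r in zip(row_x, row_r):
            for x, r in zip(pixel_x, pixel_r):
                total += (x - r) ** 2
                count += 1
    return total / count


# Toy 1 x 2 RGB image and two candidate reconstructions.
x_t = [[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]]
exact = reconstruction_loss(x_t, x_t)
shifted = reconstruction_loss(x_t, [[[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]])
```
</preformat>
<p>Here the exact reconstruction has zero loss, whereas the uniformly gray reconstruction is off by 0.5 at every entry and therefore has a loss of 0.25.</p>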
<disp-formula id="E6"><label>(6)</label><mml:math id="M20"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>h</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo class="qopname">&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Visualization of the target-translated source domain images <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub> which are also the same as reconstructed source images <inline-formula><mml:math id="M18"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula>. <bold>(A&#x02013;C)</bold> and <bold>(D&#x02013;F)</bold> are chosen from Kidney Renal Clear cell carcinoma (KIRC) domain, and Triple Negative Breast Cancer (TNBC) domain, respectively. In <bold>(C)</bold> and <bold>(F)</bold>, we see that KIRC domain image is translated (i.e., reconstructed) into TNBC domain styles, and vice versa, respectively. In <bold>(B, C)</bold> and <bold>(E, F)</bold>, <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub> w/o <italic>L</italic><sub><italic>advI</italic></sub> and <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub> w/ <italic>L</italic><sub><italic>advI</italic></sub> refer to the translated image when NuSegUDA is trained without and with image reconstruction adversarial loss <italic>L</italic><sub><italic>advI</italic></sub>, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0004.tif"/>
</fig>
<p>where <inline-formula><mml:math id="M21"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x00176;</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, and <italic>H</italic><sub><italic>i</italic></sub> and <italic>W</italic><sub><italic>i</italic></sub> are the height and width of the image discriminator output <inline-formula><mml:math id="M22"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. This adversarial loss <italic>L</italic><sub><italic>advI</italic></sub> trains <italic>R</italic> and <italic>S</italic> to reconstruct target domain images whose distribution (in terms of texture, style, color distribution, etc.) is similar to that of the original target domain images.</p>
<p>In NuSegUDA, <italic>L</italic><sub><italic>advP</italic></sub> helps the segmentation network <italic>S</italic> to generate target predictions <inline-formula><mml:math id="M23"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> that are similar to the source predictions <inline-formula><mml:math id="M24"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula>. Moreover, due to <italic>L</italic><sub><italic>advI</italic></sub>, the reconstruction network <italic>R</italic> learns to reconstruct target images (i.e., <inline-formula><mml:math id="M25"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula>) that are very similar to the original target images in terms of texture, style, color distribution, etc. In other words, <italic>S</italic> maps images from both domains (i.e., <italic>X</italic><sub><italic>s</italic></sub> and <italic>X</italic><sub><italic>t</italic></sub>) to a common prediction subspace <inline-formula><mml:math id="M26"><mml:msubsup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and from <inline-formula><mml:math id="M27"><mml:msubsup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> <italic>R</italic> reconstructs the images in the target domain. Therefore, using <italic>S</italic> and <italic>R</italic> we can translate source domain images <italic>X</italic><sub><italic>s</italic></sub> to the target domain; the target-translated source domain images are <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub> &#x0003D; <italic>R</italic>(<italic>S</italic>(<italic>X</italic><sub><italic>s</italic></sub>)). <xref ref-type="fig" rid="F4">Figure 4</xref> visualizes the impact of the image reconstruction adversarial loss <italic>L</italic><sub><italic>advI</italic></sub> on <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub>. Finally, we train the segmentation network <italic>S</italic> with {(<italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub>, <italic>Y</italic><sub><italic>s</italic></sub>)} using the following loss <italic>L</italic><sub><italic>trans</italic></sub>, which is a combination of dice-coefficient loss and entropy minimization loss:</p>
<disp-formula id="E7"><label>(7)</label><mml:math id="M28"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mo>.</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mo>.</mml:mo><mml:mover accent="true"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:mover accent="true"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E8"><label>(8)</label><mml:math id="M29"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>W</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>h</mml:mi><mml:mo>,</mml:mo><mml:mi>w</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo class="qopname">&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E9"><label>(9)</label><mml:math id="M30"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M31"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M32"><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M33"><mml:mover accent="true"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula> are the flattened <italic>Y</italic><sub><italic>s</italic></sub> and <inline-formula><mml:math id="M34"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula>, respectively.</p></sec>
<sec>
<title>3.2.3. Discriminators</title>
<p>We utilize two discriminators in NuSegUDA: a Prediction Discriminator (<italic>D</italic><sub><italic>P</italic></sub>) and an Image Discriminator (<italic>D</italic><sub><italic>I</italic></sub>). The Prediction Discriminator distinguishes between source domain outputs and target domain outputs, whereas the Image Discriminator distinguishes between original images and reconstructed images. We discuss both discriminators in detail below.</p>
<p><bold>Prediction discriminator</bold> As our goal is to generate similar predictions for both source and target images, we incorporate the prediction discriminator <italic>D</italic><sub><italic>P</italic></sub> in NuSegUDA. This discriminator takes a source domain prediction or a target domain prediction as input and determines whether the input comes from the source domain or the target domain. To train <italic>D</italic><sub><italic>P</italic></sub>, we use the following cross-entropy loss:</p>
<disp-formula id="E10"><label>(10)</label><mml:math id="M35"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>)</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:munder><mml:mrow><mml:msub><mml:mi>z</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo>.</mml:mo><mml:mi>log</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>D</mml:mi><mml:mi>P</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>z</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo><mml:mi>log</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>D</mml:mi><mml:mi>P</mml:mi></mml:msub><mml:mo 
stretchy='false'>(</mml:mo><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>z</italic><sub><italic>p</italic></sub>=0 when <italic>D</italic><sub><italic>P</italic></sub> takes a target domain prediction as its input, and <italic>z</italic><sub><italic>p</italic></sub>=1 when the input is a source domain prediction.</p>
<p><bold>Image discriminator</bold> We use the image discriminator <italic>D</italic><sub><italic>I</italic></sub> in NuSegUDA so that the distribution of reconstructed images becomes similar to that of the original images. The input of <italic>D</italic><sub><italic>I</italic></sub> is either an original target image or a reconstructed target image, and <italic>D</italic><sub><italic>I</italic></sub> determines whether the input is original or reconstructed. Similar to <italic>D</italic><sub><italic>P</italic></sub>, we use the following cross-entropy loss to train <italic>D</italic><sub><italic>I</italic></sub>:</p>
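<p>Equation (10) can be read as a binary cross-entropy averaged over the discriminator output map. The sketch below assumes that the sum and the leading minus sign apply to both terms (the usual cross-entropy form) and that the output map holds probabilities in (0, 1). The function name, the toy values, and the constant EPS are hypothetical.</p>
<preformat>
```python
import math

EPS = 1e-7  # assumed constant to keep log() finite


def discriminator_loss(disc_out, z):
    """Binary cross-entropy averaged over the discriminator output map.

    z = 1 marks a source domain prediction, z = 0 a target domain prediction.
    """
    values = [v for row in disc_out for v in row]
    total = sum(
        z * math.log(v + EPS) + (1 - z) * math.log(1.0 - v + EPS)
        for v in values
    )
    return -total / len(values)


# The same confident "source" output is cheap when the input really is a
# source prediction (z = 1) and expensive when it is a target one (z = 0).
d_out = [[0.9, 0.9], [0.9, 0.9]]
loss_if_source = discriminator_loss(d_out, z=1)
loss_if_target = discriminator_loss(d_out, z=0)
```
</preformat>
<p>Minimizing this quantity therefore drives the discriminator output toward 1 for source predictions and toward 0 for target predictions, which is the opposite of what the adversarial loss in Equation (4) asks of <italic>S</italic>.</p>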
<disp-formula id="E11"><label>(11)</label><mml:math id="M37"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>h</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>z</italic><sub><italic>i</italic></sub>=0 when <italic>D</italic><sub><italic>I</italic></sub> takes a reconstructed target image <inline-formula><mml:math id="M38"><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula> as its input, and <italic>z</italic><sub><italic>i</italic></sub>=1 when the input is an original target image <italic>X</italic><sub><italic>t</italic></sub>.</p></sec>
<sec>
<title>3.2.4. Feature-level adaptation</title>
<p>In addition to the image-level domain adaptation at the outputs, we also apply feature-level domain adaptation in NuSegUDA to reduce the domain gap in the feature space. We assume that our segmentation network <italic>S</italic> is composed of an encoder <italic>S</italic><sub><italic>E</italic></sub> followed by a decoder <italic>S</italic><sub><italic>D</italic></sub> [i.e., <italic>S</italic>(<italic>X</italic>) &#x0003D; <italic>S</italic><sub><italic>D</italic></sub>(<italic>S</italic><sub><italic>E</italic></sub>(<italic>X</italic>))]. Here, the encoder <italic>S</italic><sub><italic>E</italic></sub> works as a feature extractor. Due to the discrepancy of input statistics across domains, there is also a shift of the feature distribution in the feature space spanned by <italic>S</italic><sub><italic>E</italic></sub>. Similar to Toldo et al. (<xref ref-type="bibr" rid="B33">2021</xref>), we utilize a clustering loss at the feature level as a constraint toward class-conditional feature alignment between domains.</p>
<p>Given source image <italic>X</italic><sub><italic>s</italic></sub> and target image <italic>X</italic><sub><italic>t</italic></sub>, we first extract the features <italic>F</italic><sub><italic>s</italic></sub> &#x0003D; <italic>S</italic><sub><italic>E</italic></sub>(<italic>X</italic><sub><italic>s</italic></sub>) and <italic>F</italic><sub><italic>t</italic></sub> &#x0003D; <italic>S</italic><sub><italic>E</italic></sub>(<italic>X</italic><sub><italic>t</italic></sub>). Then, the clustering loss is computed as:</p>
<disp-formula id="E12"><label>(12)</label><mml:math id="M39"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo>&#x02223;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02223;</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder><mml:mrow><mml:mi>d</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo 
stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02212;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo>&#x02223;</mml:mo><mml:mi>C</mml:mi><mml:mo>&#x02223;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02223;</mml:mo><mml:mi>C</mml:mi><mml:mo>&#x02223;</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x02260;</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mi>d</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>c</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle></mml:mrow></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>f</italic><sub><italic>i</italic></sub> is the feature vector corresponding to a spatial location of <italic>F</italic><sub><italic>s</italic></sub> or <italic>F</italic><sub><italic>t</italic></sub>, &#x00177;<sub><italic>i</italic></sub> is the corresponding predicted class, and <italic>C</italic> is the set of semantic classes, which is {0, 1} for our nuclei segmentation problem. To compute &#x00177;<sub><italic>i</italic></sub>, the segmentation prediction &#x00176; is downsampled to match the spatial dimension of <italic>F</italic>. We set the distance function <italic>d</italic>(.) to the L1 norm. In Equation (12), <italic>c</italic><sub><italic>j</italic></sub> denotes the centroid of semantic class <italic>j</italic>, which is computed using the following formula:</p>
<disp-formula id="E13"><label>(13)</label><mml:math id="M40"><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>&#x003B4;</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>&#x003B4;</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>y</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mo stretchy='false'>&#x0007B;</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>&#x0007D;</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where &#x003B4;<sub><italic>j</italic>,<sub>&#x00177;</sub><sub><italic>i</italic></sub></sub> is equal to 1 if &#x00177;<sub><italic>i</italic></sub> &#x0003D; <italic>j</italic>, and to 0 otherwise.</p>
<p>In Equation (12), the clustering loss is composed of two terms: the first term measures how close the features are to their respective centroids, and the second term measures how far the semantic class centroids are from each other. Minimizing the first term tightens the feature vectors of the same class, whether they come from the same domain or from different domains, around the corresponding class centroid. Because the second term enters the loss with a negative sign, minimizing the loss increases the distance between centroids, which acts as a repulsive force that moves the features of different classes apart.</p>
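<p>The two terms can be illustrated with a minimal Python sketch of Equations (12) and (13). It is not the training implementation: the function names, the plain-list representation of feature vectors, the toy data, and the decision to skip a class that receives no predictions are all assumptions made for illustration.</p>
<preformat>
```python
# Minimal sketch of Equations (12)-(13); names and toy data are illustrative.


def l1(a, b):
    """L1 distance between two equal-length vectors (the choice of d in the text)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroids(features, labels, classes=(0, 1)):
    """Equation (13): mean feature vector of each predicted class."""
    result = {}
    for j in classes:
        members = [f for f, y in zip(features, labels) if y == j]
        if not members:
            continue  # assumption: a class with no predicted locations is skipped
        dim = len(members[0])
        result[j] = [sum(f[k] for f in members) / len(members) for k in range(dim)]
    return result


def clustering_loss(features, labels, classes=(0, 1)):
    """Equation (12): mean feature-to-centroid distance minus the mean
    pairwise distance between class centroids."""
    c = centroids(features, labels, classes)
    # First term: pull every feature toward the centroid of its predicted class.
    pull = sum(l1(f, c[y]) for f, y in zip(features, labels)) / len(features)
    # Second term: average distance over ordered pairs of distinct centroids.
    present = list(c)
    n = len(present)
    push = 0.0
    if n > 1:
        push = sum(l1(c[j], c[k]) for j in present for k in present if k != j)
        push /= n * (n - 1)
    return pull - push


# Toy example: four 2-D feature vectors pooled from source and target images.
feats = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
preds = [0, 0, 1, 1]
loss = clustering_loss(feats, preds)
```
</preformat>
<p>In this toy example the centroids are (0, 1) and (4, 1), every feature lies at L1 distance 1 from its centroid, and the two centroids are at distance 4 from each other, so the loss is 1 minus 4. Pulling the features closer to their centroids or pushing the centroids further apart would both lower the value.</p>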
<p>Thus, we minimize the following total loss when training our segmentation network <italic>S</italic> and reconstruction network <italic>R</italic>:</p>
<disp-formula id="E14"><label>(14)</label><mml:math id="M41"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>u</mml:mi><mml:mi>d</mml:mi><mml:mi>a</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo 
stretchy='false'>)</mml:mo><mml:mo>+</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003BB;<sub><italic>advP</italic></sub>, &#x003BB;<sub><italic>recons</italic></sub>, &#x003BB;<sub><italic>advI</italic></sub>, &#x003BB;<sub><italic>trans</italic></sub>, and &#x003BB;<sub><italic>cl</italic></sub> are the weights that balance the corresponding losses.</p></sec></sec>
<sec>
<title>3.3. Semi-Supervised Domain Adaptation</title>
<p>In the Semi-Supervised Domain Adaptation (SSDA) problem, we aim to make the best use of the available target domain annotations <italic>Y</italic><sub><italic>t</italic></sub> when training our segmentation network <italic>S</italic>. For this scenario, we extend the proposed NuSegUDA framework to NuSegSSDA, a nuclei segmentation SSDA model.</p>
<p>In NuSegSSDA, for unannotated target images <inline-formula><mml:math id="M43"><mml:msubsup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> we follow the same steps as in NuSegUDA. However, when we encounter an annotated target sample <inline-formula><mml:math id="M44"><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>l</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:math></inline-formula> during training, we additionally compute the segmentation loss <inline-formula><mml:math id="M45"><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> in a manner similar to Equation (3). Then, when computing the total loss, we incorporate <inline-formula><mml:math id="M46"><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> so that the segmentation network learns to generate predictions closer to the target ground truths. Therefore, Equation (14) is modified as follows:</p>
<disp-formula id="E15"><label>(15)</label><mml:math id="M47"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mi>d</mml:mi><mml:mi>a</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>l</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>l</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi></mml:msubsup><mml:mo 
stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi></mml:msubsup><mml:mo 
stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x02192;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>L</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>l</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>t</mml:mi><mml:mi>u</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
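<p>As a rough illustration of how Equations (14) and (15) combine the individual terms, the following Python sketch forms the weighted sum. The function name, the dictionary keys, and the numeric loss values are hypothetical; the weights are those reported in Section 4.2. Setting the labeled-target segmentation term to zero recovers the UDA objective of Equation (14).</p>
<preformat>
```python
# Sketch of the weighted sums in Equations (14) and (15).
# Weights as reported in Section 4.2; all loss values below are made up.
WEIGHTS = {"advP": 0.001, "recons": 0.01, "advI": 0.001, "trans": 0.001, "cl": 0.002}


def total_loss(seg_source, terms, seg_target_labeled=0.0, weights=WEIGHTS):
    """Return L_seg(X_s) + L_seg(X_t^l) + sum over k of lambda_k * L_k.

    `terms` maps each key of `weights` to the value of the matching loss.
    With seg_target_labeled = 0 this is Equation (14); otherwise Equation (15).
    """
    weighted = sum(weights[name] * terms[name] for name in weights)
    return seg_source + seg_target_labeled + weighted


# Hypothetical per-batch loss values, chosen only to make the arithmetic visible.
example_terms = {"advP": 1.0, "recons": 1.0, "advI": 1.0, "trans": 1.0, "cl": 1.0}
uda = total_loss(0.5, example_terms)
ssda = total_loss(0.5, example_terms, seg_target_labeled=0.25)
```
</preformat>
<p>With these weights the auxiliary terms contribute only a small fraction of the total when all losses have similar magnitude, so the supervised segmentation terms dominate the objective.</p>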
</sec></sec>
<sec id="s4">
<title>4. Experiments</title>
<sec>
<title>4.1. Datasets</title>
<p>In our experiments, we use two public H&#x00026;E stained histopathology datasets with ground-truth annotations for nuclei segmentation. We briefly describe them below.</p>
<p><bold>Dataset-1 (KIRC)</bold> This dataset is taken from Irshad et al. (<xref ref-type="bibr" rid="B14">2014</xref>), in which the images were extracted at 40x magnification from Whole Slide Images (WSIs) of Kidney Renal Clear cell carcinoma (KIRC). The dataset, referred to as KIRC, consists of 486 H&#x00026;E stained histology images of 400 &#x000D7; 400 pixels with annotations made by expert pathologists and research fellows. In our experiments, we randomly split KIRC into 80% for training, 10% for validation, and 10% for testing.</p>
<p><bold>Dataset-2 (TNBC)</bold> Naylor et al. (<xref ref-type="bibr" rid="B25">2018</xref>) generated this dataset by collecting slides from Triple Negative Breast Cancer (TNBC) patients at 40x magnification. It contains a total of 50 H&#x00026;E stained histology images of 512 &#x000D7; 512 pixels, labeled by expert pathologists and research fellows. We split this dataset in the same way as KIRC and refer to it as TNBC in our experiments.</p>
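<p>A random 80/10/10 split of this kind can be sketched as follows. The function name, the fixed seed, and the rounding rule (training and validation sizes are rounded down, and the test set takes the remainder) are assumptions, since the text does not state how fractional counts are handled.</p>
<preformat>
```python
import random


def split_indices(n_images, seed=0, train_pct=80, val_pct=10):
    """Shuffle image indices and cut them into train / validation / test lists.

    Assumption: train and validation sizes are rounded down and the test set
    receives whatever remains.
    """
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = n_images * train_pct // 100
    n_val = n_images * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


kirc_train, kirc_val, kirc_test = split_indices(486)  # KIRC: 486 images
tnbc_train, tnbc_val, tnbc_test = split_indices(50)   # TNBC: 50 images
```
</preformat>
<p>Under this rounding rule the 50 TNBC images divide evenly, whereas the 486 KIRC images do not, so the exact KIRC subset sizes depend on the rounding convention.</p>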
<p><bold>Visual differences between datasets</bold> Although both datasets consist of H&#x00026;E stained histopathology images, they are collected from two different organs, cancer types, and institutions. KIRC images are collected from the TCGA portal (the image acquisition tools are unknown to us), whereas TNBC images were acquired at the Curie Institute using a Philips Ultra Fast Scanner 1.6RA. The differences in organ, cancer type, institution, and imaging tools and protocols cause the visual difference between the images of the two datasets. See <xref ref-type="fig" rid="F2">Figure 2</xref>, where the TNBC image looks dimmer than the KIRC image.</p></sec>
<sec>
<title>4.2. Implementations</title>
<p>In our work, we use U-Net (Ronneberger et al., <xref ref-type="bibr" rid="B29">2015</xref>) as both our segmentation network and our reconstruction network. We choose U-Net so that our proposed segmentation framework can be directly applied in other biomedical domains. We prefer U-Net over UNet&#x0002B;&#x0002B; (Zhou et al., <xref ref-type="bibr" rid="B43">2018</xref>) because it has fewer parameters. Following DCGAN (Radford et al., <xref ref-type="bibr" rid="B27">2015</xref>), we design our prediction discriminator and image discriminator with five convolutional layers each. To train NuSegUDA and NuSegSSDA, we follow the training strategy of GAN (Goodfellow et al., <xref ref-type="bibr" rid="B8">2014</xref>). We use the Adam optimizer (Kingma and Ba, <xref ref-type="bibr" rid="B18">2014</xref>) with learning rates of 0.0001, 0.001, 0.001, and 0.001 for the segmentation network, reconstruction network, prediction discriminator, and image discriminator, respectively. We empirically choose 0.001, 0.01, 0.001, 0.001, and 0.002 as &#x003BB;<sub><italic>advP</italic></sub>, &#x003BB;<sub><italic>recons</italic></sub>, &#x003BB;<sub><italic>advI</italic></sub>, &#x003BB;<sub><italic>trans</italic></sub>, and &#x003BB;<sub><italic>cl</italic></sub>, respectively. We implement NuSegUDA and NuSegSSDA in PyTorch (Paszke et al., <xref ref-type="bibr" rid="B26">2019</xref>) and train them on a single GPU. We do not use any data augmentation in our experiments.</p></sec>
<sec>
<title>4.3. Experimental results</title>
<sec>
<title>4.3.1. Unsupervised Domain Adaptation</title>
<p><bold>Experiment-1 (KIRC &#x02192; TNBC)</bold> In our first experiment, we choose KIRC as the source domain and TNBC as the target domain, denoted by KIRC &#x02192; TNBC. We choose U-Net (Ronneberger et al., <xref ref-type="bibr" rid="B29">2015</xref>) as the representative of Convolutional Neural Network (CNN) based approaches. As a fully-supervised segmentation model, U-Net shows how a network performs under direct transfer (i.e., training on KIRC only and then testing on TNBC without any modifications). AdaptSegNet (Tsai et al., <xref ref-type="bibr" rid="B34">2018</xref>) and OrClEmb (Toldo et al., <xref ref-type="bibr" rid="B33">2021</xref>) represent generic Unsupervised Domain Adaptation (UDA) models. DA-ADV (Dong et al., <xref ref-type="bibr" rid="B6">2018</xref>), CellSegUDA (Haq and Huang, <xref ref-type="bibr" rid="B9">2020</xref>), EndoUDA (Celik et al., <xref ref-type="bibr" rid="B2">2021</xref>), SelfEnsemb (Li et al., <xref ref-type="bibr" rid="B20">2021</xref>), and MaNi (Sharma et al., <xref ref-type="bibr" rid="B30">2022</xref>) are chosen as representatives of UDA models for biomedical image segmentation.</p>
<p>From <xref ref-type="table" rid="T1">Table 1</xref>, we see that the source-trained U-Net gives the lower bound of experimental performance (see first row of <xref ref-type="table" rid="T1">Table 1</xref>). This happens because of the visual domain gap between the source training images and the target test images, also known as the domain shift problem. Our proposed UDA model NuSegUDA outperforms all UDA baseline models in terms of IoU%, Dice score, and Hausdorff distance. Specifically, the IoU% of NuSegUDA is 1.28 and 0.42 points higher than that of the best generic UDA baseline OrClEmb and the best biomedical UDA baseline MaNi, respectively. <xref ref-type="fig" rid="F5">Figure 5</xref> shows the visualization results of CellSegUDA, SelfEnsemb, MaNi, and NuSegUDA. In <xref ref-type="table" rid="T1">Table 1</xref>, the second to last row [i.e., U-Net (target-trained)] shows the upper bound of experimental performance (i.e., training U-Net with TNBC-train and testing it on TNBC-test).</p>
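<p>For readers unfamiliar with the overlap metrics, the following Python sketch computes IoU% and the Dice score for one pair of binary masks from the counts of true positive, false positive, and false negative pixels. The function name and the toy masks are hypothetical, and treating two empty masks as perfect agreement is an assumption; the sketch does not reproduce the evaluation protocol behind the reported numbers (for example, how scores are aggregated over test images).</p>
<preformat>
```python
def iou_and_dice(pred, truth):
    """Return (IoU in percent, Dice score) for two equal-shaped 2-D 0/1 masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # nucleus pixel predicted as nucleus
            elif p == 1:
                fp += 1  # background pixel predicted as nucleus
            elif t == 1:
                fn += 1  # nucleus pixel that was missed
    union = tp + fp + fn
    if union == 0:
        return 100.0, 1.0  # assumption: two empty masks agree perfectly
    iou_pct = 100.0 * tp / union
    dice = 2.0 * tp / (2 * tp + fp + fn)
    return iou_pct, dice


# Toy 2 x 3 masks: 2 true positives, 1 false positive, 1 false negative.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
iou_pct, dice = iou_and_dice(pred_mask, true_mask)
```
</preformat>
<p>Here the IoU is 2 / 4 and the Dice score is 4 / 6. Both metrics increase with overlap, whereas the Hausdorff distance measures boundary disagreement, so lower values are better.</p>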
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Unsupervised Domain Adaptation (UDA) results for Experiment-1 and Experiment-2.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919497">
<th/>
<th valign="top" align="left" colspan="3"><bold>Experiment-1</bold></th>
<th valign="top" align="left" colspan="3"><bold>Experiment-2</bold></th>
</tr>
<tr style="background-color:#919497">
<th/>
<th valign="top" align="left" colspan="3"><bold>KIRC</bold>&#x02192;<bold>TNBC</bold></th>
<th valign="top" align="left" colspan="3"><bold>TNBC</bold>&#x02192;<bold>KIRC</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#919497">
<td valign="top" align="left"><bold>Method</bold></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice score</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice score</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
</tr> <tr>
<td valign="top" align="left">U-Net (source-trained)</td>
<td valign="top" align="left">52.66</td>
<td valign="top" align="left">0.6875</td>
<td valign="top" align="left">10.1214</td>
<td valign="top" align="left">54.82</td>
<td valign="top" align="left">0.7056</td>
<td valign="top" align="left">9.2487</td>
</tr> <tr>
<td valign="top" align="left">DA-ADV</td>
<td valign="top" align="left">54.93</td>
<td valign="top" align="left">0.7079</td>
<td valign="top" align="left">9.6531</td>
<td valign="top" align="left">55.43</td>
<td valign="top" align="left">0.7107</td>
<td valign="top" align="left">9.0142</td>
</tr> <tr>
<td valign="top" align="left">AdaptSegNet</td>
<td valign="top" align="left">56.49</td>
<td valign="top" align="left">0.7198</td>
<td valign="top" align="left">9.1512</td>
<td valign="top" align="left">56.87</td>
<td valign="top" align="left">0.7235</td>
<td valign="top" align="left">8.3477</td>
</tr> <tr>
<td valign="top" align="left">CellSegUDA</td>
<td valign="top" align="left">59.02</td>
<td valign="top" align="left">0.7394</td>
<td valign="top" align="left">8.5653</td>
<td valign="top" align="left">57.09</td>
<td valign="top" align="left">0.7242</td>
<td valign="top" align="left">8.1739</td>
</tr> <tr>
<td valign="top" align="left">OrClEmb</td>
<td valign="top" align="left">59.23</td>
<td valign="top" align="left">0.7402</td>
<td valign="top" align="left">8.5564</td>
<td valign="top" align="left">57.05</td>
<td valign="top" align="left">0.7236</td>
<td valign="top" align="left">8.1923</td>
</tr> <tr>
<td valign="top" align="left">EndoUDA</td>
<td valign="top" align="left">59.81</td>
<td valign="top" align="left">0.7445</td>
<td valign="top" align="left">8.3317</td>
<td valign="top" align="left">57.39</td>
<td valign="top" align="left">0.7277</td>
<td valign="top" align="left">8.1254</td>
</tr> <tr>
<td valign="top" align="left">SelfEnsemb</td>
<td valign="top" align="left">60.02</td>
<td valign="top" align="left">0.7468</td>
<td valign="top" align="left">8.2524</td>
<td valign="top" align="left">57.45</td>
<td valign="top" align="left">0.7292</td>
<td valign="top" align="left">8.1121</td>
</tr> <tr>
<td valign="top" align="left">MaNi</td>
<td valign="top" align="left">60.09</td>
<td valign="top" align="left">0.7477</td>
<td valign="top" align="left">8.2746</td>
<td valign="top" align="left">57.48</td>
<td valign="top" align="left">0.7293</td>
<td valign="top" align="left">8.1493</td>
</tr> <tr>
<td valign="top" align="left">U-Net (target-trained)</td>
<td valign="top" align="left">66.57</td>
<td valign="top" align="left">0.7985</td>
<td valign="top" align="left">7.7301</td>
<td valign="top" align="left">62.04</td>
<td valign="top" align="left">0.7621</td>
<td valign="top" align="left">7.6281</td>
</tr> <tr>
<td valign="top" align="left">NuSegUDA (ours)</td>
<td valign="top" align="left">60.51</td>
<td valign="top" align="left">0.7525</td>
<td valign="top" align="left">8.0011</td>
<td valign="top" align="left">57.68</td>
<td valign="top" align="left">0.7303</td>
<td valign="top" align="left">8.0881</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>Unsupervised Domain Adaptation (UDA) results for Experiment-1 and Experiment-2. IoU and HD denote Intersection over Union and Hausdorff distance, respectively. Results are from testing on TNBC-test for Experiment-1 and on KIRC-test for Experiment-2.</p>
</table-wrap-foot>
</table-wrap>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Visualizations of Unsupervised Domain Adaptation (UDA) for KIRC &#x02192; TNBC. <bold>(A)</bold> Target image, <bold>(B)</bold> Ground-truth, <bold>(C)</bold> CellSegUDA, <bold>(D)</bold> SelfEnsemb, <bold>(E)</bold> MaNi, and <bold>(F)</bold> NuSegUDA (ours). In <bold>(C&#x02013;F)</bold>, green pixels, red pixels, and blue pixels indicate the true positives, false positives, and false negatives, respectively. In other words, green and red pixels indicate the predicted nuclei pixels, whereas green and blue pixels indicate the ground-truth nuclei pixels. The histopathology image in <bold>(A)</bold>, which has an average density of nuclei, was chosen so that the reader can see the visual differences without zooming in.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0005.tif"/>
</fig>
<p><bold>Experiment-2 (TNBC &#x02192; KIRC)</bold> We conduct a second experiment in the same way as Experiment-1, with TNBC as the source domain and KIRC as the target domain. NuSegUDA again outperforms all UDA baselines on IoU%, Dice score, and Hausdorff distance (see last three columns of <xref ref-type="table" rid="T1">Table 1</xref>).</p></sec>
<sec>
<title>4.3.2. Semi-Supervised Domain Adaptation</title>
<p><bold>Experiment-1 (KIRC &#x02192; TNBC)</bold> In Experiment-1, we assess our Semi-Supervised Domain Adaptation (SSDA) method NuSegSSDA for KIRC &#x02192; TNBC. <xref ref-type="table" rid="T2">Table 2</xref> shows the experimental performance of NuSegSSDA. For this experiment, the source dataset KIRC is the same as in the UDA experiments. However, we now treat TNBC as partially labeled. We train NuSegSSDA with annotations available for 10%, 25%, 50%, and 75% of the images in TNBC-train. Testing on TNBC-test then gives increasing IoUs and Dice scores and decreasing Hausdorff distances. This happens because more false negative nuclei can be identified and some false positive nuclei can be removed by NuSegSSDA as we train it with more target annotations (see <xref ref-type="fig" rid="F6">Figure 6</xref>). We compare NuSegSSDA with the fully-supervised model U-Net (Ronneberger et al., <xref ref-type="bibr" rid="B29">2015</xref>) and the baseline biomedical SSDA model CellSegSSDA (Haq and Huang, <xref ref-type="bibr" rid="B9">2020</xref>). To train U-Net, we combine the full KIRC dataset with the same 10%, 25%, 50%, and 75% of TNBC-train that we use to train NuSegSSDA. We observe that the accuracy of NuSegSSDA approaches the upper bound as we train with more annotations from the target domain: with 75% of the target annotations, it reaches 65.22 IoU%, only 1.35 below the target-trained U-Net in <xref ref-type="table" rid="T1">Table 1</xref> (66.57 IoU%).</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Semi-Supervised Domain Adaptation (SSDA) results for Experiment-1 and Experiment-2.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919497">
<th/>
<th valign="top" align="left" colspan="3"><bold>Experiment-1</bold></th>
<th valign="top" align="left" colspan="3"><bold>Experiment-2</bold></th>
</tr>
<tr style="background-color:#919497">
<th/>
<th valign="top" align="left" colspan="3"><bold>KIRC</bold>&#x02192;<bold>TNBC</bold></th>
<th valign="top" align="left" colspan="3"><bold>TNBC</bold>&#x02192;<bold>KIRC</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#919497">
<td valign="top" align="left"><bold>Method</bold></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
</tr> <tr>
<td valign="top" align="left">U-Net (source 100% &#x0002B; target 10%)</td>
<td valign="top" align="left">60.74</td>
<td valign="top" align="left">0.7534</td>
<td valign="top" align="left">8.3627</td>
<td valign="top" align="left">56.89</td>
<td valign="top" align="left">0.7194</td>
<td valign="top" align="left">8.5122</td>
</tr> <tr>
<td valign="top" align="left">CellSegSSDA (source 100% &#x0002B; target 10%)</td>
<td valign="top" align="left">60.96</td>
<td valign="top" align="left">0.7557</td>
<td valign="top" align="left">8.3563</td>
<td valign="top" align="left">58.81</td>
<td valign="top" align="left">0.7377</td>
<td valign="top" align="left">7.9817</td>
</tr> <tr>
<td valign="top" align="left">NuSegSSDA (source 100% &#x0002B; target 10%) (ours)</td>
<td valign="top" align="left"><bold>61.12</bold></td>
<td valign="top" align="left"><bold>0.7578</bold></td>
<td valign="top" align="left"><bold>8.3274</bold></td>
<td valign="top" align="left"><bold>58.99</bold></td>
<td valign="top" align="left"><bold>0.7401</bold></td>
<td valign="top" align="left"><bold>7.9629</bold></td>
</tr> <tr>
<td valign="top" align="left">U-Net (source 100% &#x0002B; target 25%)</td>
<td valign="top" align="left">61.67</td>
<td valign="top" align="left">0.7607</td>
<td valign="top" align="left">8.2742</td>
<td valign="top" align="left">59.32</td>
<td valign="top" align="left">0.7405</td>
<td valign="top" align="left">7.9211</td>
</tr> <tr>
<td valign="top" align="left">CellSegSSDA (source 100% &#x0002B; target 25%)</td>
<td valign="top" align="left">62.94</td>
<td valign="top" align="left">0.7710</td>
<td valign="top" align="left">8.0966</td>
<td valign="top" align="left">59.73</td>
<td valign="top" align="left">0.7443</td>
<td valign="top" align="left"><bold>7.8647</bold></td>
</tr> <tr>
<td valign="top" align="left">NuSegSSDA (source 100% &#x0002B; target 25%) (ours)</td>
<td valign="top" align="left"><bold>63.15</bold></td>
<td valign="top" align="left"><bold>0.7732</bold></td>
<td valign="top" align="left"><bold>8.0487</bold></td>
<td valign="top" align="left"><bold>59.79</bold></td>
<td valign="top" align="left"><bold>0.7449</bold></td>
<td valign="top" align="left">7.8752</td>
</tr> <tr>
<td valign="top" align="left">U-Net (source 100% &#x0002B; target 50%)</td>
<td valign="top" align="left">56.73</td>
<td valign="top" align="left">0.7208</td>
<td valign="top" align="left">9.1473</td>
<td valign="top" align="left">59.95</td>
<td valign="top" align="left">0.7464</td>
<td valign="top" align="left">7.8461</td>
</tr> <tr>
<td valign="top" align="left">CellSegSSDA (source 100% &#x0002B; target 50%)</td>
<td valign="top" align="left">63.59</td>
<td valign="top" align="left">0.7748</td>
<td valign="top" align="left">7.9802</td>
<td valign="top" align="left">60.32</td>
<td valign="top" align="left">0.7494</td>
<td valign="top" align="left">7.7958</td>
</tr> <tr>
<td valign="top" align="left">NuSegSSDA (source 100% &#x0002B; target 50%) (ours)</td>
<td valign="top" align="left"><bold>63.97</bold></td>
<td valign="top" align="left"><bold>0.7802</bold></td>
<td valign="top" align="left"><bold>7.9549</bold></td>
<td valign="top" align="left"><bold>60.53</bold></td>
<td valign="top" align="left"><bold>0.7511</bold></td>
<td valign="top" align="left"><bold>7.7754</bold></td>
</tr> <tr>
<td valign="top" align="left">U-Net (source 100% &#x0002B; target 75%)</td>
<td valign="top" align="left">59.06</td>
<td valign="top" align="left">0.7394</td>
<td valign="top" align="left">8.6286</td>
<td valign="top" align="left">61.63</td>
<td valign="top" align="left">0.7592</td>
<td valign="top" align="left">7.7026</td>
</tr> <tr>
<td valign="top" align="left">CellSegSSDA (source 100% &#x0002B; target 75%)</td>
<td valign="top" align="left">64.96</td>
<td valign="top" align="left">0.7862</td>
<td valign="top" align="left">7.8496</td>
<td valign="top" align="left">61.01</td>
<td valign="top" align="left">0.7541</td>
<td valign="top" align="left">7.7275</td>
</tr> <tr>
<td valign="top" align="left">NuSegSSDA (source 100% &#x0002B; target 75%) (ours)</td>
<td valign="top" align="left"><bold>65.22</bold></td>
<td valign="top" align="left"><bold>0.7901</bold></td>
<td valign="top" align="left"><bold>7.7928</bold></td>
<td valign="top" align="left"><bold>61.68</bold></td>
<td valign="top" align="left"><bold>0.7598</bold></td>
<td valign="top" align="left"><bold>7.6872</bold></td>
</tr> <tr>
<td valign="top" align="left">U-Net (target 100%)</td>
<td valign="top" align="left">66.57</td>
<td valign="top" align="left">0.7985</td>
<td valign="top" align="left">7.7301</td>
<td valign="top" align="left">62.04</td>
<td valign="top" align="left">0.7621</td>
<td valign="top" align="left">7.6281</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>IoU, Dice, and HD denote Intersection over Union, Dice score, and Hausdorff Distance, respectively (higher is better for IoU and Dice; lower is better for HD). NuSegSSDA refers to our proposed SSDA model. A setting of source 100% &#x0002B; target <italic>n</italic>% means that all source annotations and <italic>n</italic>% of the annotations in TNBC-train (experiment-1) or KIRC-train (experiment-2) are available for training. Results are from testing on TNBC-test and KIRC-test for experiment-1 and experiment-2, respectively. Bold values denote the best scores among the methods compared at each target percentage <italic>n</italic> (i.e., source 100% &#x0002B; target <italic>n</italic>%).</p>
</table-wrap-foot>
</table-wrap>
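<p>For readers who want to reproduce the three metrics reported in the tables, the following is a minimal NumPy sketch of IoU, Dice score, and a symmetric Hausdorff Distance on binary masks. It is an illustration only and not the evaluation code used in this work: the function names are hypothetical, and the Hausdorff Distance is assumed here to be computed in pixels over all foreground pixels of the whole mask.</p>
<preformat>
```python
import numpy as np


def iou_score(pred, gt):
    """Intersection over Union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # Two empty masks agree perfectly by convention.
    return float(inter / union) if union else 1.0


def dice_score(pred, gt):
    """Dice score: twice the overlap divided by the total foreground size."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return float(2.0 * inter / total) if total else 1.0


def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff Distance (in pixels) between foreground pixel sets.

    Assumption: brute-force pairwise distances, fine for small masks only.
    """
    a = np.argwhere(pred.astype(bool))
    b = np.argwhere(gt.astype(bool))
    if len(a) == 0 and len(b) == 0:
        return 0.0
    if len(a) == 0 or len(b) == 0:
        return float("inf")
    # d[i, j] is the Euclidean distance from pixel a[i] to pixel b[j].
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


# Toy example: the prediction covers the ground truth plus one extra column.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:4] = 1

print(iou_score(pred, gt), dice_score(pred, gt), hausdorff_distance(pred, gt))
```
</preformat>
<p>In this toy case the overlap is 4 pixels, the union is 6 pixels, and the two extra predicted pixels lie one pixel away from the ground truth, which illustrates why a lower HD and higher IoU and Dice values indicate a better segmentation.</p>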
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Visualizations of Semi-Supervised Domain Adaptation (SSDA) for KIRC &#x02192; TNBC. <bold>(A)</bold> Target image, <bold>(B)</bold> Ground-truth, <bold>(C)</bold> NuSegSSDA (10%), <bold>(D)</bold> NuSegSSDA (25%), <bold>(E)</bold> NuSegSSDA (50%), and <bold>(F)</bold> NuSegSSDA (75%). In <bold>(C-F)</bold>, green pixels, red pixels, and blue pixels indicate the true positives, false positives, and false negatives, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0006.tif"/>
</fig>
<p><bold>Experiment-2 (TNBC &#x02192; KIRC)</bold> In our second experiment, we select TNBC as the source domain and KIRC as the target domain. Here, too, NuSegSSDA compares favorably with U-Net (Ronneberger et al., <xref ref-type="bibr" rid="B29">2015</xref>) and CellSegSSDA (Haq and Huang, <xref ref-type="bibr" rid="B9">2020</xref>) (see the last three columns of <xref ref-type="table" rid="T2">Table 2</xref>): it achieves the best IoU and Dice score at every annotation level, and the best Hausdorff Distance at every level except 25%, where CellSegSSDA is marginally better. As in experiment-1, the segmentation accuracy of NuSegSSDA increases as more target images are annotated.</p></sec>
<sec>
<title>4.3.3. Ablation studies</title>
<p>To verify the robustness of the proposed UDA framework, we perform extensive ablation studies on the adaptation of NuSegUDA from KIRC to TNBC, and from TNBC to KIRC. First, we examine the contribution of each loss to the final IoU%, Dice score, and Hausdorff Distance; then, we investigate the effects of different segmentation network backbones on NuSegUDA.</p>
<p><bold>Effectiveness of losses</bold> The contribution of the image adversarial loss <italic>L</italic><sub><italic>advI</italic></sub>, the target-translated source supervised loss <italic>L</italic><sub><italic>trans</italic></sub>, and the clustering loss <italic>L</italic><sub><italic>cl</italic></sub> to our proposed NuSegUDA model is shown in <xref ref-type="table" rid="T3">Table 3</xref>. Adding only <italic>L</italic><sub><italic>advI</italic></sub> to CellSegUDA (Haq and Huang, <xref ref-type="bibr" rid="B9">2020</xref>) yields a slight improvement over CellSegUDA alone, while adding only <italic>L</italic><sub><italic>cl</italic></sub> changes the results only marginally, with small gains on some metrics and small losses on others. However, when we add only the target-translated source supervised loss <italic>L</italic><sub><italic>trans</italic></sub> to CellSegUDA, performance degrades because <italic>L</italic><sub><italic>advI</italic></sub> is absent. Without the image adversarial loss <italic>L</italic><sub><italic>advI</italic></sub>, the target-translated source images <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub> look very different from the target-domain images in terms of texture, style, color distribution, etc. (see <xref ref-type="fig" rid="F4">Figure 4</xref>). As a result, the performance of the model (i.e., CellSegUDA w/ <italic>L</italic><sub><italic>trans</italic></sub>) decreases when it is trained with these <italic>X</italic><sub><italic>s</italic>&#x02192;<italic>t</italic></sub> images.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Impact of the <italic>L</italic><sub><italic>advI</italic></sub>, <italic>L</italic><sub><italic>trans</italic></sub>, and <italic>L</italic><sub><italic>cl</italic></sub> losses on NuSegUDA for Experiment-1 and Experiment-2.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919497">
<th/>
<th/>
<th/>
<th/>
<th valign="top" align="left" colspan="3"><bold>Experiment-1</bold></th>
<th valign="top" align="left" colspan="3"><bold>Experiment-2</bold></th>
</tr>
<tr style="background-color:#919497">
<th/>
<th/>
<th/>
<th/>
<th valign="top" align="left" colspan="3"><bold>KIRC</bold>&#x02192;<bold>TNBC</bold></th>
<th valign="top" align="left" colspan="3"><bold>TNBC</bold>&#x02192;<bold>KIRC</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#919497">
<td valign="top" align="left"><bold>Method</bold></td>
<td valign="top" align="left"><italic>L</italic><sub><italic>advI</italic></sub></td>
<td valign="top" align="left"><italic>L</italic><sub><italic>trans</italic></sub></td>
<td valign="top" align="left"><italic>L</italic><sub><italic>cl</italic></sub></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
</tr> <tr>
<td valign="top" align="left">CellSegUDA</td>
<td/>
<td/>
<td/>
<td valign="top" align="left">59.02</td>
<td valign="top" align="left">0.7394</td>
<td valign="top" align="left">8.5653</td>
<td valign="top" align="left">57.09</td>
<td valign="top" align="left">0.7242</td>
<td valign="top" align="left">8.1739</td>
</tr> <tr>
<td valign="top" align="left">CellSegUDA w/ <italic>L</italic><sub><italic>advI</italic></sub></td>
<td valign="top" align="left">&#x02713;</td>
<td/>
<td/>
<td valign="top" align="left">59.38</td>
<td valign="top" align="left">0.7405</td>
<td valign="top" align="left">8.4316</td>
<td valign="top" align="left">57.17</td>
<td valign="top" align="left">0.7252</td>
<td valign="top" align="left">8.1422</td>
</tr> <tr>
<td valign="top" align="left">CellSegUDA w/ <italic>L</italic><sub><italic>trans</italic></sub></td>
<td/>
<td valign="top" align="left">&#x02713;</td>
<td/>
<td valign="top" align="left">58.44</td>
<td valign="top" align="left">0.7357</td>
<td valign="top" align="left">8.6123</td>
<td valign="top" align="left">56.77</td>
<td valign="top" align="left">0.7209</td>
<td valign="top" align="left">8.3865</td>
</tr> <tr>
<td valign="top" align="left">CellSegUDA w/ <italic>L</italic><sub><italic>cl</italic></sub></td>
<td/>
<td/>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">59.11</td>
<td valign="top" align="left">0.7398</td>
<td valign="top" align="left">8.5734</td>
<td valign="top" align="left">57.02</td>
<td valign="top" align="left">0.7237</td>
<td valign="top" align="left">8.1203</td>
</tr> <tr>
<td valign="top" align="left">NuSegUDA w/o <italic>L</italic><sub><italic>advI</italic></sub></td>
<td/>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">58.59</td>
<td valign="top" align="left">0.7365</td>
<td valign="top" align="left">8.5914</td>
<td valign="top" align="left">56.82</td>
<td valign="top" align="left">0.7212</td>
<td valign="top" align="left">8.3685</td>
</tr> <tr>
<td valign="top" align="left">NuSegUDA w/o <italic>L</italic><sub><italic>trans</italic></sub></td>
<td valign="top" align="left">&#x02713;</td>
<td/>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">59.45</td>
<td valign="top" align="left">0.7411</td>
<td valign="top" align="left">8.4021</td>
<td valign="top" align="left">57.19</td>
<td valign="top" align="left">0.7253</td>
<td valign="top" align="left">8.2468</td>
</tr> <tr>
<td valign="top" align="left">NuSegUDA w/o <italic>L</italic><sub><italic>cl</italic></sub></td>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">&#x02713;</td>
<td/>
<td valign="top" align="left">60.36</td>
<td valign="top" align="left">0.7512</td>
<td valign="top" align="left">8.1963</td>
<td valign="top" align="left">57.63</td>
<td valign="top" align="left">0.7298</td>
<td valign="top" align="left">8.1247</td>
</tr> <tr>
<td valign="top" align="left">NuSegUDA (ours)</td>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">&#x02713;</td>
<td valign="top" align="left">60.51</td>
<td valign="top" align="left">0.7525</td>
<td valign="top" align="left">8.0011</td>
<td valign="top" align="left">57.68</td>
<td valign="top" align="left">0.7303</td>
<td valign="top" align="left">8.0880</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>IoU, Dice, and HD denote Intersection over Union, Dice score, and Hausdorff Distance, respectively. NuSegUDA w/o <italic>L</italic><sub><italic>advI</italic></sub>, NuSegUDA w/o <italic>L</italic><sub><italic>trans</italic></sub>, and NuSegUDA w/o <italic>L</italic><sub><italic>cl</italic></sub> refer to our proposed UDA model without the image adversarial loss, the target-translated source supervised loss, and the clustering loss, respectively. Results are from testing on TNBC-test and KIRC-test for experiment-1 and experiment-2, respectively.</p>
</table-wrap-foot>
</table-wrap>
<p>Similarly, NuSegUDA w/o <italic>L</italic><sub><italic>advI</italic></sub> performs noticeably worse than NuSegUDA (e.g., 58.59 vs. 60.51 IoU% in Experiment-1), because it is trained with less realistic target-translated source images. This again validates the effectiveness of <italic>L</italic><sub><italic>advI</italic></sub> in NuSegUDA. Finally, with all the proposed losses enabled, NuSegUDA achieves the best performance in both experiments, which demonstrates the combined impact of the newly proposed image adversarial loss, target-translated source supervised loss, and clustering loss. <xref ref-type="fig" rid="F7">Figure 7</xref> shows the visualization results of NuSegUDA w/o <italic>L</italic><sub><italic>advI</italic></sub>, NuSegUDA w/o <italic>L</italic><sub><italic>trans</italic></sub>, NuSegUDA w/o <italic>L</italic><sub><italic>cl</italic></sub>, and NuSegUDA.</p>
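<p>The size of each loss's contribution can be read off <xref ref-type="table" rid="T3">Table 3</xref> with simple arithmetic. The short sketch below copies the IoU% values of the four NuSegUDA variants from the table and computes how many IoU points are lost when a single loss is removed. The dictionary layout and the helper name <italic>iou_drop</italic> are illustrative assumptions, not part of our implementation.</p>
<preformat>
```python
# IoU% values copied from Table 3: (Experiment-1 KIRC to TNBC, Experiment-2 TNBC to KIRC).
TABLE3_IOU = {
    "NuSegUDA (full)": (60.51, 57.68),
    "NuSegUDA w/o L_advI": (58.59, 56.82),
    "NuSegUDA w/o L_trans": (59.45, 57.19),
    "NuSegUDA w/o L_cl": (60.36, 57.63),
}


def iou_drop(table, full_key="NuSegUDA (full)"):
    """Return the IoU% lost by each ablated variant relative to the full model."""
    full = table[full_key]
    return {
        name: tuple(round(f - v, 2) for f, v in zip(full, values))
        for name, values in table.items()
        if name != full_key
    }


drops = iou_drop(TABLE3_IOU)
# List the variants from the largest to the smallest Experiment-1 drop.
for name, (d1, d2) in sorted(drops.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: -{d1:.2f} IoU% (KIRC to TNBC), -{d2:.2f} IoU% (TNBC to KIRC)")
```
</preformat>
<p>In both experiments, removing <italic>L</italic><sub><italic>advI</italic></sub> costs the most IoU, followed by <italic>L</italic><sub><italic>trans</italic></sub> and then <italic>L</italic><sub><italic>cl</italic></sub>.</p>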
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Visualizations of the effectiveness of the proposed <italic>L</italic><sub><italic>advI</italic></sub>, <italic>L</italic><sub><italic>trans</italic></sub>, and <italic>L</italic><sub><italic>cl</italic></sub> losses on NuSegUDA for KIRC &#x02192; TNBC. <bold>(A)</bold> Target image, <bold>(B)</bold> Ground-truth, <bold>(C)</bold> NuSegUDA w/o <italic>L</italic><sub><italic>advI</italic></sub>, <bold>(D)</bold> NuSegUDA w/o <italic>L</italic><sub><italic>trans</italic></sub>, <bold>(E)</bold> NuSegUDA w/o <italic>L</italic><sub><italic>cl</italic></sub>, and <bold>(F)</bold> NuSegUDA. In <bold>(C-F)</bold>, green pixels, red pixels, and blue pixels indicate the true positives, false positives, and false negatives, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdata-06-1108659-g0007.tif"/>
</fig>
<p><bold>Impacts of different segmentation networks</bold> In NuSegUDA, we use U-Net (Ronneberger et al., <xref ref-type="bibr" rid="B29">2015</xref>) as the backbone segmentation network. We also assess the model performance by replacing the backbone segmentation network with two other frequently used Convolutional Neural Network (CNN) based approaches: FCN (Long et al., <xref ref-type="bibr" rid="B22">2015</xref>) and UNet&#x0002B;&#x0002B; (Zhou et al., <xref ref-type="bibr" rid="B43">2018</xref>). As mentioned earlier, CNN-based approaches are still the dominant ones for semantic segmentation of nuclei. However, due to the intrinsic locality and limited receptive fields of convolution operations, CNN-based models may be incapable of capturing the global context of the input (Chen et al., <xref ref-type="bibr" rid="B4">2021</xref>; Jia et al., <xref ref-type="bibr" rid="B16">2021</xref>; Zheng et al., <xref ref-type="bibr" rid="B42">2021</xref>). To this end, we explore the feasibility of Transformers, an alternative to CNNs, as the backbone segmentation network in NuSegUDA. Transformers mainly use a self-attention mechanism to extract inherent features (Tay et al., <xref ref-type="bibr" rid="B32">2020</xref>), and owing to this mechanism they are powerful at modeling the global context of an input (Zheng et al., <xref ref-type="bibr" rid="B42">2021</xref>). To examine the effectiveness of a Vision Transformer based model, we replace U-Net in NuSegUDA with TransUNet (Chen et al., <xref ref-type="bibr" rid="B4">2021</xref>), which combines a hybrid CNN-Transformer encoder with a decoder.</p>
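<p>To make the contrast with convolutions concrete, the following is a minimal single-head sketch of scaled dot-product self-attention in NumPy. It is a generic illustration rather than the TransUNet implementation: the token count, embedding size, and randomly initialized projection matrices are assumptions chosen only for the example.</p>
<preformat>
```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has shape (num_tokens, dim); each token could be an image-patch embedding.
    Returns the attended features and the (num_tokens, num_tokens) weights.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
num_tokens, dim = 16, 8  # hypothetical: a 4 x 4 grid of patch embeddings
x = rng.normal(size=(num_tokens, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))

out, attn = self_attention(x, w_q, w_k, w_v)
print(out.shape, attn.shape)
```
</preformat>
<p>Every row of the attention matrix spans all tokens, so each output position can draw on the entire input in a single layer, whereas a convolution only sees the pixels inside its kernel window.</p>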
<p><xref ref-type="table" rid="T4">Table 4</xref> shows the quantitative results of using different segmentation networks in NuSegUDA. Among the CNN-based models, UNet&#x0002B;&#x0002B; achieves the best IoU and Dice score in Experiment-1 (U-Net has the lowest Hausdorff Distance), while U-Net is best on all three metrics in Experiment-2. The Transformer-based TransUNet does not outperform U-Net or UNet&#x0002B;&#x0002B; in either experiment. We attribute this to our small training datasets, since Vision Transformers (VTs) need a lot of training data, usually more than standard CNNs require (Liu et al., <xref ref-type="bibr" rid="B21">2021</xref>).</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Impacts of different segmentation network backbones in NuSegUDA.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919497">
<th valign="top" align="left"><bold>Segmentation network</bold></th>
<th valign="top" align="left" colspan="3"><bold>Experiment-1</bold></th>
<th valign="top" align="left" colspan="3"><bold>Experiment-2</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="left" colspan="3"><bold>KIRC</bold>&#x02192;<bold>TNBC</bold></th>
<th valign="top" align="left" colspan="3"><bold>TNBC</bold>&#x02192;<bold>KIRC</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td/>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice score</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
<td valign="top" align="left"><bold>IoU%</bold></td>
<td valign="top" align="left"><bold>Dice score</bold></td>
<td valign="top" align="left"><bold>HD</bold></td>
</tr> <tr>
<td valign="top" align="left">FCN</td>
<td valign="top" align="left">59.23</td>
<td valign="top" align="left">0.7398</td>
<td valign="top" align="left">8.4125</td>
<td valign="top" align="left">55.81</td>
<td valign="top" align="left">0.7165</td>
<td valign="top" align="left">8.7365</td>
</tr> <tr>
<td valign="top" align="left">U-Net</td>
<td valign="top" align="left">60.51</td>
<td valign="top" align="left">0.7525</td>
<td valign="top" align="left">8.0011</td>
<td valign="top" align="left">57.68</td>
<td valign="top" align="left">0.7303</td>
<td valign="top" align="left">8.0880</td>
</tr> <tr>
<td valign="top" align="left">UNet&#x0002B;&#x0002B;</td>
<td valign="top" align="left">60.57</td>
<td valign="top" align="left">0.7529</td>
<td valign="top" align="left">8.0336</td>
<td valign="top" align="left">57.41</td>
<td valign="top" align="left">0.7282</td>
<td valign="top" align="left">8.1575</td>
</tr> <tr>
<td valign="top" align="left">TransUNet</td>
<td valign="top" align="left">59.87</td>
<td valign="top" align="left">0.7476</td>
<td valign="top" align="left">8.1562</td>
<td valign="top" align="left">57.02</td>
<td valign="top" align="left">0.7256</td>
<td valign="top" align="left">8.1742</td>
</tr></tbody>
</table>
</table-wrap></sec></sec></sec>
<sec id="s5">
<title>5. Conclusion</title>
<p>Accurate nuclei segmentation is a significant step for cancer diagnosis and further clinical procedures. Collecting a fully annotated nuclei segmentation dataset, or manually labeling an unannotated dataset, is expensive, time-consuming, and often impractical, although such annotations are required to train Convolutional Neural Networks in a fully supervised manner. In this work, we propose a novel Unsupervised Domain Adaptation (UDA) framework named NuSegUDA for segmenting nuclei in unannotated datasets by utilizing adversarial learning. In NuSegUDA, we apply domain adaptation in both the feature space and the output space. We also incorporate an image adversarial loss and a target-translated source supervised loss into NuSegUDA, and train the model with target-translated source domain images. Extensive experimental results validate the effectiveness of each of the newly proposed modules and losses, and show that NuSegUDA outperforms the baseline models. Finally, assuming that a few annotations are available in the target domain, we extend our work to Semi-Supervised Domain Adaptation (SSDA). We expect our proposed UDA and SSDA approaches to be useful in other biomedical image segmentation tasks.</p></sec>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.</p></sec>
<sec sec-type="author-contributions" id="s7">
<title>Author contributions</title>
<p>MH, HM, and JH contributed to the methodology and design of the study. MH implemented the proposed method, organized the experimental section, and wrote the first draft of the manuscript. HM and JH corrected the draft. All authors contributed to the article and approved the submitted version.</p></sec>
</body>
<back>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>This work was partially supported by US NSF CAREER Award IIS-1553687.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boykov</surname> <given-names>Y.</given-names></name> <name><surname>Kolmogorov</surname> <given-names>V.</given-names></name></person-group> (<year>2004</year>). <article-title>An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>26</volume>, <fpage>1124</fpage>&#x02013;<lpage>1137</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2004.60</pub-id><pub-id pub-id-type="pmid">15742889</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Celik</surname> <given-names>N.</given-names></name> <name><surname>Ali</surname> <given-names>S.</given-names></name> <name><surname>Gupta</surname> <given-names>S.</given-names></name> <name><surname>Braden</surname> <given-names>B.</given-names></name> <name><surname>Rittscher</surname> <given-names>J.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Endouda: a modality independent segmentation approach for endoscopy imaging,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention - MICCAI 2021. Lecture Notes in Computer Science</source>, Vol. 12903 (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>303</fpage>&#x02013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-87199-4_29</pub-id><pub-id pub-id-type="pmid">36765794</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>C.</given-names></name> <name><surname>Dou</surname> <given-names>Q.</given-names></name> <name><surname>Chen</surname> <given-names>H.</given-names></name> <name><surname>Qin</surname> <given-names>J.</given-names></name> <name><surname>Heng</surname> <given-names>P.-A.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Synergistic image and feature adaptation: towards cross-modality domain adaptation for medical image segmentation,&#x0201D;</article-title> in <source>AAAI&#x00027;19: Proceedings of the AAAI Conference on Artificial Intelligence</source>, Vol. 33 (Honolulu, HI), <fpage>865</fpage>&#x02013;<lpage>872</lpage>. <pub-id pub-id-type="doi">10.1609/aaai.v33i01.3301865</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Lu</surname> <given-names>Y.</given-names></name> <name><surname>Yu</surname> <given-names>Q.</given-names></name> <name><surname>Luo</surname> <given-names>X.</given-names></name> <name><surname>Adeli</surname> <given-names>E.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>TransUNet: Transformers make strong encoders for medical image segmentation</article-title>. <source>arXiv Preprint</source> arXiv:2102.04306. <pub-id pub-id-type="doi">10.48550/arXiv.2102.04306</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.-C.</given-names></name> <name><surname>Lin</surname> <given-names>Y.-Y.</given-names></name> <name><surname>Yang</surname> <given-names>M.-H.</given-names></name> <name><surname>Huang</surname> <given-names>J.-B.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;CrDoCo: Pixel-level domain transfer with cross-domain consistency,&#x0201D;</article-title> in <source>2019 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Long Beach, CA</publisher-loc>), <fpage>1791</fpage>&#x02013;<lpage>1800</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2019.00189</pub-id><pub-id pub-id-type="pmid">36227829</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Dong</surname> <given-names>N.</given-names></name> <name><surname>Kampffmeyer</surname> <given-names>M.</given-names></name> <name><surname>Liang</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Dai</surname> <given-names>W.</given-names></name> <name><surname>Xing</surname> <given-names>E.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Unsupervised domain adaptation for automatic estimation of cardiothoracic ratio,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention - MICCAI 2018. Lecture Notes in Computer Science</source>, Vol. 11071, eds A. Frangi, J. Schnabel, C. Davatzikos, C. Alberola-L&#x000F3;pez, and G. Fichtinger (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>544</fpage>&#x02013;<lpage>552</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-00934-2_61</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gholami</surname> <given-names>A.</given-names></name> <name><surname>Subramanian</surname> <given-names>S.</given-names></name> <name><surname>Shenoy</surname> <given-names>V.</given-names></name> <name><surname>Himthani</surname> <given-names>N.</given-names></name> <name><surname>Yue</surname> <given-names>X.</given-names></name> <name><surname>Zhao</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>&#x0201C;A novel domain adaptation framework for medical image segmentation,&#x0201D;</article-title> in <source>International MICCAI Brainlesion Workshop. BrainLes 2018: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2018. Lecture Notes in Computer Science</source>, Vol. 11384, eds A. Crimi, S. Bakas, H. Kuijf, F. Keyvan, M. Reyes, and T. van Walsum (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>289</fpage>&#x02013;<lpage>298</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-11726-9_26</pub-id><pub-id pub-id-type="pmid">35324451</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Goodfellow</surname> <given-names>I.</given-names></name> <name><surname>Pouget-Abadie</surname> <given-names>J.</given-names></name> <name><surname>Mirza</surname> <given-names>M.</given-names></name> <name><surname>Xu</surname> <given-names>B.</given-names></name> <name><surname>Warde-Farley</surname> <given-names>D.</given-names></name> <name><surname>Ozair</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>&#x0201C;Generative adversarial nets,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems 27 (NIPS 2014)</source>, eds Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger (<publisher-loc>Montreal, QC</publisher-loc>), <fpage>2672</fpage>&#x02013;<lpage>2680</lpage>.</citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haq</surname> <given-names>M. M.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Adversarial domain adaptation for cell segmentation,&#x0201D;</article-title> in <source>Proceedings of the Third Conference on Medical Imaging with Deep Learning, PMLR</source>, Vol. 121, eds T. Arbel, I. B. Ayed, M. de Bruijne, M. Descoteaux, H. Lombaert, and C. Pal (Montreal, QC), <fpage>277</fpage>&#x02013;<lpage>287</lpage>.</citation>
</ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Haq</surname> <given-names>M. M.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;Self-supervised pre-training for nuclei segmentation,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention. Medical Image Computing and Computer Assisted Intervention - MICCAI 2022. Lecture Notes in Computer Science</source>, Vol. 13432, eds L. Wang, Q. Dou, P.T. Fletcher, S. Speidel, and S. Li (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>303</fpage>&#x02013;<lpage>313</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-031-16434-7_30</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoffman</surname> <given-names>J.</given-names></name> <name><surname>Tzeng</surname> <given-names>E.</given-names></name> <name><surname>Park</surname> <given-names>T.</given-names></name> <name><surname>Zhu</surname> <given-names>J.-Y.</given-names></name> <name><surname>Isola</surname> <given-names>P.</given-names></name> <name><surname>Saenko</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Cycada: Cycle-consistent adversarial domain adaptation</article-title>. <source>arXiv Preprint</source> arXiv:1711.03213. <pub-id pub-id-type="doi">10.48550/arXiv.1711.03213</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hou</surname> <given-names>L.</given-names></name> <name><surname>Agarwal</surname> <given-names>A.</given-names></name> <name><surname>Samaras</surname> <given-names>D.</given-names></name> <name><surname>Kurc</surname> <given-names>T. M.</given-names></name> <name><surname>Gupta</surname> <given-names>R. R.</given-names></name> <name><surname>Saltz</surname> <given-names>J. H.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Robust histopathology image analysis: to label or to synthesize?,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Long Beach, CA</publisher-loc>), <fpage>8533</fpage>&#x02013;<lpage>8542</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2019.00873</pub-id><pub-id pub-id-type="pmid">34025103</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Huo</surname> <given-names>Y.</given-names></name> <name><surname>Xu</surname> <given-names>Z.</given-names></name> <name><surname>Bao</surname> <given-names>S.</given-names></name> <name><surname>Assad</surname> <given-names>A.</given-names></name> <name><surname>Abramson</surname> <given-names>R. G.</given-names></name> <name><surname>Landman</surname> <given-names>B. A.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Adversarial synthesis learning enables segmentation without target modality ground truth,&#x0201D;</article-title> in <source>2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018)</source> (<publisher-loc>Washington, DC</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1217</fpage>&#x02013;<lpage>1220</lpage>. <pub-id pub-id-type="doi">10.1109/ISBI.2018.8363790</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Irshad</surname> <given-names>H.</given-names></name> <name><surname>Montaser-Kouhsari</surname> <given-names>L.</given-names></name> <name><surname>Waltz</surname> <given-names>G.</given-names></name> <name><surname>Bucur</surname> <given-names>O.</given-names></name> <name><surname>Nowak</surname> <given-names>J.</given-names></name> <name><surname>Dong</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Crowdsourcing image annotation for nucleus detection and segmentation in computational pathology: evaluating experts, automated methods, and the crowd</article-title>. <source>Pac. Symp. Biocomput.</source> <volume>2015</volume>, <fpage>294</fpage>&#x02013;<lpage>305</lpage>. <pub-id pub-id-type="doi">10.1142/9789814644730_0029</pub-id><pub-id pub-id-type="pmid">25592590</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Isola</surname> <given-names>P.</given-names></name> <name><surname>Zhu</surname> <given-names>J.-Y.</given-names></name> <name><surname>Zhou</surname> <given-names>T.</given-names></name> <name><surname>Efros</surname> <given-names>A. A.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Image-to-image translation with conditional adversarial networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Honolulu, HI</publisher-loc>), <fpage>1125</fpage>&#x02013;<lpage>1134</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2017.632</pub-id><pub-id pub-id-type="pmid">34940729</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jia</surname> <given-names>H.</given-names></name> <name><surname>Tang</surname> <given-names>H.</given-names></name> <name><surname>Ma</surname> <given-names>G.</given-names></name> <name><surname>Cai</surname> <given-names>W.</given-names></name> <name><surname>Huang</surname> <given-names>H.</given-names></name> <name><surname>Zhan</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>PSGR: Pixel-wise sparse graph reasoning for COVID-19 pneumonia segmentation in CT images</article-title>. <source>arXiv Preprint</source>. arXiv:2108.03809. <pub-id pub-id-type="doi">10.48550/arXiv.2108.03809</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kamnitsas</surname> <given-names>K.</given-names></name> <name><surname>Baumgartner</surname> <given-names>C.</given-names></name> <name><surname>Ledig</surname> <given-names>C.</given-names></name> <name><surname>Newcombe</surname> <given-names>V.</given-names></name> <name><surname>Simpson</surname> <given-names>J.</given-names></name> <name><surname>Kane</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>&#x0201C;Unsupervised domain adaptation in brain lesion segmentation with adversarial networks,&#x0201D;</article-title> in <source>International Conference on Information Processing in Medical Imaging. IPMI 2017. Lecture Notes in Computer Science</source>, Vol. 10265 (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>597</fpage>&#x02013;<lpage>609</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-59050-9_47</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kingma</surname> <given-names>D. P.</given-names></name> <name><surname>Ba</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>Adam: A method for stochastic optimization</article-title>. <source>arXiv Preprint</source>. arXiv:1412.6980. <pub-id pub-id-type="doi">10.48550/arXiv.1412.6980</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kumar</surname> <given-names>N.</given-names></name> <name><surname>Verma</surname> <given-names>R.</given-names></name> <name><surname>Sharma</surname> <given-names>S.</given-names></name> <name><surname>Bhargava</surname> <given-names>S.</given-names></name> <name><surname>Vahadane</surname> <given-names>A.</given-names></name> <name><surname>Sethi</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>A dataset and a technique for generalized nuclear segmentation for computational pathology</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>36</volume>, <fpage>1550</fpage>&#x02013;<lpage>1560</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2017.2677499</pub-id><pub-id pub-id-type="pmid">28287963</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>C.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Shi</surname> <given-names>T.</given-names></name> <name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Yang</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Unsupervised domain adaptation for the histopathological cell segmentation through self-ensembling,&#x0201D;</article-title> in <source>Proceedings of the MICCAI Workshop on Computational Pathology, PMLR</source> <volume>156</volume>, <fpage>151</fpage>&#x02013;<lpage>158</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://proceedings.mlr.press/v156/li21a.html">https://proceedings.mlr.press/v156/li21a.html</ext-link></citation>
</ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Sangineto</surname> <given-names>E.</given-names></name> <name><surname>Bi</surname> <given-names>W.</given-names></name> <name><surname>Sebe</surname> <given-names>N.</given-names></name> <name><surname>Lepri</surname> <given-names>B.</given-names></name> <name><surname>Nadai</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Efficient training of visual transformers with small datasets,&#x0201D;</article-title> in <source>Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS) 2021</source> (<publisher-loc>Virtual</publisher-loc>).<pub-id pub-id-type="pmid">36505893</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Long</surname> <given-names>J.</given-names></name> <name><surname>Shelhamer</surname> <given-names>E.</given-names></name> <name><surname>Darrell</surname> <given-names>T.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Fully convolutional networks for semantic segmentation,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Boston, MA</publisher-loc>), <fpage>3431</fpage>&#x02013;<lpage>3440</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2015.7298965</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mahmood</surname> <given-names>F.</given-names></name> <name><surname>Borders</surname> <given-names>D.</given-names></name> <name><surname>Chen</surname> <given-names>R. J.</given-names></name> <name><surname>McKay</surname> <given-names>G. N.</given-names></name> <name><surname>Salimian</surname> <given-names>K. J.</given-names></name> <name><surname>Baras</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Deep adversarial training for multi-organ nuclei segmentation in histopathology images</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>39</volume>, <fpage>3257</fpage>&#x02013;<lpage>3267</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2019.2927182</pub-id><pub-id pub-id-type="pmid">31283474</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mirza</surname> <given-names>M.</given-names></name> <name><surname>Osindero</surname> <given-names>S.</given-names></name></person-group> (<year>2014</year>). <article-title>Conditional generative adversarial nets</article-title>. <source>arXiv Preprint</source>. arXiv:1411.1784. <pub-id pub-id-type="doi">10.48550/arXiv.1411.1784</pub-id></citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Naylor</surname> <given-names>P.</given-names></name> <name><surname>La&#x000E9;</surname> <given-names>M.</given-names></name> <name><surname>Reyal</surname> <given-names>F.</given-names></name> <name><surname>Walter</surname> <given-names>T.</given-names></name></person-group> (<year>2018</year>). <article-title>Segmentation of nuclei in histopathology images by deep regression of the distance map</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>38</volume>, <fpage>448</fpage>&#x02013;<lpage>459</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2018.2865709</pub-id><pub-id pub-id-type="pmid">30716022</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Paszke</surname> <given-names>A.</given-names></name> <name><surname>Gross</surname> <given-names>S.</given-names></name> <name><surname>Massa</surname> <given-names>F.</given-names></name> <name><surname>Lerer</surname> <given-names>A.</given-names></name> <name><surname>Bradbury</surname> <given-names>J.</given-names></name> <name><surname>Chanan</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>PyTorch: An imperative style, high-performance deep learning library</article-title>. <source>NIPS&#x00027;19: Proceedings of the 33rd International Conference on Neural Information Processing Systems 32</source> (<publisher-loc>Vancouver, BC</publisher-loc>), <fpage>8026</fpage>&#x02013;<lpage>8037</lpage>.</citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Radford</surname> <given-names>A.</given-names></name> <name><surname>Metz</surname> <given-names>L.</given-names></name> <name><surname>Chintala</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <article-title>Unsupervised representation learning with deep convolutional generative adversarial networks</article-title>. <source>arXiv Preprint</source>. arXiv:1511.06434. <pub-id pub-id-type="doi">10.48550/arXiv.1511.06434</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Raju</surname> <given-names>A.</given-names></name> <name><surname>Ji</surname> <given-names>Z.</given-names></name> <name><surname>Cheng</surname> <given-names>C. T.</given-names></name> <name><surname>Cai</surname> <given-names>J.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name> <name><surname>Xiao</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>&#x0201C;User-guided domain adaptation for rapid annotation from user interactions: a study on pathological liver segmentation,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention - MICCAI 2020. Lecture Notes in Computer Science, Vol. 12261</source> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>457</fpage>&#x02013;<lpage>467</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-59710-8_45</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ronneberger</surname> <given-names>O.</given-names></name> <name><surname>Fischer</surname> <given-names>P.</given-names></name> <name><surname>Brox</surname> <given-names>T.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;U-Net: Convolutional networks for biomedical image segmentation,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015. Lecture Notes in Computer Science</source>, Vol. 9351 (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>234</fpage>&#x02013;<lpage>241</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-24574-4_28</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sharma</surname> <given-names>Y.</given-names></name> <name><surname>Syed</surname> <given-names>S.</given-names></name> <name><surname>Brown</surname> <given-names>D. E.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;MaNi: Maximizing mutual information for nuclei cross-domain unsupervised segmentation,&#x0201D;</article-title> in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention</source>, eds L. Wang, Q. Dou, P.T. Fletcher, S. Speidel, and S. Li (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>345</fpage>&#x02013;<lpage>355</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-031-16434-7_34</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sirinukunwattana</surname> <given-names>K.</given-names></name> <name><surname>Raza</surname> <given-names>S. E. A.</given-names></name> <name><surname>Tsang</surname> <given-names>Y.-W.</given-names></name> <name><surname>Snead</surname> <given-names>D. R.</given-names></name> <name><surname>Cree</surname> <given-names>I. A.</given-names></name> <name><surname>Rajpoot</surname> <given-names>N. M.</given-names></name></person-group> (<year>2016</year>). <article-title>Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>35</volume>, <fpage>1196</fpage>&#x02013;<lpage>1206</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2016.2525803</pub-id><pub-id pub-id-type="pmid">26863654</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tay</surname> <given-names>Y.</given-names></name> <name><surname>Dehghani</surname> <given-names>M.</given-names></name> <name><surname>Bahri</surname> <given-names>D.</given-names></name> <name><surname>Metzler</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>Efficient transformers: a survey</article-title>. <source>arXiv Preprint</source>. arXiv:2009.06732. <pub-id pub-id-type="doi">10.48550/arXiv.2009.06732</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Toldo</surname> <given-names>M.</given-names></name> <name><surname>Michieli</surname> <given-names>U.</given-names></name> <name><surname>Zanuttigh</surname> <given-names>P.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Unsupervised domain adaptation in semantic segmentation via orthogonal and clustered embeddings,&#x0201D;</article-title> in <source>2021 Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision</source> (<publisher-loc>Waikoloa, HI</publisher-loc>), <fpage>1358</fpage>&#x02013;<lpage>1368</lpage>. <pub-id pub-id-type="doi">10.1109/WACV48630.2021.00140</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Tsai</surname> <given-names>Y.-H.</given-names></name> <name><surname>Hung</surname> <given-names>W.-C.</given-names></name> <name><surname>Schulter</surname> <given-names>S.</given-names></name> <name><surname>Sohn</surname> <given-names>K.</given-names></name> <name><surname>Yang</surname> <given-names>M.-H.</given-names></name> <name><surname>Chandraker</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Learning to adapt structured output space for semantic segmentation,&#x0201D;</article-title> in <source>2018 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Salt Lake City, UT</publisher-loc>), <fpage>7472</fpage>&#x02013;<lpage>7481</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2018.00780</pub-id><pub-id pub-id-type="pmid">32870790</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xia</surname> <given-names>X.</given-names></name> <name><surname>Kulis</surname> <given-names>B.</given-names></name></person-group> (<year>2017</year>). <article-title>W-Net: A deep model for fully unsupervised image segmentation</article-title>. <source>arXiv Preprint</source>. arXiv:1711.08506. <pub-id pub-id-type="doi">10.48550/arXiv.1711.08506</pub-id><pub-id pub-id-type="pmid">34460779</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xia</surname> <given-names>Y.</given-names></name> <name><surname>Yang</surname> <given-names>D.</given-names></name> <name><surname>Yu</surname> <given-names>Z.</given-names></name> <name><surname>Liu</surname> <given-names>F.</given-names></name> <name><surname>Cai</surname> <given-names>J.</given-names></name> <name><surname>Yu</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation</article-title>. <source>Med. Image Anal.</source> <volume>65</volume>, <fpage>101766</fpage>. <pub-id pub-id-type="doi">10.1016/j.media.2020.101766</pub-id><pub-id pub-id-type="pmid">32623276</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Jia</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>L.-B.</given-names></name> <name><surname>Ai</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>F.</given-names></name> <name><surname>Lai</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features</article-title>. <source>BMC Bioinformatics</source> <volume>18</volume>:<fpage>281</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-017-1685-x</pub-id><pub-id pub-id-type="pmid">28549410</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>J.</given-names></name> <name><surname>Li</surname> <given-names>C.</given-names></name> <name><surname>An</surname> <given-names>W.</given-names></name> <name><surname>Ma</surname> <given-names>H.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Rong</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>&#x0201C;Exploring robustness of unsupervised domain adaptation in semantic segmentation,&#x0201D;</article-title> in <source>Proceedings of the IEEE/CVF International Conference on Computer Vision</source> (<publisher-loc>Montreal, QC</publisher-loc>), <fpage>9194</fpage>&#x02013;<lpage>9203</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV48922.2021.00906</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>S.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name> <name><surname>Lovell</surname> <given-names>B. C.</given-names></name> <name><surname>Han</surname> <given-names>X.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Minimizing labeling cost for nuclei instance segmentation and classification with cross-domain images and weak labels,&#x0201D;</article-title> in <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>, Vol. 35 (<publisher-loc>Virtual</publisher-loc>), <fpage>697</fpage>&#x02013;<lpage>705</lpage>. <pub-id pub-id-type="doi">10.1609/aaai.v35i1.16150</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zeiler</surname> <given-names>M. D.</given-names></name> <name><surname>Fergus</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Visualizing and understanding convolutional networks,&#x0201D;</article-title> in <source>European Conference on Computer Vision - ECCV 2014. Lecture Notes in Computer Science</source>, eds D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>818</fpage>&#x02013;<lpage>833</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-10590-1_53</pub-id></citation>
</ref>
<ref id="B41">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Qiu</surname> <given-names>Z.</given-names></name> <name><surname>Yao</surname> <given-names>T.</given-names></name> <name><surname>Liu</surname> <given-names>D.</given-names></name> <name><surname>Mei</surname> <given-names>T.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Fully convolutional adaptation networks for semantic segmentation,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Salt Lake City, UT</publisher-loc>), <fpage>6810</fpage>&#x02013;<lpage>6818</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2018.00712</pub-id><pub-id pub-id-type="pmid">34807822</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zheng</surname> <given-names>S.</given-names></name> <name><surname>Lu</surname> <given-names>J.</given-names></name> <name><surname>Zhao</surname> <given-names>H.</given-names></name> <name><surname>Zhu</surname> <given-names>X.</given-names></name> <name><surname>Luo</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>&#x0201C;Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers,&#x0201D;</article-title> in <source>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Nashville, TN</publisher-loc>), <fpage>6881</fpage>&#x02013;<lpage>6890</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR46437.2021.00681</pub-id></citation>
</ref>
<ref id="B43">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>Z.</given-names></name> <name><surname>Siddiquee</surname> <given-names>M. M. R.</given-names></name> <name><surname>Tajbakhsh</surname> <given-names>N.</given-names></name> <name><surname>Liang</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;UNet&#x0002B;&#x0002B;: A nested U-Net architecture for medical image segmentation,&#x0201D;</article-title> in <source>Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. DLMIA ML-CDS 2018. Lecture Notes in Computer Science, Vol. 11045</source> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>3</fpage>&#x02013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-00889-5_1</pub-id><pub-id pub-id-type="pmid">32613207</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>J.-Y.</given-names></name> <name><surname>Park</surname> <given-names>T.</given-names></name> <name><surname>Isola</surname> <given-names>P.</given-names></name> <name><surname>Efros</surname> <given-names>A. A.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Unpaired image-to-image translation using cycle-consistent adversarial networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017)</source> (<publisher-loc>Venice</publisher-loc>), <fpage>2223</fpage>&#x02013;<lpage>2232</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV.2017.244</pub-id></citation>
</ref>
</ref-list>
</back>
</article>