<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Oncol.</journal-id>
<journal-title>Frontiers in Oncology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Oncol.</abbrev-journal-title>
<issn pub-type="epub">2234-943X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fonc.2024.1396887</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Oncology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A semi-supervised segmentation method for microscopic hyperspectral pathological images based on multi-consistency learning</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Fang</surname>
<given-names>Jinghui</given-names>
</name>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2675261"/>
<role content-type="https://credit.niso.org/contributor-roles/data-curation/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/investigation/"/>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>    <aff id="aff1">
<institution>College of Information Science and Engineering, Hohai University</institution>, <addr-line>Nanjing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Qingli Li, East China Normal University, China</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Meng Lv, Beijing Institute of Technology, China</p>
<p>Haoyang Yu, Dalian Maritime University, China</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Jinghui Fang, <email xlink:href="mailto:2106020108@hhu.edu.cn">2106020108@hhu.edu.cn</email>
</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>06</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>14</volume>
<elocation-id>1396887</elocation-id>
<history>
<date date-type="received">
<day>06</day>
<month>03</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>04</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2024 Fang</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Fang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Pathological images are considered the gold standard for clinical diagnosis and cancer grading. Automatic segmentation of pathological images is a fundamental and crucial step in constructing powerful computer-aided diagnostic systems. Medical microscopic hyperspectral pathological images can provide additional spectral information, further distinguishing different chemical components of biological tissues and offering new insights for accurate segmentation of pathological images. However, hyperspectral pathological images have higher resolution and cover larger areas, and their annotation requires more time and clinical experience. The lack of precise annotations limits the progress of research in pathological image segmentation. In this paper, we propose a novel semi-supervised segmentation method for microscopic hyperspectral pathological images based on multi-consistency learning (MCL-Net), which combines consistency regularization methods with pseudo-labeling techniques. The MCL-Net architecture employs a shared encoder and multiple independent decoders. We introduce a Soft-Hard pseudo-label generation strategy in MCL-Net to generate pseudo-labels that are closer to real labels for pathological images. Furthermore, we propose a multi-consistency learning strategy that treats the pseudo-labels generated by the Soft-Hard process as real labels and promotes consistency between the predictions of different decoders, enabling the model to learn more sample features. Extensive experiments demonstrate the effectiveness of the proposed method, providing new insights for the segmentation of microscopic hyperspectral tissue pathology images.</p>
</abstract>
<kwd-group>
<kwd>microscopic hyperspectral images</kwd>
<kwd>semi-supervised learning</kwd>
<kwd>medical image segmentation</kwd>
<kwd>mutual consistency</kwd>
<kwd>pseudo-labels</kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="3"/>
<equation-count count="9"/>
<ref-count count="37"/>
<page-count count="11"/>
<word-count count="5683"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-in-acceptance</meta-name>
<meta-value>Cancer Imaging and Image-directed Interventions</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<label>1</label>
<title>Introduction</title>
<p>Pathological images are considered the gold standard for clinical diagnosis and cancer grading (<xref ref-type="bibr" rid="B1">1</xref>). Automatic segmentation of pathological images is a fundamental and crucial step in constructing powerful computer-aided diagnostic systems. Quantitative analysis of the morphological properties of organs and tissues based on segmentation results provides valuable evidence for clinical diagnosis. Existing pathological image segmentation methods (<xref ref-type="bibr" rid="B2">2</xref>&#x2013;<xref ref-type="bibr" rid="B4">4</xref>) typically utilize RGB datasets. However, common RGB images can only provide spatial information for cancer diagnosis. The similarity in biological tissue morphology affects the accuracy of diagnostic results.</p>
<p>With the advancement of imaging systems, medical microscopic hyperspectral images have been employed in various tumor recognition applications (<xref ref-type="bibr" rid="B5">5</xref>&#x2013;<xref ref-type="bibr" rid="B7">7</xref>). The DMCA method proposed in (<xref ref-type="bibr" rid="B5">5</xref>) integrates the classifier for prediction into the extraction of deep features from MedHSIs. This integration ensures compatibility between the extracted features and the classifier, facilitating tumor diagnosis. In (<xref ref-type="bibr" rid="B6">6</xref>), Rav&#xec; et&#xa0;al. introduced a novel manifold embedding framework called FR-t-SNE. Using this framework, the outputs generated from hyperspectral imaging can be utilized as inputs for semantic segmentation classifiers of brain tissue <italic>in vivo</italic>. The proposed method aims to delineate tumor boundaries, preserve healthy brain tissue, and facilitate complete removal of malignant cells. Muniz et&#xa0;al. proposed, in (<xref ref-type="bibr" rid="B7">7</xref>), a method utilizing hyperspectral imaging and micro-FTIR spectroscopy to represent biological tissues based on their spectral characteristics. Subsequently, a deep learning-based classification approach was established to aid experts in distinguishing tissues affected by cancer or inflammation from healthy tissues. Microscopic hyperspectral imaging applied in medical image analysis relies on the following two fundamental principles: i) tissues with similar biochemical compositions are&#xa0;likely to exhibit similar spectra; and ii) variations in spectra can&#xa0;be&#xa0;quantified to delineate different tissues (<xref ref-type="bibr" rid="B8">8</xref>). Compared to conventional imaging modalities, medical microscopic hyperspectral pathological images offer additional spectral information, enabling further differentiation of various chemical constituents within biological tissues. 
Nevertheless, hyperspectral pathological images have higher resolution and cover larger areas, and their annotation requires more time and clinical experience. Therefore, the lack of precise annotations limits the progress of research in pathological image segmentation.</p>
<p>Semi-supervised learning is a method used to address the issue of limited labeled data. This approach typically involves joint training with a small amount of labeled data and a large amount of unlabeled data. The core of this method lies in effectively extracting useful information from both labeled and unlabeled data to achieve relatively stable segmentation results. To this end, many semi-supervised algorithms have been applied in this field. Common existing semi-supervised segmentation methods can be categorized into pseudo-labeling and consistency regularization methods (<xref ref-type="bibr" rid="B9">9</xref>). Pseudo-labeling is an intuitive approach in which a model trained on labeled data is used to predict pseudo-labels for unlabeled data. These new pseudo-labeled data are then combined with the original labeled set to further refine the model. However, the effectiveness of this method is constrained by the varying quality of the predicted pseudo-labels (<xref ref-type="bibr" rid="B9">9</xref>). Consistency regularization methods are based on the smoothness assumption (<xref ref-type="bibr" rid="B10">10</xref>). They leverage unlabeled data in an unsupervised way. These methods (<xref ref-type="bibr" rid="B11">11</xref>&#x2013;<xref ref-type="bibr" rid="B13">13</xref>) typically apply slight perturbations to the input data or the model and learn from unlabeled data by enforcing consistency in the model output under different perturbations. 
Many methods employ a single image to enforce consistency in their perturbations (<xref ref-type="bibr" rid="B14">14</xref>), which may lead to inaccurate segmentation results due to a lack of context information across volumes, thus limiting the effectiveness of consistency regularization.</p>    <p>Given the aforementioned issues with consistency regularization and pseudo-labeling methods, this paper introduces a semi-supervised segmentation method for microscopic hyperspectral pathological images based on <italic>multi-consistency learning (MCL-Net)</italic>, which combines both methods. The architecture of the model is illustrated in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>. The model employs a shared encoder and multiple decoders, where the output of the encoder undergoes different perturbations before being fed into distinct decoders. Subsequently, we employ a novel pseudo-label generation method called Soft-Hard to transform the outputs of different decoders into pseudo-labels. Using these generated pseudo-labels as a basis, we devise a novel multi-consistency training approach, wherein soft pseudo-labels obtained from each decoder are treated as genuine labels for the other decoders and subjected to consistency constraints. Through this approach, we minimize discrepancies in output across multiple decoders during model training, thereby obtaining a more comprehensive feature representation. In summary, this paper makes the following four contributions:</p>
<list list-type="simple">
<list-item>
<p>1) A semi-supervised segmentation method for microscopic hyperspectral pathology images based on multi-consistency learning is proposed in this study. This method combines pseudo-labeling and consistency regularization techniques.</p>
</list-item>
<list-item>
<p>2) A multi-consistency learning approach that effectively integrates features extracted by different models is introduced in this research.</p>
</list-item>
<list-item>
<p>3) A novel pseudo-label generation method, Soft-Hard, which generates pseudo-labels that are closer to real labels, has been devised.</p>
</list-item>
<list-item>
<p>4) Extensive experiments demonstrate that our method outperforms five other state-of-the-art methods, providing new insights for the segmentation of microscopic hyperspectral pathology images.</p>
</list-item>
</list>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>The MCL-Net model proposed in this paper.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g001.tif"/>
</fig>
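The shared-encoder, multi-decoder design with mutual consistency described above can be sketched as follows. This is a minimal numpy illustration: the decoder outputs are random stand-ins for the per-pixel probability maps of real U-Net branches, and the pairwise mean-squared penalty is an illustrative assumption (the paper's actual losses are defined in Section 3).

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stand-ins for the per-pixel probability maps produced by
# n = 3 decoders that share a single encoder (4x4 toy maps)
preds = [rng.random((4, 4)) for _ in range(3)]

# multi-consistency idea: penalize disagreement between every pair
# of decoder outputs so the branches converge on unlabeled data
consistency = sum(
    np.mean((preds[i] - preds[j]) ** 2)
    for i in range(len(preds))
    for j in range(i + 1, len(preds))
)
```

In the full method, each decoder's sharpened output additionally serves as a pseudo-label for the other decoders rather than a plain pairwise penalty.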
<p>The remaining sections of the paper are organized as follows: Section 2 provides a review of pathological image segmentation and semi-supervised segmentation methods for medical images. Section 3 introduces the proposed MCL-Net method. Section 4 outlines the experimental setup. Section 5 is dedicated to result analysis. Finally, Section 6 concludes the paper.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related work</title>
<sec id="s2_1">
<label>2.1</label>
<title>Pathological image segmentation</title>
<p>Pathological images serve as the gold standard for cancer detection, with various segmentation methods being employed for different cancer types. Musulin et&#xa0;al. (<xref ref-type="bibr" rid="B15">15</xref>) proposed a two-stage image segmentation method utilizing DeepLabv3+ as the backbone model for predicting oral squamous cell carcinoma in head and neck cancer. Zidan et&#xa0;al. (<xref ref-type="bibr" rid="B16">16</xref>) designed a Transformer-based approach that constructs a Swin Transformer encoder block to capture the global context of tumor-related regions in colorectal cancer. Additionally, a cascaded upsampler was devised to utilize supervised multiscale features from the encoder to assist in detecting tumor boundary regions. Jayachandran et&#xa0;al. (<xref ref-type="bibr" rid="B17">17</xref>) introduced a novel deep learning framework based on an encoder-decoder structure effectively incorporating attention mechanisms for segmenting osteosarcoma from histological images. Huang et&#xa0;al. (<xref ref-type="bibr" rid="B18">18</xref>) proposed an end-to-end ViT-AMCNet with interpretable throat tumor grading capabilities. This model not only ensured good feature representation capabilities of ViT and AMC blocks but also enhanced the redundancy removal ability of the model fusion algorithm. In (<xref ref-type="bibr" rid="B19">19</xref>), Rashmi et&#xa0;al. proposed an unsupervised method for segmenting cell nuclei from breast tissue pathology images. A method for selecting template images for color normalization was introduced. An experiment was conducted to determine a new color channel combination that distinguishes cell nuclei from background regions. Furthermore, this work introduced an improved C-V model capable of effectively segmenting nuclei using multi-channel color information. To fully exploit the spectral characteristics of three-dimensional hyperspectral data, Wang et&#xa0;al. 
(<xref ref-type="bibr" rid="B8">8</xref>) applied deep convolutional networks for melanoma segmentation on hyperspectral pathological images. They introduced a 3D fully convolutional network named Hyper-net for segmenting melanoma from hyperspectral pathological images. Zhang et&#xa0;al. (<xref ref-type="bibr" rid="B20">20</xref>) proposed a two-stage segmentation method for OSCC tumors with lymph node metastasis. In the learning stage, this method is employed for coarse segmentation of cancer cell nuclei. In the decision stage, the pathologist&#x2019;s prior knowledge is utilized to make lesion decisions based on the coarse segmentation mask of cancer nuclei, resulting in refined segmentation results. Gao et&#xa0;al. (<xref ref-type="bibr" rid="B21">21</xref>) proposed a semi-supervised segmentation method for microscopic hyperspectral pathological images based on shape priors and contrastive learning. They utilized shape priors and image-level contrastive learning to learn features from unlabeled data, enhancing semi-supervised segmentation performance and mitigating limitations posed by limited annotated data. Despite significant progress in tissue pathology image segmentation for cancer prediction, research on hyperspectral pathological images remains limited.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Consistency regularization</title>
<p>Consistency regularization requires that a model produce similar predictions when random noise is added to the input data or to the model itself. It is a crucial component of temporal ensemble techniques (<xref ref-type="bibr" rid="B22">22</xref>). Mean Teacher (<xref ref-type="bibr" rid="B23">23</xref>) is a classical temporal ensemble technique where both the student and teacher models adopt the same network structure. Through exponential moving average (EMA), the student network&#x2019;s output across different training iterations becomes similar to that of the teacher network. Various temporal ensemble models have been developed on the basis of Mean Teacher. In (<xref ref-type="bibr" rid="B24">24</xref>), Shu et&#xa0;al. proposed a novel cross-pollination learning and feature migration mechanism allowing the teacher model to provide higher confidence outputs for student model learning. This method cross-pollinates unlabeled samples to enhance the segmentation network&#x2019;s generalization ability. It also introduces new cross-gradient monitors to reduce consistency failures caused by semantic gaps between teacher and student models. Xu et&#xa0;al. (<xref ref-type="bibr" rid="B25">25</xref>) enhanced Mean Teacher into a novel Ambiguity-Consensus Mean Teacher (AC-MT) model by adding a series of comprehensive plug-and-play strategies for selecting ambiguous (informative) targets. This model stabilizes perturbations in such regions, enabling more useful representations to be learned from unlabeled data. In (<xref ref-type="bibr" rid="B26">26</xref>), Zhang et&#xa0;al. proposed a novel uncertainty-guided mutual consistency learning framework for semi-supervised medical image segmentation. The model employs a dual-task backbone network with two output branches to simultaneously generate segmentation probability maps and signed distance maps. 
It performs intra-task consistency learning within self-ensemble tasks and utilizes task-level regularization for cross-task consistency learning to leverage geometric shape information. Guided by the estimated segmentation uncertainty, the framework selects relatively deterministic predictions, effectively utilizing the more reliable information in unlabeled data.</p>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Pseudo-labeling</title>
<p>Pseudo-labeling involves generating targets for unlabeled data to obtain amplified, approximately fully labeled datasets (<xref ref-type="bibr" rid="B27">27</xref>). Common pseudo-labeling methods focus on effective pseudo-label generation strategies and how to generate high-quality segmentation results under the supervision of pseudo-labels. Wu et&#xa0;al. (<xref ref-type="bibr" rid="B28">28</xref>) proposed a novel Mutual Consistency Network (MC-Net) for semi-supervised left atrium segmentation in 3D MR images. MC-Net consists of an encoder and two slightly different decoders. It converts the prediction differences between the two decoders into unsupervised loss through a cyclic pseudo-labeling scheme to encourage mutual consistency. Building upon (<xref ref-type="bibr" rid="B28">28</xref>), Wu et&#xa0;al. (<xref ref-type="bibr" rid="B29">29</xref>) further introduced a model comprising a shared encoder and multiple slightly different decoders. This model represents the model&#x2019;s uncertainty by computing the statistical differences among the outputs of multiple decoders, indicating uncertain regions in unlabeled data. The model obtains soft pseudo-labels using a sharpening function and applies a novel mutual consistency constraint between the probability output of one decoder and the soft pseudo-labels of other decoders. Chaitanya et&#xa0;al. (<xref ref-type="bibr" rid="B30">30</xref>) proposed a joint training framework defining per-pixel contrastive loss on pseudo-labels of unlabeled and sparsely labeled images, while applying traditional segmentation loss only on the labeled set. This method performs pseudo-label-based self-training and trains the network by jointly optimizing the contrastive loss proposed on labeled and unlabeled sets and the segmentation loss on the sparsely labeled set. Chen et&#xa0;al. (<xref ref-type="bibr" rid="B31">31</xref>) proposed a semi-supervised tissue segmentation framework called FDCT. 
This framework introduces the SBOM boundary refinement strategy, utilizing the characteristics of distance maps to optimize the pseudo-labels generated by the model, making them closer to the ground truth labels.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Proposed method based on multi-consistency learning</title>
<sec id="s3_1">
<label>3.1</label>
<title>Overall structure</title>
<p>The task of semi-supervised segmentation of microscopic hyperspectral pathology images aims to learn more sample information by utilizing a small set of labeled samples and a large set of unlabeled samples. In this paper, we work with a dataset, denoted as <inline-formula>
<mml:math display="inline" id="im1">
<mml:mi>D</mml:mi>
</mml:math>
</inline-formula>, which contains <inline-formula>
<mml:math display="inline" id="im2">
<mml:mi>M</mml:mi>
</mml:math>
</inline-formula> labeled samples and <inline-formula>
<mml:math display="inline" id="im3">
<mml:mi>N</mml:mi>
</mml:math>
</inline-formula> unlabeled samples, where <inline-formula>
<mml:math display="inline" id="im4">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mo>&#x226a;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. We define the labeled dataset as <inline-formula>
<mml:math display="inline" id="im5">
<mml:mrow>
<mml:msup>
<mml:mi>D</mml:mi>
<mml:mi>L</mml:mi>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>Y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and the unlabeled dataset as <inline-formula>
<mml:math display="inline" id="im6">
<mml:mrow>
<mml:msup>
<mml:mi>D</mml:mi>
<mml:mi>U</mml:mi>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>. Each sample <inline-formula>
<mml:math display="inline" id="im7">
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>&#x211d;</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>W</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> in <inline-formula>
<mml:math display="inline" id="im8">
<mml:mi>D</mml:mi>
</mml:math>
</inline-formula> is a microscopic hyperspectral image with a size of <inline-formula>
<mml:math display="inline" id="im9">
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula>
<mml:math display="inline" id="im10">
<mml:mi>C</mml:mi>
</mml:math>
</inline-formula> channels. Correspondingly, <inline-formula>
<mml:math display="inline" id="im11">
<mml:mrow>
<mml:msup>
<mml:mi>Y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>&#x211d;</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> is the segmentation label map associated with <inline-formula>
<mml:math display="inline" id="im12">
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>. The objective of semi-supervised segmentation is to learn a segmentation model <inline-formula>
<mml:math display="inline" id="im13">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>s</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b8;</mml:mi>
<mml:mi>s</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> parameterized by <inline-formula>
<mml:math display="inline" id="im14">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b8;</mml:mi>
<mml:mi>s</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> from <inline-formula>
<mml:math display="inline" id="im15">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mi>D</mml:mi>
<mml:mi>L</mml:mi>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msup>
<mml:mi>D</mml:mi>
<mml:mi>U</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, such that each pixel in the input image is mapped to its correct class.</p>
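The notation above can be made concrete with a toy split in numpy; all sizes here are illustrative and not taken from the paper's dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 32, 32, 8        # toy spatial size and spectral band count
M, N = 4, 36               # labeled vs. unlabeled sample counts, M << N

# D^L = {(X^i, Y^i)}_{i=1..M}: hyperspectral cubes with per-pixel labels
D_L = [(rng.random((H, W, C)), rng.integers(0, 2, size=(H, W)))
       for _ in range(M)]
# D^U = {X^i}_{i=M+1..M+N}: cubes without labels
D_U = [rng.random((H, W, C)) for _ in range(N)]

D = (D_L, D_U)             # D = D^L + D^U drives semi-supervised training
```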
<p>The network model proposed in this paper is illustrated in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>. During the training phase, in the preprocessing step on the left, Principal Component Analysis (PCA) is employed to reduce the dimensionality of the input microscopic hyperspectral images. PCA serves a dual purpose: first, it suppresses noise, enhancing data quality; second, it eliminates redundant spectral bands, reducing computational overhead and improving processing efficiency. The PCA-reduced data serve as the model input. In the segmentation step depicted in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>, the input is first passed through a shared encoder to obtain the feature vector FA. Subsequently, various perturbations are applied to FA, and the perturbed feature vectors are fed into different decoders. The outputs of the multiple decoders are constrained by the multi-consistency loss proposed in this paper. All decoders update their parameters during training. However, during testing, only one decoder, designated the primary decoder, is used; the others are referred to as auxiliary decoders. Further details are given in Section 3.2. The U-Net model, known for its simple yet efficient structure, is widely used in medical image segmentation. Therefore, our segmentation model employs an encoder-decoder structure based on U-Net, as depicted in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2</bold>
</xref>.</p>
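The PCA preprocessing step can be sketched with a minimal SVD-based reduction in numpy; the band count and the number of retained components below are illustrative assumptions, not values stated in the paper.

```python
import numpy as np

def pca_reduce(cube, k):
    """Reduce a hyperspectral cube (H, W, C) to k principal-component bands."""
    H, W, C = cube.shape
    X = cube.reshape(-1, C).astype(np.float64)
    X = X - X.mean(axis=0)                   # center each spectral band
    # principal axes come from the SVD of the centered pixel-by-band matrix
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return (X @ Vt[:k].T).reshape(H, W, k)   # scores of the top-k components

rng = np.random.default_rng(0)
cube = rng.random((16, 16, 60))              # toy 60-band microscopic image
reduced = pca_reduce(cube, 3)                # keep 3 components as model input
```

Because SVD returns singular values in descending order, the retained components carry monotonically non-increasing variance, which is what discards redundant bands while concentrating signal over noise.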
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>The U-Net structure used in this article.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g002.tif"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Training with multiple consistencies</title>
<p>After the input data <inline-formula>
<mml:math display="inline" id="im16">
<mml:mi>D</mml:mi>
</mml:math>
</inline-formula> passes through the encoder <inline-formula>
<mml:math display="inline" id="im17">
<mml:mi>E</mml:mi>
</mml:math>
</inline-formula>, it yields the feature vector FA. FA is directly fed into the primary decoder. Different noise is introduced to the inputs of the model&#x2019;s auxiliary decoders, represented as <inline-formula>
<mml:math display="inline" id="im19">
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>n</italic> denotes the number of decoders and <inline-formula>
<mml:math display="inline" id="im20">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>

<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> signifies the distinct noise added to FA. Specifically, we set n to 3 to strike a balance between effectiveness and training efficiency. Through experiments in Section 5.3, we ultimately determine <inline-formula>
<mml:math display="inline" id="im21">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>

<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> to be Gaussian noise. Further experimental details will be presented in Sections 5.3 and 5.4.</p>
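The feature perturbation FA<sub>i</sub> = featurenoise(FA) with the Gaussian choice determined above can be sketched as follows; the noise magnitude sigma and the feature-map shape are illustrative assumptions, since the paper does not state them in this section.

```python
import numpy as np

def feature_noise(FA, sigma=0.1, rng=None):
    """Return a Gaussian-perturbed copy of the encoder features FA.

    sigma is an illustrative choice; the paper only fixes the noise
    type (Gaussian), not its magnitude, at this point.
    """
    rng = np.random.default_rng() if rng is None else rng
    return FA + rng.normal(0.0, sigma, size=FA.shape)

rng = np.random.default_rng(0)
FA = rng.random((4, 4, 16))        # toy shared-encoder feature map
n = 3                              # total number of decoders
# the primary decoder sees FA unchanged; each auxiliary decoder
# receives an independently perturbed copy of FA
FA_aux = [feature_noise(FA, rng=rng) for _ in range(n - 1)]
```

Drawing the noise independently per auxiliary decoder is what makes the decoder outputs differ, which the multi-consistency loss then pulls back together.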
<p>The feature vector is processed by different decoders to obtain distinct probability maps. For the labeled dataset <inline-formula>
<mml:math display="inline" id="im22">
<mml:mrow>
<mml:msup>
<mml:mi>D</mml:mi>
<mml:mi>L</mml:mi>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>Y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, this paper calculates the supervised loss only between the probability map from the primary decoder and its corresponding ground-truth label. The supervised loss combines cross-entropy loss and Dice loss, as defined in <xref ref-type="disp-formula" rid="eq1">Equation (1)</xref>:</p>
<disp-formula id="eq1">
<label>(1)</label>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>s</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mi>L</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mi>L</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
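<p>As an illustration, the supervised loss of Equation (1) can be sketched for a binary probability map as follows (a minimal NumPy sketch; the soft Dice formulation and the smoothing constant <italic>eps</italic> are our own assumptions, since the paper does not specify them):</p>

```python
import numpy as np

def supervised_loss(prob, y, eps=1e-7):
    """L_s = L_CE(y_seg^L, Y) + L_dice(y_seg^L, Y) for a binary task.
    prob: predicted tumor probabilities in (0, 1); y: binary ground truth."""
    ce = -np.mean(y * np.log(prob + eps) + (1 - y) * np.log(1 - prob + eps))
    dice = 1.0 - (2.0 * np.sum(prob * y) + eps) / (np.sum(prob) + np.sum(y) + eps)
    return ce + dice
```

<p>A confident, correct prediction drives both terms toward zero, while the Dice term additionally guards against the class imbalance between tumor and normal pixels.</p>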
<p>For the unlabeled data, we use the Soft-Hard process proposed in this paper to obtain pseudo-labels corresponding to the different decoders for the same input. First, we employ a sharpening function (<xref ref-type="bibr" rid="B32">32</xref>) to transform the probability maps into soft pseudo-labels, computed as in <xref ref-type="disp-formula" rid="eq2">Equation (2)</xref>:</p>
<disp-formula id="eq2">
<label>(2)</label>
<mml:math display="block" id="M2">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mi>S</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Here, <inline-formula>
<mml:math display="inline" id="im23">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula> is a hyperparameter controlling the sharpening temperature. By choosing an appropriate <inline-formula>
<mml:math display="inline" id="im24">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>, we can apply an entropy minimization constraint to regularize the model without introducing additional noise that might interfere with training. The specific value of the temperature coefficient <inline-formula>
<mml:math display="inline" id="im25">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula> will be determined experimentally in Section 5.5. We then convert the soft pseudo-labels into hard labels, denoted as <inline-formula>
<mml:math display="inline" id="im26">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mi>H</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>arg</mml:mi>
<mml:mi>max</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. The process of obtaining hard labels can be expressed as <xref ref-type="disp-formula" rid="eq3">Equation (3)</xref>:</p>
<disp-formula id="eq3">
<label>(3)</label>
<mml:math display="block" id="M3">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mi>H</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>arg</mml:mi>
<mml:mi>max</mml:mi>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">/</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
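<p>For the binary case, Equations (2) and (3) can be sketched together as follows (an illustrative NumPy sketch; for two classes, the argmax in Equation (3) reduces to thresholding the sharpened probability at 0.5):</p>

```python
import numpy as np

def soft_hard_pseudo_label(p, T=0.5):
    """Soft-Hard process: sharpen a binary probability map p (Equation 2),
    then take the argmax over the two classes to get hard labels (Equation 3)."""
    ps = p ** (1.0 / T) / (p ** (1.0 / T) + (1.0 - p) ** (1.0 / T))  # soft label PS
    ph = (ps >= 0.5).astype(np.int64)  # hard label PH = argmax(PS)
    return ps, ph
```

<p>With T &lt; 1, the soft labels are pushed toward 0 or 1 (entropy minimization); T = 1 leaves the probabilities unchanged.</p>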
<p>By minimizing a consistency loss that constrains the model's update direction, we enable the primary decoder to integrate the sample features extracted by the different decoders, thereby maximizing what is learned from the latent information in unlabeled samples. The unsupervised consistency loss is formulated as <xref ref-type="disp-formula" rid="eq4">Equation (4)</xref>:</p>
<disp-formula id="eq4">
<label>(4)</label>
<mml:math display="block" id="M4">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&amp;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
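<p>Equation (4) sums a cross-entropy plus Dice penalty over all ordered decoder pairs. A minimal NumPy sketch, assuming one binary probability map per decoder and reusing the same loss terms as in Equation (1):</p>

```python
import numpy as np

def multi_consistency_loss(probs, hard_labels, eps=1e-7):
    """L_u: for every ordered pair of decoders (i, j) with i != j, penalize
    decoder i's probability map against decoder j's hard pseudo-label PH_j
    with cross-entropy plus Dice loss."""
    total = 0.0
    for i in range(len(probs)):
        for j in range(len(probs)):
            if i == j:
                continue
            p, y = probs[i], hard_labels[j].astype(float)
            ce = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
            dice = 1.0 - (2.0 * np.sum(p * y) + eps) / (np.sum(p) + np.sum(y) + eps)
            total += ce + dice
    return total
```

<p>With n = 3 decoders, this accumulates losses over the six ordered pairs, so each decoder is supervised by the pseudo-labels of the other two.</p>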
<p>We treat the hard labels obtained from each decoder as the ground-truth labels for the outputs of the other decoders and compute the multi-consistency loss between them. Inspired by the findings in (<xref ref-type="bibr" rid="B32">32</xref>), we weight the supervised and unsupervised losses with hyperparameters to achieve improved experimental results. The overall loss is given by <xref ref-type="disp-formula" rid="eq5">Equation (5)</xref>:</p>
<disp-formula id="eq5">
<label>(5)</label>
<mml:math display="block" id="M5">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>s</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <inline-formula>
<mml:math display="inline" id="im27">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> represents the weight of the supervised loss, and <inline-formula>
<mml:math display="inline" id="im28">
<mml:mi>&#x3b2;</mml:mi>
</mml:math>
</inline-formula> represents the weight of the unsupervised loss. Due to the lack of true labels for unlabeled data, pseudo-labels generated by the Soft-Hard process might initially lead the model in the wrong direction during training. Therefore, this paper adopts the method from (<xref ref-type="bibr" rid="B33">33</xref>) by adding a time-varying Gaussian weighting function, denoted as <inline-formula>
<mml:math display="inline" id="im29">
<mml:mrow>
<mml:mi>&#x3b2;</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.001</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>exp</mml:mi>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, to the unsupervised loss. This aims to balance the supervised loss and the consistency loss. Here, <inline-formula>
<mml:math display="inline" id="im30">
<mml:mi>t</mml:mi>
</mml:math>
</inline-formula> denotes the current iteration count, and <inline-formula>
<mml:math display="inline" id="im31">
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the maximum number of iterations.</p>
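<p>The time-varying weight can be sketched as follows (a direct transcription of the Gaussian ramp-up formula above; the function name is ours):</p>

```python
import math

def consistency_weight(t, t_max):
    """Gaussian ramp-up beta = 0.001 * exp(-5 * (1 - t/t_max)^2):
    the unsupervised loss weight grows from a small value early in training
    (when pseudo-labels are unreliable) to 0.001 at t = t_max."""
    return 0.001 * math.exp(-5.0 * (1.0 - t / t_max) ** 2)
```

<p>Early in training the weight is roughly 0.001&#xb7;e<sup>&#x2212;5</sup>, so the supervised loss dominates until the pseudo-labels become trustworthy.</p>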
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Experimental setup</title>
<sec id="s4_1">
<label>4.1</label>
<title>Dataset</title>
<p>In this study, we utilized the cholangiocarcinoma micro-hyperspectral images from the multi-dimensional biliary tract database collected in (<xref ref-type="bibr" rid="B34">34</xref>). This dataset comprises 880 scenes from 174 individuals, with 689 scenes containing partially cancerous regions, 49 scenes representing complete cancerous regions, and 142 scenes devoid of any cancerous regions. The spatial resolution of these images is <inline-formula>
<mml:math display="inline" id="im32">
<mml:mrow>
<mml:mn>1024</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>1280</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> pixels, and each image comprises 60 bands uniformly distributed from 550 nm to 1000 nm. Although the database provides pixel-level labels for each image, experimentation showed that these labels were somewhat coarse and failed to meet the accuracy requirements of semantic segmentation tasks. <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3A</bold>
</xref> shows a false-color image synthesized from the 5th, 15th, and 25th bands of the input data, while <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3B</bold>
</xref> depicts the original labels provided by the database. Two problems can be noted: first, the labels are disconnected at the opening of the circular structure, whereas in reality they should be continuous; second, the boundaries between the tumor region and the normal region are too sharp and abrupt, failing to capture the true boundary information. Therefore, experienced researchers re-annotated the dataset, yielding 94 re-annotated images. The annotated results are shown in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3C</bold>
</xref>, demonstrating improved continuity and accuracy compared to the original labels.</p>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>
<bold>(A)</bold> False color images synthesized using bands 5,15, and 25; <bold>(B)</bold> The original label provided in (<xref ref-type="bibr" rid="B34">34</xref>); <bold>(C)</bold> Our re-labeling.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g003.tif"/>
</fig>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Evaluation metrics</title>
<p>We employed four evaluation metrics to assess semi-supervised segmentation performance on histopathological images: overall accuracy (OA), average accuracy (AA), the Dice coefficient, and mean intersection over union (MIoU). Higher values of OA, AA, Dice, and MIoU indicate better segmentation performance.</p>
<p>OA and AA measure the proportion of correctly classified pixels. We define the tumor region as positive samples and the non-cancerous (i.e., normal) region as negative samples. TP, TN, FP, and FN denote true positive, true negative, false positive, and false negative pixel counts, respectively. OA and AA are then defined by <xref ref-type="disp-formula" rid="eq6">Equations (6)</xref> and <xref ref-type="disp-formula" rid="eq7">(7)</xref>:</p>
<disp-formula id="eq6">
<label>(6)</label>
<mml:math display="block" id="M6">
<mml:mrow>
<mml:mi>O</mml:mi>
<mml:mi>A</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="eq7">
<label>(7)</label>
<mml:math display="block" id="M7">
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mi>A</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>MIoU measures the ratio of intersection to union between the predicted and ground-truth pixel sets, averaged over the two classes, as in <xref ref-type="disp-formula" rid="eq8">Equation (8)</xref>. It quantifies how well the predicted segmentation aligns with the ground-truth segmentation.</p>
<disp-formula id="eq8">
<label>(8)</label>
<mml:math display="block" id="M8">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>The Dice coefficient quantifies the region overlap between the predicted segmentation (P) and the ground-truth labels (Y), as in <xref ref-type="disp-formula" rid="eq9">Equation (9)</xref>. It is widely used in image segmentation tasks to evaluate segmentation accuracy.</p>
<disp-formula id="eq9">
<label>(9)</label>
<mml:math display="block" id="M9">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#xb7;</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x2229;</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mtext>P</mml:mtext>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mtext>Y</mml:mtext>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
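<p>Equations (6)&#x2013;(9) can all be computed from the four confusion counts; a minimal NumPy sketch for binary masks follows (the function name is ours, and we assume non-degenerate masks so no denominator is zero):</p>

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Compute OA, AA, MIoU, and Dice for binary masks (1 = tumor)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # true positive pixels
    tn = np.sum(~pred & ~gt)  # true negative pixels
    fp = np.sum(pred & ~gt)   # false positive pixels
    fn = np.sum(~pred & gt)   # false negative pixels
    oa = (tp + tn) / (tp + tn + fp + fn)                        # Equation (6)
    aa = 0.5 * (tp / (tp + fn) + tn / (fp + tn))                # Equation (7)
    miou = 0.5 * (tp / (tp + fp + fn) + tn / (fp + fn + tn))    # Equation (8)
    dice = 2 * tp / (pred.sum() + gt.sum())                     # Equation (9)
    return oa, aa, miou, dice
```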
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Implementation detail</title>
<p>In this study, we implemented the model with PyTorch 1.13.1, CUDA 11.7, and Python 3.8. Training and testing were performed on an NVIDIA GeForce RTX 3090. The batch size was set to 2, comprising 1 labeled and 1 unlabeled sample. In the data preprocessing stage, the input data were reduced to 6 channels using PCA. We trained the entire network for 100 epochs with the SGD optimizer, using a learning rate of 0.01 and momentum of 0.9. Gaussian noise with mean 1 and standard deviation 1.2 was added for one auxiliary decoder, and Gaussian noise with mean 1 and standard deviation 1.5 for the other.</p>
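<p>The PCA preprocessing step can be sketched as follows (an SVD-based illustration of reducing a hyperspectral cube to 6 channels; the paper does not specify which PCA implementation it uses):</p>

```python
import numpy as np

def pca_reduce(cube, n_components=6):
    """Reduce a hyperspectral cube of shape (H, W, B) to n_components
    channels via PCA over the spectral axis."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)  # one spectrum per pixel
    flat -= flat.mean(axis=0)                      # center each band
    # principal spectral directions from the SVD of the centered data
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[:n_components].T).reshape(h, w, n_components)
```

<p>For the dataset above, this would map each 1024&#xd7;1280&#xd7;60 cube to a 1024&#xd7;1280&#xd7;6 input, retaining the spectral directions of highest variance.</p>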
</sec>
</sec>
<sec id="s5" sec-type="results">
<label>5</label>
<title>Result</title>
<sec id="s5_1">
<label>5.1</label>
<title>Comparative experiments</title>
<p>To demonstrate the effectiveness of our proposed semi-supervised method, we conducted comparative experiments on the Multidimensional Cholangiocarcinoma Dataset. We compared our method with six other approaches: (1) 2D U-Net (<xref ref-type="bibr" rid="B35">35</xref>); (2) Mean Teacher (MT) (<xref ref-type="bibr" rid="B23">23</xref>); (3) Uncertainty-Aware Mean Teacher (UA-MT) (<xref ref-type="bibr" rid="B33">33</xref>); (4) Cross Consistency Training (CCT) (<xref ref-type="bibr" rid="B36">36</xref>); (5) Cross Pseudo Supervision (CPS) (<xref ref-type="bibr" rid="B14">14</xref>); and (6) Uncertainty-aware Pseudo-label and Consistency (UPC) (<xref ref-type="bibr" rid="B37">37</xref>). Here, the 2D U-Net is trained in a fully supervised manner using only the limited set of labeled samples, while our method and the other five approaches use semi-supervised learning with a proportion of labeled data and a large amount of unlabeled data.</p>
<p>
<xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref> presents quantitative results obtained using various semi-supervised models with different labeling ratios. From <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>, it can be observed that our proposed method outperforms the fully supervised approach under different labeling ratios. In particular, with 20% labeled data, our method improves OA by 1.23%, AA by 0.77%, Dice by 2.04%, and MIoU by 1.97% over the fully supervised 2D U-Net. This indicates that our method exploits the information embedded in the unlabeled data more effectively than the fully supervised approach. Compared with the other semi-supervised methods, our method achieves comparable or better results, suggesting that the proposed approach is well suited to histopathological microscopic hyperspectral images. <xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref> displays the predicted results of the fully supervised and semi-supervised methods, including our proposed method, using different labeling ratios. It can be observed from the figure that our method&#x2019;s predicted results are closer to the ground truth labels.</p>
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Quantitative experimental results of different methods on multi-dimensional common bile duct dataset.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Labeled data</th>
<th valign="middle" align="center">Method</th>
<th valign="middle" align="center">OA<inline-formula>
<mml:math display="inline" id="im33">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">AA<inline-formula>
<mml:math display="inline" id="im34">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">Dice<inline-formula>
<mml:math display="inline" id="im35">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">MIoU<inline-formula>
<mml:math display="inline" id="im36">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="center">6/66(10%)</td>
<td valign="middle" align="center">MT (<xref ref-type="bibr" rid="B23">23</xref>)</td>
<td valign="middle" align="center">88.69%</td>
<td valign="middle" align="center">82.34%</td>
<td valign="middle" align="center">66.00%</td>
<td valign="middle" align="center">69.24%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">UA-MT (<xref ref-type="bibr" rid="B33">33</xref>)</td>
<td valign="middle" align="center">88.62%</td>
<td valign="middle" align="center">82.93%</td>
<td valign="middle" align="center">66.59%</td>
<td valign="middle" align="center">69.35%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">CCT (<xref ref-type="bibr" rid="B36">36</xref>)</td>
<td valign="middle" align="center">87.83%</td>
<td valign="middle" align="center">82.69%</td>
<td valign="middle" align="center">65.24%</td>
<td valign="middle" align="center">68.21%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">CPS (<xref ref-type="bibr" rid="B14">14</xref>)</td>
<td valign="middle" align="center">88.72%</td>
<td valign="middle" align="center">82.83%</td>
<td valign="middle" align="center">66.38%</td>
<td valign="middle" align="center">69.53%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">UPC (<xref ref-type="bibr" rid="B37">37</xref>)</td>
<td valign="middle" align="center">88.50%</td>
<td valign="middle" align="center">83.32%</td>
<td valign="middle" align="center">66.81%</td>
<td valign="middle" align="center">69.65%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">2D-Unet (<xref ref-type="bibr" rid="B35">35</xref>)</td>
<td valign="middle" align="center">88.56%</td>
<td valign="middle" align="center">82.20%</td>
<td valign="middle" align="center">65.24%</td>
<td valign="middle" align="center">68.77%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Ours</td>
<td valign="middle" align="center">88.81%</td>
<td valign="middle" align="center">82.76%</td>
<td valign="middle" align="center">65.50%</td>
<td valign="middle" align="center">68.95%</td>
</tr>
<tr>
<td valign="top" align="center">13/66(20%)</td>
<td valign="middle" align="center">MT (<xref ref-type="bibr" rid="B23">23</xref>)</td>
<td valign="bottom" align="center">89.84%</td>
<td valign="bottom" align="center">
<bold>86.50%</bold>
</td>
<td valign="bottom" align="center">70.61%</td>
<td valign="bottom" align="center">72.45%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">UA-MT (<xref ref-type="bibr" rid="B33">33</xref>)</td>
<td valign="bottom" align="center">90.17%</td>
<td valign="bottom" align="center">86.05%</td>
<td valign="bottom" align="center">70.77%</td>
<td valign="bottom" align="center">72.67%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">CCT (<xref ref-type="bibr" rid="B36">36</xref>)</td>
<td valign="bottom" align="center">89.74%</td>
<td valign="bottom" align="center">85.96%</td>
<td valign="bottom" align="center">71.05%</td>
<td valign="bottom" align="center">72.79%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">CPS (<xref ref-type="bibr" rid="B14">14</xref>)</td>
<td valign="bottom" align="center">90.19%</td>
<td valign="bottom" align="center">86.03%</td>
<td valign="bottom" align="center">70.88%</td>
<td valign="bottom" align="center">72.59%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">UPC (<xref ref-type="bibr" rid="B37">37</xref>)</td>
<td valign="bottom" align="center">90.32%</td>
<td valign="bottom" align="center">85.11%</td>
<td valign="bottom" align="center">70.40%</td>
<td valign="bottom" align="center">72.53%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">2D-Unet (<xref ref-type="bibr" rid="B35">35</xref>)</td>
<td valign="middle" align="center">89.25%</td>
<td valign="middle" align="center">85.14%</td>
<td valign="middle" align="center">69.15%</td>
<td valign="middle" align="center">71.00%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Ours</td>
<td valign="bottom" align="center">
<bold>90.48%</bold>
</td>
<td valign="bottom" align="center">85.91%</td>
<td valign="bottom" align="center">
<bold>71.19%</bold>
</td>
<td valign="bottom" align="center">
<bold>72.97%</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold text represent the optimal result.</p>
</fn>
<fn>
<p>The symbol '&#x2191;' signifies that a higher metric corresponds to better segmentation performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>Visualizations of different methods.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g004.tif"/>
</fig>
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Ablation study</title>
<p>In order to validate the effectiveness of the proposed method, ablation experiments were conducted on the multi-dimensional bile duct dataset. We removed the &#x201c;multi-consistency&#x201d; learning method and the Soft-Hard pseudo-label generation method, constructing a &#x201c;basic&#x201d; model that uses a shared encoder and three independent decoders. We then separately added the multi-consistency learning strategy and the Soft-Hard pseudo-label generation method, referred to as &#x201c;basic+mcl&#x201d; and &#x201c;basic+s-h&#x201d; respectively. Next, we incorporated both of these methods into the &#x201c;basic&#x201d; model to obtain our final model, MCL-Net. The quantitative analysis results of the ablation study are presented in <xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>.</p>
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>Ablation results on multidimensional common bile duct dataset.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Labeled data</th>
<th valign="middle" align="center">Method</th>
<th valign="middle" align="center">Dice<inline-formula>
<mml:math display="inline" id="im37">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">MIoU<inline-formula>
<mml:math display="inline" id="im38">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="center">6/66(10%)</td>
<td valign="middle" align="center">basic</td>
<td valign="middle" align="center">69.43%</td>
<td valign="middle" align="center">71.12%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">basic+mcl</td>
<td valign="middle" align="center">70.50%</td>
<td valign="middle" align="center">72.23%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">basic+s-h</td>
<td valign="middle" align="center">70.69%</td>
<td valign="middle" align="center">72.01%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">ours</td>
<td valign="bottom" align="center">71.19%</td>
<td valign="bottom" align="center">72.97%</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The symbol '&#x2191;' signifies that a higher metric corresponds to better segmentation performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>From <xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>, it can be observed that the &#x201c;basic&#x201d; model yields little improvement over the fully supervised approach, indicating that a multi-decoder structure alone does not significantly enhance performance. When the proposed multi-consistency learning strategy was added to the basic model, the Dice coefficient and MIoU improved by 1.07% and 1.11% respectively, demonstrating that the strategy effectively integrates the features extracted by different decoders and mines additional information from the unlabeled data.</p>
<p>When we instead added only the Soft-Hard method, the two evaluation metrics improved from 69.43% and 71.12% to 70.69% and 72.01%, suggesting that the proposed pseudo-label generation strategy brings the pseudo-labels closer to the real labels. When both proposed methods were applied together, performance improved further, indicating that the multi-consistency and Soft-Hard methods act synergistically.</p>
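The Soft-Hard pseudo-label step evaluated above can be sketched in a few lines. The paper's exact formulation is not reproduced in this section, so the sketch below assumes a minimal reading: decoder predictions are averaged, sharpened into a soft pseudo-label, and then converted to a hard pseudo-label by argmax. The function name and the temperature default are illustrative.

```python
import numpy as np

def soft_hard_pseudo_labels(probs, T=0.5):
    """Sketch of a Soft-Hard pseudo-label step (details assumed).

    probs: array of shape (n_decoders, n_classes, ...) holding the softmax
    outputs of each decoder for one unlabeled sample.
    Soft step: average the decoder predictions and sharpen with temperature T.
    Hard step: take the argmax of the sharpened soft label.
    """
    mean = probs.mean(axis=0)                      # ensemble of decoders
    soft = np.power(mean, 1.0 / T)
    soft = soft / soft.sum(axis=0, keepdims=True)  # sharpened soft label
    hard = soft.argmax(axis=0)                     # hard label per pixel
    return soft, hard
```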
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Influence of different data perturbation methods</title>
<p>To increase the diversity of the inputs seen by different decoders, we perturbed the inputs of the two auxiliary decoders; the main decoder was left unperturbed. We used three perturbation methods: adding Gaussian noise (Add-gn), adding salt-and-pepper noise (Add-spn), and adding Poisson noise (Add-pn). We experimented both with applying the same noise to both auxiliary decoders and with applying different noises to different decoder inputs: Gaussian plus salt-and-pepper noise (Add-gspn), Gaussian plus Poisson noise (Add-gpn), and salt-and-pepper plus Poisson noise (Add-sppn). <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> records the quantitative results for the six methods above. For each noise-adding method we conducted numerous experiments, and the results in <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> represent the optimal settings for each method. From <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref>, it can be observed that the best results are achieved when Gaussian noise is added to both auxiliary decoders. This may be because Gaussian noise better approximates the noise patterns actually present in the images.</p>
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Influence of different data perturbation methods.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Labeled data</th>
<th valign="middle" align="center">Method</th>
<th valign="middle" align="center">OA<inline-formula>
<mml:math display="inline" id="im39">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">AA<inline-formula>
<mml:math display="inline" id="im40">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">Dice<inline-formula>
<mml:math display="inline" id="im41">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
<th valign="middle" align="center">MIoU<inline-formula>
<mml:math display="inline" id="im42">
<mml:mo>&#x2191;</mml:mo>
</mml:math>
</inline-formula>
</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="center">6/66(10%)</td>
<td valign="middle" align="center">
<bold>Add-gn</bold>
</td>
<td valign="bottom" align="center">
<bold>90.48%</bold>
</td>
<td valign="bottom" align="center">
<bold>85.91%</bold>
</td>
<td valign="bottom" align="center">
<bold>71.19%</bold>
</td>
<td valign="bottom" align="center">
<bold>72.97%</bold>
</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Add-spn</td>
<td valign="middle" align="center">89.60%</td>
<td valign="middle" align="center">84.28%</td>
<td valign="middle" align="center">69.24%</td>
<td valign="middle" align="center">71.51%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Add-pn</td>
<td valign="middle" align="center">88.77%</td>
<td valign="middle" align="center">84.07%</td>
<td valign="middle" align="center">70.61%</td>
<td valign="middle" align="center">71.94%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Add-gpn</td>
<td valign="middle" align="center">90.45%</td>
<td valign="middle" align="center">84.50%</td>
<td valign="middle" align="center">70.81%</td>
<td valign="middle" align="center">72.85%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Add-gspn</td>
<td valign="middle" align="center">89.59%</td>
<td valign="middle" align="center">84.36%</td>
<td valign="middle" align="center">69.29%</td>
<td valign="middle" align="center">71.54%</td>
</tr>
<tr>
<td valign="top" align="center"/>
<td valign="middle" align="center">Add-sppn</td>
<td valign="bottom" align="center">89.72%</td>
<td valign="bottom" align="center">84.85%</td>
<td valign="bottom" align="center">69.23%</td>
<td valign="bottom" align="center">71.40%</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values indicate the optimal result.</p>
</fn>
<fn>
<p>The symbol '&#x2191;' signifies that a higher metric corresponds to better segmentation performance.</p>
</fn>
</table-wrap-foot>
</table-wrap>
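The three perturbations compared in Table 3 can be sketched as follows. The noise magnitudes (`std`, `amount`, `scale`) are illustrative assumptions, since the section does not state the values used in the experiments.

```python
import numpy as np

def add_gaussian_noise(img, std=0.01, rng=None):
    """Additive zero-mean Gaussian noise (Add-gn); std is an assumed value."""
    rng = rng or np.random.default_rng(0)
    return np.clip(img + rng.normal(0.0, std, img.shape), 0.0, 1.0)

def add_salt_pepper_noise(img, amount=0.01, rng=None):
    """Salt-and-pepper noise (Add-spn): a fraction of pixels set to 0 or 1."""
    rng = rng or np.random.default_rng(0)
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0.0           # pepper
    out[mask > 1.0 - amount / 2] = 1.0     # salt
    return out

def add_poisson_noise(img, scale=255.0, rng=None):
    """Poisson (shot) noise (Add-pn), applied at an assumed intensity scale."""
    rng = rng or np.random.default_rng(0)
    return np.clip(rng.poisson(img * scale) / scale, 0.0, 1.0)
```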
</sec>
<sec id="s5_4">
<label>5.4</label>
<title>Impact of different decoder numbers</title>
<p>In order to obtain more comprehensive sample features, we designed a multi-decoder structure. To better understand the influence of the number of decoders <inline-formula>
<mml:math display="inline" id="im43">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> on the experimental results, we set <inline-formula>
<mml:math display="inline" id="im44">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> to different values. When <inline-formula>
<mml:math display="inline" id="im45">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, the model corresponds to the classic 2D U-Net model. Since it is not possible to apply the proposed multi-consistency loss and Soft-Hard method in this case, the model operates in a fully supervised manner. When <inline-formula>
<mml:math display="inline" id="im46">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, Gaussian noise was applied to the inputs of the auxiliary decoders, as described in Section 5.3.</p>
<p>
<xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5</bold>
</xref> illustrates the influence of the number of decoders on the Dice similarity coefficient under different proportions of labeled data. It can be observed from the figure that as <inline-formula>
<mml:math display="inline" id="im47">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> increases from 1 to 2, the Dice coefficient improves significantly, indicating that the multi-decoder structure and the multi-consistency learning strategy allow the model to learn sample features more comprehensively. As <inline-formula>
<mml:math display="inline" id="im48">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> further increases from 2 to 3, the Dice coefficient continues to improve, suggesting that appropriately increasing the number of decoders allows for more effective utilization of information from unlabeled data. However, when <inline-formula>
<mml:math display="inline" id="im49">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> is increased to 4, the Dice coefficient experiences only a slight increase. Therefore, to balance accuracy and efficiency in the experiments, we set <inline-formula>
<mml:math display="inline" id="im50">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
<fig id="f5" position="float">
<label>Figure&#xa0;5</label>
<caption>
<p>Impact of different decoder numbers on experimental results.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g005.tif"/>
</fig>
</sec>
<sec id="s5_5">
<label>5.5</label>
<title>Impact of temperature <inline-formula>
<mml:math display="inline" id="im51">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>
</title>
<p>In <xref ref-type="disp-formula" rid="eq2">Equation 2</xref>, we use a sharpening function to generate preliminary pseudo-labels. <xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6</bold>
</xref> presents the Dice coefficients obtained by training our MCL-Net model on the multi-dimensional bile duct dataset with different temperature values <inline-formula>
<mml:math display="inline" id="im52">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>. Following the guidance from (<xref ref-type="bibr" rid="B29">29</xref>), we experimented with different values of <inline-formula>
<mml:math display="inline" id="im53">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>, specifically setting <inline-formula>
<mml:math display="inline" id="im54">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula> to 0.01, 0.1, 0.5, and 1.</p>
<fig id="f6" position="float">
<label>Figure&#xa0;6</label>
<caption>
<p>Impact of temperature coefficient T on experimental results.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g006.tif"/>
</fig>
<p>From <xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6</bold>
</xref>, it can be observed that when <inline-formula>
<mml:math display="inline" id="im55">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.5</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, the model achieves the optimal experimental results. When <inline-formula>
<mml:math display="inline" id="im56">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula> is too large, the model may fail to generate reasonable soft pseudo-labels due to the inability to utilize entropy minimization. On the other hand, when <inline-formula>
<mml:math display="inline" id="im57">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula> is too small, it may introduce noise into the pseudo-labels, leading to prediction errors. Therefore, this study ultimately selects a temperature coefficient of <inline-formula>
<mml:math display="inline" id="im58">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.5</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
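A common form of the sharpening function used to generate preliminary pseudo-labels is temperature scaling of the class probabilities. Since Equation 2 itself lies outside this section, the form below (standard in MixMatch-style semi-supervised methods) is an assumption.

```python
import numpy as np

def sharpen(p, T=0.5):
    """Temperature sharpening: p_i^(1/T), renormalised over classes (axis 0).

    T -> 0 approaches a one-hot (hard) label; T = 1 leaves p unchanged.
    """
    q = np.power(p, 1.0 / T)
    return q / q.sum(axis=0, keepdims=True)
```

With the selected T = 0.5, the dominant class is amplified (entropy is reduced) without collapsing the distribution to a hard one-hot label.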
</sec>
<sec id="s5_6">
<label>5.6</label>
<title>Impact of supervised loss weight <inline-formula>
<mml:math display="inline" id="im59">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula>
</title>
<p>We further investigated the influence of the weight of the supervised loss term <inline-formula>
<mml:math display="inline" id="im60">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> in the loss function. In <xref ref-type="disp-formula" rid="eq5">Equation 5</xref>, the weight of the unsupervised loss is set according to a Gaussian warm-up function, while <inline-formula>
<mml:math display="inline" id="im61">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> affects the balance between the two types of losses. <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7</bold>
</xref> illustrates how different weights impact the experimental performance.</p>
<fig id="f7" position="float">
<label>Figure&#xa0;7</label>
<caption>
<p>Impact of supervision loss weight on experimental results.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-14-1396887-g007.tif"/>
</fig>
<p>When <inline-formula>
<mml:math display="inline" id="im62">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> is too high, the model tends to focus more on extracting features from the labeled data, but this comes at the cost of neglecting the unlabeled data, as the proposed multi-consistency loss may not be effectively utilized. On the other hand, when <inline-formula>
<mml:math display="inline" id="im63">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> is too low, the accurately labeled data is not effectively leveraged, resulting in poorer experimental results. Therefore, we finally chose <inline-formula>
<mml:math display="inline" id="im64">
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.5</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> as the final experimental setting.</p>
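The weighting scheme discussed above can be sketched as follows. The Gaussian warm-up is the standard ramp-up from consistency-training methods; its exact parameters in Equation 5 are outside this section, so the form below is an assumption, with λ = 0.5 as selected in the experiments.

```python
import numpy as np

def unsup_weight(step, warmup_steps, w_max=1.0):
    """Gaussian warm-up: ramps the unsupervised weight from ~0 up to w_max."""
    t = np.clip(step / warmup_steps, 0.0, 1.0)
    return w_max * np.exp(-5.0 * (1.0 - t) ** 2)

def total_loss(l_sup, l_cons, step, warmup_steps, lam=0.5):
    """L = lam * L_sup + w(step) * L_cons (assumed form of the combined loss)."""
    return lam * l_sup + unsup_weight(step, warmup_steps) * l_cons
```

Early in training the consistency term contributes almost nothing, so noisy pseudo-labels cannot dominate; by the end of the warm-up the two terms are balanced by λ alone.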
</sec>
</sec>
<sec id="s6" sec-type="conclusions">
<label>6</label>
<title>Conclusion and future work</title>
<p>In this paper, we proposed a novel semi-supervised segmentation model, MCL-Net, for microscopic hyperspectral pathological images. The model combines consistency regularization and pseudo-labeling methods. MCL-Net employs a shared encoder and multiple independent decoders. Through the proposed Soft-Hard pseudo-labeling strategy, MCL-Net generates pseudo-labels that are closer to the real labels for pathological images. Additionally, we introduced a multi-consistency learning strategy, treating the pseudo-labels generated by the Soft-Hard process as real labels. This encourages consistency among the predictions of different decoders, enabling the model to learn more sample features.</p>
<p>The effectiveness of this approach was demonstrated through extensive experiments, providing a new perspective on the segmentation of microscopic hyperspectral pathological images. Despite the promising results, there are limitations: when using only 10% labeled data, our method did not yield a large performance improvement. This might be attributed to the limited explicit use of the spectral information that is unique to microscopic hyperspectral imaging. In future work, we will further explore ways to exploit spectral information and consider both labeled and unlabeled samples from multiple angles.</p>
</sec>
<sec id="s7" sec-type="data-availability">
<title>Data availability statement</title>
<p>The datasets presented in this article are not readily available because they are restricted to use by this research team. Requests to access the datasets should be directed to JF, <email xlink:href="mailto:2106020108@hhu.edu.cn">2106020108@hhu.edu.cn</email>.</p>
</sec>
<sec id="s8" sec-type="ethics-statement">
<title>Ethics statement</title>
<p>Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was not required from the participants or the participants' legal guardians/next of kin in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec id="s9" sec-type="author-contributions">
<title>Author contributions</title>
<p>JF: Data curation, Formal Analysis, Investigation, Visualization, Writing &#x2013; original draft, Writing &#x2013; review &amp; editing.</p>
</sec>
</body>
<back>
<sec id="s10" sec-type="funding-information">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This research was supported by the National Undergraduate Training Program for Innovation and Entrepreneurship under Grant 202310294055Z.</p>
</sec>
<sec id="s11" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s12" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Li</surname> <given-names>C</given-names>
</name>
<name>
<surname>Luo</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>J</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>Toward source-free cross tissues histopathological cell segmentation via target-specific finetuning</article-title>. <source>IEEE Trans Med Imaging</source>. (<year>2023</year>) <volume>42</volume>:<page-range>2666&#x2013;77</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMI.2023.3263465</pub-id>
</citation>
</ref>
<ref id="B2">
<label>2</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname> <given-names>W</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X</given-names>
</name>
<name>
<surname>Li</surname> <given-names>C</given-names>
</name>
<name>
<surname>Li</surname> <given-names>R</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>T</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>H</given-names>
</name>
<etal/>
</person-group>. <article-title>A state-of-the-art survey of artificial neural networks for whole-slide image analysis: from popular convolutional neural networks to potential visual transformers</article-title>. <source>Comput Biol Med</source>. (<year>2023</year>) <volume>161</volume>:<fpage>107034</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compbiomed.2023.107034</pub-id>
</citation>
</ref>
<ref id="B3">
<label>3</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mehta</surname> <given-names>S</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>X</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>W</given-names>
</name>
<name>
<surname>Weaver</surname> <given-names>D</given-names>
</name>
<name>
<surname>Hajishirzi</surname> <given-names>H</given-names>
</name>
<name>
<surname>Elmore</surname> <given-names>JG</given-names>
</name>
<etal/>
</person-group>. <article-title>End-to-End diagnosis of breast biopsy images with transformers</article-title>. <source>Med Image Anal</source>. (<year>2022</year>) <volume>79</volume>:<fpage>102466</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.media.2022.102466</pub-id>
</citation>
</ref>
<ref id="B4">
<label>4</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gao</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Jia</surname> <given-names>C</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>X</given-names>
</name>
<name>
<surname>Hong</surname> <given-names>B</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Unsupervised representation learning for tissue segmentation in histopathological images: From global to local contrast</article-title>. <source>IEEE Trans Med Imaging</source>. (<year>2022</year>) <volume>41</volume>:<page-range>3611&#x2013;23</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMI.2022.3191398</pub-id>
</citation>
</ref>
<ref id="B5">
<label>5</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>M</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Xing</surname> <given-names>C</given-names>
</name>
</person-group>. <article-title>Deep margin cosine autoencoder-based medical hyperspectral image classification for tumor diagnosis</article-title>. <source>IEEE Trans Instrum Meas</source>. (<year>2023</year>) <volume>72</volume>:<fpage>1</fpage>&#x2013;<lpage>12</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TIM.2023.3293548</pub-id>
</citation>
</ref>
<ref id="B6">
<label>6</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rav&#xec;</surname> <given-names>D</given-names>
</name>
<name>
<surname>Fabelo</surname> <given-names>H</given-names>
</name>
<name>
<surname>Callic</surname> <given-names>GM</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>G-Z</given-names>
</name>
</person-group>. <article-title>Manifold embedding and semantic segmentation for intraoperative guidance with hyperspectral brain imaging</article-title>. <source>IEEE Trans Med Imaging</source>. (<year>2017</year>) <volume>36</volume>:<page-range>1845&#x2013;57</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMI.42</pub-id>
</citation>
</ref>
<ref id="B7">
<label>7</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Muniz</surname> <given-names>FB</given-names>
</name>
<name>
<surname>Baffa</surname> <given-names>MdeFO</given-names>
</name>
<name>
<surname>Garcia</surname> <given-names>SB</given-names>
</name>
<name>
<surname>Bachmann</surname> <given-names>L</given-names>
</name>
<name>
<surname>Felipe</surname> <given-names>JC</given-names>
</name>
</person-group>. <article-title>Histopathological diagnosis of colon cancer using micro-FTIR hyperspectral imaging and deep learning</article-title>. <source>Comput Methods Programs BioMed</source>. (<year>2023</year>) <volume>231</volume>:<fpage>107388</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.cmpb.2023.107388</pub-id>
</citation>
</ref>
<ref id="B8">
<label>8</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>L</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>M</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>M</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>J</given-names>
</name>
<etal/>
</person-group>. <article-title>Identification of melanoma from hyperspectral pathology image using 3D convolutional networks</article-title>. <source>IEEE Trans Med Imaging</source>. (<year>2020</year>) <volume>40</volume>:<page-range>218&#x2013;27</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMI.42</pub-id>
</citation>
</ref>
<ref id="B9">
<label>9</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>K</given-names>
</name>
<name>
<surname>Zhan</surname> <given-names>B</given-names>
</name>
<name>
<surname>Zu</surname> <given-names>C</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>J</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>L</given-names>
</name>
<etal/>
</person-group>. <article-title>Semi-supervised medical image segmentation via a tripled-uncertainty guided mean teacher model with contrastive learning</article-title>. <source>Med Image Anal</source>. (<year>2022</year>) <volume>79</volume>:<fpage>102447</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.media.2022.102447</pub-id>
</citation>
</ref>
<ref id="B10">
<label>10</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Van Engelen</surname> <given-names>JE</given-names>
</name>
<name>
<surname>Hoos</surname> <given-names>HH</given-names>
</name>
</person-group>. <article-title>A survey on semi-supervised learning</article-title>. <source>Mach Learn</source>. (<year>2020</year>) <volume>109</volume>:<fpage>373</fpage>&#x2013;<lpage>440</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s10994-019-05855-6</pub-id>
</citation>
</ref>
<ref id="B11">
<label>11</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname> <given-names>X</given-names>
</name>
<name>
<surname>Qi</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>S</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>X</given-names>
</name>
<name>
<surname>Mao</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>RCPS: rectified contrastive pseudo supervision for semi-supervised medical image segmentation</article-title>. <source>IEEE J BioMed Health Inform</source>. (<year>2024</year>) <volume>28</volume>:<page-range>251&#x2013;61</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/JBHI.2023.3322590</pub-id>
</citation>
</ref>
<ref id="B12">
<label>12</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>C</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Xia</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>B</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>D</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>Dual uncertainty-guided mixing consistency for semi-supervised 3D medical image segmentation</article-title>. <source>IEEE Trans Big Data</source>. (<year>2023</year>) <volume>9</volume>:<page-range>1156&#x2013;70</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TBDATA.2023.3258643</pub-id>
</citation>
</ref>
<ref id="B13">
<label>13</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>K</given-names>
</name>
<name>
<surname>Xue</surname> <given-names>J</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>S</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Semi-supervised medical image segmentation with voxel stability and reliability constraints</article-title>. <source>IEEE J BioMed Health Inform</source>. (<year>2023</year>) <volume>27</volume>:<page-range>3912&#x2013;23</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/JBHI.2023.3273609</pub-id>
</citation>
</ref>
<ref id="B14">
<label>14</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>X</given-names>
</name>
<name>
<surname>Yuan</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zeng</surname> <given-names>G</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>J</given-names>
</name>
</person-group>. (<year>2021</year>). <article-title>Semi-supervised semantic segmentation with cross pseudo supervision</article-title>, in: <conf-name>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</conf-name>, pp. <page-range>2613&#x2013;22</page-range>.</citation>
</ref>
<ref id="B15">
<label>15</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Musulin</surname> <given-names>J</given-names>
</name>
<name>
<surname>&#x160;tifani&#x107;</surname> <given-names>D</given-names>
</name>
<name>
<surname>Zulijani</surname> <given-names>A</given-names>
</name>
<name>
<surname>&#x106;abov</surname> <given-names>T</given-names>
</name>
<name>
<surname>Dekani&#x107;</surname> <given-names>A</given-names>
</name>
<name>
<surname>Car</surname> <given-names>Z</given-names>
</name>
</person-group>. <article-title>An enhanced histopathology analysis: An ai-based system for multiclass grading of oral squamous cell carcinoma and segmenting of epithelial and stromal tissue</article-title>. <source>Cancers (Basel)</source>. (<year>2021</year>) <volume>13</volume>:<fpage>1784</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/cancers13081784</pub-id>
</citation>
</ref>
<ref id="B16">
<label>16</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zidan</surname> <given-names>U</given-names>
</name>
<name>
<surname>Gaber</surname> <given-names>MM</given-names>
</name>
<name>
<surname>Abdelsamea</surname> <given-names>MM</given-names>
</name>
</person-group>. <article-title>SwinCup: Cascaded swin transformer for histopathological structures segmentation in colorectal cancer</article-title>. <source>Expert Syst Appl</source>. (<year>2023</year>) <volume>216</volume>:<fpage>119452</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.eswa.2022.119452</pub-id>
</citation>
</ref>
<ref id="B17">
<label>17</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jayachandran</surname> <given-names>A</given-names>
</name>
<name>
<surname>Ganesh</surname> <given-names>S</given-names>
</name>
<name>
<surname>Kumar</surname> <given-names>SR</given-names>
</name>
</person-group>. <article-title>Multi-stage deep convolutional neural network for histopathological analysis of osteosarcoma</article-title>. <source>Neural Comput Appl</source>. (<year>2023</year>) <volume>35</volume>:<fpage>1</fpage>&#x2013;<lpage>14</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s00521-023-08837-x</pub-id>
</citation>
</ref>
<ref id="B18">
<label>18</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname> <given-names>P</given-names>
</name>
<name>
<surname>He</surname> <given-names>P</given-names>
</name>
<name>
<surname>Tian</surname> <given-names>S</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>M</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>P</given-names>
</name>
<name>
<surname>Xiao</surname> <given-names>H</given-names>
</name>
<etal/>
</person-group>. <article-title>ViT-AMC network with adaptive model fusion and multiobjective optimization for interpretable laryngeal tumor grading from histopathological images</article-title>. <source>IEEE Trans Med Imaging</source>. (<year>2022</year>) <volume>42</volume>:<fpage>15</fpage>&#x2013;<lpage>28</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMI.2022.3202248</pub-id>
</citation>
</ref>
<ref id="B19">
<label>19</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rashmi</surname> <given-names>R</given-names>
</name>
<name>
<surname>Prasad</surname> <given-names>K</given-names>
</name>
<name>
<surname>Udupa</surname> <given-names>CBK</given-names>
</name>
</person-group>. <article-title>Multi-channel Chan-Vese model for unsupervised segmentation of nuclei from breast histopathological images</article-title>. <source>Comput Biol Med</source>. (<year>2021</year>) <volume>136</volume>:<fpage>104651</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compbiomed.2021.104651</pub-id>
</citation>
</ref>
<ref id="B20">
<label>20</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>X</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Li</surname> <given-names>W</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>C</given-names>
</name>
<etal/>
</person-group>. <article-title>FD-net: feature distillation network for oral squamous cell carcinoma lymph node segmentation in hyperspectral imagery</article-title>. <source>IEEE J BioMed Health Inform</source>. (<year>2024</year>) <volume>28</volume>:<fpage>1552</fpage>&#x2013;<lpage>1563</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/JBHI.2024.3350245</pub-id>
</citation>
</ref>
<ref id="B21">
<label>21</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gao</surname> <given-names>H</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>H</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L</given-names>
</name>
<name>
<surname>Cao</surname> <given-names>X</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>M</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>P</given-names>
</name>
</person-group>. <article-title>Semi-supervised segmentation of hyperspectral pathological imagery based on shape priors and contrastive learning</article-title>. <source>BioMed Signal Process Control</source>. (<year>2024</year>) <volume>91</volume>:<fpage>105881</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.bspc.2023.105881</pub-id>
</citation>
</ref>
<ref id="B22">
<label>22</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Laine</surname> <given-names>S</given-names>
</name>
<name>
<surname>Aila</surname> <given-names>T</given-names>
</name>
</person-group>. <article-title>Temporal ensembling for semi-supervised learning</article-title>. <source>arXiv preprint</source>. (<year>2016</year>) arXiv:<fpage>1610.02242</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.1610.02242</pub-id>
</citation>
</ref>
<ref id="B23">
<label>23</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tarvainen</surname> <given-names>A</given-names>
</name>
<name>
<surname>Valpola</surname> <given-names>H</given-names>
</name>
</person-group>. <article-title>Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results</article-title>. <source>Adv Neural Inf Process Syst</source>. (<year>2017</year>) <volume>30</volume>.</citation>
</ref>
<ref id="B24">
<label>24</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Li</surname> <given-names>H</given-names>
</name>
<name>
<surname>Xiao</surname> <given-names>B</given-names>
</name>
<name>
<surname>Bi</surname> <given-names>X</given-names>
</name>
<name>
<surname>Li</surname> <given-names>W</given-names>
</name>
</person-group>. <article-title>Cross-mix monitoring for medical image segmentation with limited supervision</article-title>. <source>IEEE Trans Multimedia</source>. (<year>2022</year>) <volume>25</volume>:<fpage>1700</fpage>&#x2013;<lpage>1712</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMM.2022.3154159</pub-id>
</citation>
</ref>
<ref id="B25">
<label>25</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>D</given-names>
</name>
<name>
<surname>Luo</surname> <given-names>X</given-names>
</name>
<name>
<surname>Yan</surname> <given-names>J</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>Ambiguity-selective consistency regularization for mean-teacher semi-supervised medical image segmentation</article-title>. <source>Med Image Anal</source>. (<year>2023</year>) <volume>88</volume>:<fpage>102880</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.media.2023.102880</pub-id>
</citation>
</ref>
<ref id="B26">
<label>26</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Jiao</surname> <given-names>R</given-names>
</name>
<name>
<surname>Liao</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Li</surname> <given-names>D</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Uncertainty-guided mutual consistency learning for semi-supervised medical image segmentation</article-title>. <source>Artif Intell Med</source>. (<year>2023</year>) <volume>138</volume>:<fpage>102476</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.artmed.2022.102476</pub-id>
</citation>
</ref>
<ref id="B27">
<label>27</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>H</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Cheng</surname> <given-names>K-T</given-names>
</name>
</person-group>. <article-title>Compete to win: enhancing pseudo labels for barely-supervised medical image segmentation</article-title>. <source>IEEE Trans Med Imaging</source>. (<year>2023</year>) <volume>42</volume>:<fpage>3244</fpage>&#x2013;<lpage>3255</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMI.2023.3279110</pub-id>
</citation>
</ref>
<ref id="B28">
<label>28</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>M</given-names>
</name>
<name>
<surname>Ge</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Cai</surname> <given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>L</given-names>
</name>
</person-group>. (<year>2021</year>). <article-title>Semi-supervised left atrium segmentation with mutual consistency training</article-title>, in: <conf-name>Medical Image Computing and Computer Assisted Intervention&#x2013;MICCAI 2021: 24th International Conference</conf-name>, <conf-loc>Strasbourg, France</conf-loc>, <conf-date>September 27&#x2013;October 1, 2021</conf-date>. pp. <fpage>297</fpage>&#x2013;<lpage>306</lpage>.</citation>
</ref>
<ref id="B29">
<label>29</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Ge</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>D</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>M</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>L</given-names>
</name>
<name>
<surname>Xia</surname> <given-names>Y</given-names>
</name>
<etal/>
</person-group>. <article-title>Mutual consistency learning for semi-supervised medical image segmentation</article-title>. <source>Med Image Anal</source>. (<year>2022</year>) <volume>81</volume>:<fpage>102530</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.media.2022.102530</pub-id>
</citation>
</ref>
<ref id="B30">
<label>30</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chaitanya</surname> <given-names>K</given-names>
</name>
<name>
<surname>Erdil</surname> <given-names>E</given-names>
</name>
<name>
<surname>Karani</surname> <given-names>N</given-names>
</name>
<name>
<surname>Konukoglu</surname> <given-names>E</given-names>
</name>
</person-group>. <article-title>Local contrastive loss with pseudo-label based self-training for semi-supervised medical image segmentation</article-title>. <source>Med Image Anal</source>. (<year>2023</year>) <volume>87</volume>:<fpage>102792</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.media.2023.102792</pub-id>
</citation>
</ref>
<ref id="B31">
<label>31</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Hou</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>H</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>R</given-names>
</name>
<name>
<surname>Shen</surname> <given-names>H</given-names>
</name>
</person-group>. <article-title>FDCT: Fusion-Guided Dual-View Consistency Training for semi-supervised tissue segmentation on MRI</article-title>. <source>Comput Biol Med</source>. (<year>2023</year>) <volume>160</volume>:<fpage>106908</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compbiomed.2023.106908</pub-id>
</citation>
</ref>
<ref id="B32">
<label>32</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>X</given-names>
</name>
<name>
<surname>He</surname> <given-names>M</given-names>
</name>
<name>
<surname>Li</surname> <given-names>H</given-names>
</name>
<name>
<surname>Shen</surname> <given-names>H</given-names>
</name>
</person-group>. <article-title>A combined loss-based multiscale fully convolutional network for high-resolution remote sensing image change detection</article-title>. <source>IEEE Geosci Remote Sens Lett</source>. (<year>2022</year>) <volume>19</volume>:<fpage>1</fpage>&#x2013;<lpage>5</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/LGRS.2021.3098774</pub-id>
</citation>
</ref>
<ref id="B33">
<label>33</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yu</surname> <given-names>L</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>S</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X</given-names>
</name>
<name>
<surname>Fu</surname> <given-names>C-W</given-names>
</name>
<name>
<surname>Heng</surname> <given-names>P-A</given-names>
</name>
</person-group>. (<year>2019</year>). <article-title>Uncertainty-aware self-ensembling model for semi-supervised 3D left atrium segmentation</article-title>, in: <conf-name>Medical Image Computing and Computer Assisted Intervention&#x2013;MICCAI 2019: 22nd International Conference</conf-name>, <conf-loc>Shenzhen, China</conf-loc>, <conf-date>October 13&#x2013;17, 2019</conf-date>. pp. <fpage>605</fpage>&#x2013;<lpage>613</lpage>.</citation>
</ref>
<ref id="B34">
<label>34</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Q</given-names>
</name>
<name>
<surname>Yu</surname> <given-names>G</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>L</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>M</given-names>
</name>
<name>
<surname>Chu</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>A multidimensional choledoch database and benchmarks for cholangiocarcinoma diagnosis</article-title>. <source>IEEE Access</source>. (<year>2019</year>) <volume>7</volume>:<fpage>149414</fpage>&#x2013;<lpage>149421</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/ACCESS.2019.2947470</pub-id>
</citation>
</ref>
<ref id="B35">
<label>35</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Ronneberger</surname> <given-names>O</given-names>
</name>
<name>
<surname>Fischer</surname> <given-names>P</given-names>
</name>
<name>
<surname>Brox</surname> <given-names>T</given-names>
</name>
</person-group>. (<year>2015</year>). <article-title>U-net: Convolutional networks for biomedical image segmentation</article-title>, in: <conf-name>Medical Image Computing and Computer-Assisted Intervention&#x2013;MICCAI 2015: 18th International Conference</conf-name>, <conf-loc>Munich, Germany</conf-loc>, <conf-date>October 5&#x2013;9, 2015</conf-date>. pp. <fpage>234</fpage>&#x2013;<lpage>241</lpage>.</citation>
</ref>
<ref id="B36">
<label>36</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Ouali</surname> <given-names>Y</given-names>
</name>
<name>
<surname>Hudelot</surname> <given-names>C</given-names>
</name>
<name>
<surname>Tami</surname> <given-names>M</given-names>
</name>
</person-group>. (<year>2020</year>). <article-title>Semi-supervised semantic segmentation with cross-consistency training</article-title>, in: <conf-name>IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</conf-name>. pp. <fpage>12674</fpage>&#x2013;<lpage>12684</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/CVPR42600.2020.01269</pub-id>
</citation>
</ref>
<ref id="B37">
<label>37</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname> <given-names>L</given-names>
</name>
<name>
<surname>Yin</surname> <given-names>M</given-names>
</name>
<name>
<surname>Fu</surname> <given-names>L</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>F</given-names>
</name>
</person-group>. <article-title>Uncertainty-aware pseudo-label and consistency for semi-supervised medical image segmentation</article-title>. <source>BioMed Signal Process Control</source>. (<year>2023</year>) <volume>79</volume>:<fpage>104203</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.bspc.2022.104203</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>