<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2023.1128993</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Robustness of calibration model for prediction of lignin content in different batches of snow pears based on NIR spectroscopy</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Wu</surname>
<given-names>Xin</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2133398"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Guanglin</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Fu</surname>
<given-names>Xinglan</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/825802"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wu</surname>
<given-names>Weixin</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>School of Electronics and Internet of Things, Chongqing College of Electronic Engineering</institution>, <addr-line>Chongqing</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>College of Engineering and Technology, Southwest University</institution>, <addr-line>Chongqing</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Mechanical Measurement and Testing Research Center, Academy of Metrology and Quality Inspection</institution>, <addr-line>Chongqing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Jiangbo Li, Beijing Academy of Agriculture and Forestry Sciences, China</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Yi Yang, Beijing Academy of Agricultural and Forestry Sciences, China; Zhang Hailiang, East China Jiaotong University, China; Byoung-Kwan Cho, Chungnam National University, Republic of Korea</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Xin Wu, <email xlink:href="mailto:wuxinnk@qq.com">wuxinnk@qq.com</email>
</p>
</fn>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>27</day>
<month>02</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1128993</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>06</day>
<month>02</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Wu, Li, Fu and Wu</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Wu, Li, Fu and Wu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Snow pear is very popular in southwest China thanks to its fruit texture and potential medicinal value. Lignin content (LC) has a direct and negative effect on the fruit texture of snow pears and on consumer purchasing decisions for fresh pears, because a higher concentration and larger size of stone cells lead to thicker pulp and a deterioration in taste. In this study, we assessed the robustness of a calibration model for predicting LC in different batches of snow pears using a portable near-infrared (NIR) spectrometer with a range of 1033&#x2013;2300 nm. The average NIR spectra at nine different measurement positions were collected for snow pear samples purchased in four different periods (batches A, B, C, and D). We developed a standard normal variate transformation (SNV)-genetic algorithm (GA)-partial least square regression (PLSR) model (master model A) to predict LC in the batch A samples based on 80 selected effective wavelengths, which achieved a correlation coefficient of the prediction set (Rp) of 0.854 and a root mean square error of the prediction set (RMSEP) of 0.624. We then used this model to detect LC in the three other batches of snow pear samples. When master model A was applied directly to the batch B, C, and D samples, its performance was poor, with lower Rp and higher RMSEP. The independent semi-supervision free parameter model enhancement (SS-FPME) method and the sequential SS-FPME method were used and compared for updating master model A to predict the LC of snow pears. For the batch B samples, the predictive ability of the updated model (Ind-model AB) was improved, with an Rp of 0.837 and an RMSEP of 0.614. For the batch C samples, the performance of the Seq-model ABC was improved greatly, with an Rp of 0.952 and an RMSEP of 0.383. For the batch D samples, the performance of the Seq-model ABCD was also improved, with an Rp of 0.831 and an RMSEP of 0.309.
Therefore, updating the model on new batch samples with the sequential SS-FPME method could improve the robustness and transferability of the model used to detect the LC of snow pears and provide technical support for the development and practical application of portable detection devices.</p>
</abstract>
<kwd-group>
<kwd>lignin content of snow pears</kwd>
<kwd>robustness</kwd>
<kwd>SS-FPME method</kwd>
<kwd>NIR spectroscopy</kwd>
<kwd>calibration model</kwd>
</kwd-group>
<contract-num rid="cn001">CSTB2022NSCQ-MSX1140, KJQN201903114,  KJQN202103105</contract-num>
<contract-sponsor id="cn001">Natural Science Foundation Project of Chongqing, Chongqing Science and Technology Commission<named-content content-type="fundref-id">10.13039/501100012669</named-content>
</contract-sponsor>
<counts>
<fig-count count="10"/>
<table-count count="5"/>
<equation-count count="10"/>
<ref-count count="40"/>
<page-count count="12"/>
<word-count count="6800"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<label>1</label>
<title>Introduction</title>
<p>Snow pear enjoys widespread popularity in southwest China (<xref ref-type="bibr" rid="B29">Wang et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B30">Wu et&#xa0;al., 2021</xref>). It has excellent fruit texture and boasts some medicinal value (<xref ref-type="bibr" rid="B40">Zou, 2016</xref>). Lignin content (LC), however, has a direct and negative effect on the fruit texture of snow pears and on consumers&#x2019; decision to purchase fresh pear fruit (<xref ref-type="bibr" rid="B26">Tao et&#xa0;al., 2009</xref>; <xref ref-type="bibr" rid="B6">Cai et&#xa0;al., 2010</xref>; <xref ref-type="bibr" rid="B37">Yan et&#xa0;al., 2014</xref>; <xref ref-type="bibr" rid="B33">Xue et&#xa0;al., 2019</xref>; <xref ref-type="bibr" rid="B24">Sheng et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B31">Wu et&#xa0;al., 2021</xref>). More specifically, higher concentration and larger size of stone cells lead to thicker pulp and deterioration of the taste. In recent decades, the use of near-infrared (NIR) spectroscopy has been an effective tool for the nondestructive and rapid detection of the internal quality of fruits and vegetables (<xref ref-type="bibr" rid="B32">Xiaobo et&#xa0;al., 2010</xref>). In particular, NIR spectroscopy, combined with the chemometric methods, has been successfully used to predict the soluble solids content (SSC), firmness, and moisture of fruits (e.g., apples, pears, tomatoes, peaches) by notable researchers (<xref ref-type="bibr" rid="B39">Zhang et&#xa0;al., 2008</xref>; <xref ref-type="bibr" rid="B23">Rahman et&#xa0;al., 2017</xref>; <xref ref-type="bibr" rid="B28">Tian et&#xa0;al., 2018</xref>; <xref ref-type="bibr" rid="B12">Du et&#xa0;al., 2019</xref>). 
Although we and other researchers have studied calibration models for predicting the LC of snow pears based on NIR spectroscopy (<xref ref-type="bibr" rid="B24">Sheng et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B31">Wu et&#xa0;al., 2021</xref>), the robustness and accuracy of such models under sample variability and external variability of the measurement system require further study.</p>
<p>To obtain more stable and robust prediction results, researchers have typically used partial least square regression (PLSR) to establish calibration models based on the effective wavelengths from the full NIR spectra for predicting the internal quality of fruits and vegetables. The leave-one-out cross-validation method has been used to avoid the overfitting or underfitting caused by using too many or too few PLS components in the PLSR algorithm, respectively (<xref ref-type="bibr" rid="B11">Douglas et&#xa0;al., 2018</xref>). The optimal number of latent variables (LVs) was determined by full cross-validation of the calibration samples as the number that minimized the root mean square error of cross-validation (RMSECV). The full-spectrum PLSR model, however, was time-consuming and contained redundant and collinear variables (<xref ref-type="bibr" rid="B22">Rahman et&#xa0;al., 2018</xref>). Variable selection methods have been used to extract the effective wavelengths, which reduced the complexity and increased the predictive ability of PLSR models for detecting the internal quality of fruits and vegetables (<xref ref-type="bibr" rid="B32">Xiaobo et&#xa0;al., 2010</xref>; <xref ref-type="bibr" rid="B4">Balabin and Smirnov, 2011</xref>; <xref ref-type="bibr" rid="B34">Xu et&#xa0;al., 2012</xref>; <xref ref-type="bibr" rid="B15">Jie et&#xa0;al., 2013</xref>; <xref ref-type="bibr" rid="B10">Deng et&#xa0;al., 2014</xref>; <xref ref-type="bibr" rid="B17">Li et&#xa0;al., 2014</xref>). In recent years, many effective wavelength selection methods have been studied for predicting internal quality based on NIR spectroscopy. Tao used the successive projection algorithm (SPA) to select five optimal wavelengths in developing an accurate and non-destructive method to discriminate the sex of silkworm pupae using the visible and near-infrared hyperspectral imaging technique (<xref ref-type="bibr" rid="B27">Tao et&#xa0;al., 2019</xref>).
Li combined synergy interval partial least squares (SiPLS) with a nonlinear support vector machine (SVM) to develop a rapid quantitative analysis model for determining the glycated albumin content based on attenuated total reflection&#x2013;Fourier transform infrared (ATR-FTIR) spectroscopy (<xref ref-type="bibr" rid="B19">Li et&#xa0;al., 2018</xref>). Du used the genetic algorithm (GA) to optimize the non-destructive prediction of the properties of mechanically injured peaches during postharvest storage by portable visible/shortwave near-infrared spectroscopy (<xref ref-type="bibr" rid="B12">Du et&#xa0;al., 2019</xref>). Deng developed the bootstrapping soft shrinkage (BOSS) method for variable selection in chemical modeling, and the method was used to select key variables for measuring the moisture, oil, protein, and starch of corn and soy (<xref ref-type="bibr" rid="B9">Deng et&#xa0;al., 2016</xref>). Yan proposed the stabilized bootstrapping soft shrinkage (SBOSS) approach, built on the BOSS method, to address over-fitting, model accuracy, and the credibility of variable selection in spectral variable selection (<xref ref-type="bibr" rid="B36">Yan et&#xa0;al., 2019</xref>). Competitive adaptive reweighted sampling (CARS) is an effective method for selecting effective wavelengths for multivariate calibration (<xref ref-type="bibr" rid="B18">Li et&#xa0;al., 2009</xref>; <xref ref-type="bibr" rid="B14">Jiang et&#xa0;al., 2015</xref>). Wang used CARS to identify the characteristic wavelengths and simplify the PLS models for detecting the juiciness of pears <italic>via</italic> VIS/NIR spectroscopy (<xref ref-type="bibr" rid="B29">Wang et&#xa0;al., 2020</xref>). Yang used CARS to select feature variables for identifying unhealthy Panax notoginseng from different geographical origins based on ATR-FTIR spectroscopy (<xref ref-type="bibr" rid="B35">Yang et&#xa0;al., 2019</xref>).
Liang used CARS to extract effective wavelengths for predicting the holocellulose and lignin content of pulp wood feedstock using NIR spectroscopy (<xref ref-type="bibr" rid="B16">Liang et&#xa0;al., 2020</xref>). CARS has also been used to select variables for predicting the internal quality of orange, dovyalis fruit, and pears by Song (<xref ref-type="bibr" rid="B25">Song et&#xa0;al., 2020</xref>), Mateus (<xref ref-type="bibr" rid="B8">de Assis et&#xa0;al., 2018</xref>), and Wu (<xref ref-type="bibr" rid="B30">Wu et&#xa0;al., 2021</xref>), respectively. In this work, these variable selection methods were used to extract effective wavelengths from the full NIR spectrum.</p>
<p>When a single master calibration model based on NIR spectroscopy is used to measure the LC of different batches of snow pear samples, the prediction errors are typically large (<xref ref-type="bibr" rid="B21">Nicola&#xef; et&#xa0;al., 2008</xref>). The term &#x201c;different batches&#x201d; usually refers to different measurement times, seasons, geographical locations, and fruit maturity of snow pear samples (<xref ref-type="bibr" rid="B1">Anderson et&#xa0;al., 2021</xref>). Moreover, changes in the ambient temperature during NIR spectrum acquisition and in the instrument components (such as the light source) could affect the accuracy and robustness of the calibration model. Therefore, the prediction ability of the model has to be checked routinely, because the NIR spectral data can be affected by possible failures of the mechanical modules of the NIR spectrometer system (e.g., sensors, light sources, reference modules) in the process of collecting NIR spectra (<xref ref-type="bibr" rid="B20">Mercader and Puigdomenech, 2014</xref>). In addition, the error of a calibration model in measuring the LC of a new batch of snow pear samples has been significant for two reasons: (1) the NIR spectrum of the new batch lacked the feature information corresponding to the measured LC (<xref ref-type="bibr" rid="B2">Anderson et&#xa0;al., 2020</xref>); and (2) external effects on the new batch of snow pear samples interfered with the NIR spectral information (<xref ref-type="bibr" rid="B38">Zeaiter et&#xa0;al., 2006</xref>). These variabilities in spectral information were related to the different varieties of samples, harvest season, and measurement temperature. Therefore, to accurately predict the LC of a new batch of snow pears, in this work, we updated the calibration model using semi-supervision free parameter model enhancement (SS-FPME).
The objective of this work was to analyze the accuracy and robustness of the calibration model for predicting the LC of different batches of snow pears based on NIR spectroscopy. We proposed and applied the SS-FPME method to update the PLSR model. The research procedure was as follows: (1) The NIR diffuse reflectance spectra of four batches of snow pear samples were obtained with a fiber-optic spectrometer system. (2) We built a calibration model for measuring the LC of snow pears based on the most effective wavelengths, which were selected by the SPA, SiPLS, GA, BOSS, and CARS methods from the full spectrum of the optimal measurement positions of the samples. (3) The SS-FPME method was used to update the calibration model to predict the LC of batches B, C, and D, and we compared and analyzed two ways of updating the model. (4) We evaluated the performance of the PLS model on independent verification data sets.</p>
</sec>
<sec id="s2" sec-type="materials|methods">
<label>2</label>
<title>Materials and methods</title>
<sec id="s2_1">
<label>2.1</label>
<title>Samples preparation</title>
<p>A total of 512 snow pears of four different batches of samples were collected from the local fruit market at different time periods in Shuangfu, Chongqing. The surface of these samples did not bear any damage. The average fruit weight was 300&#x2013;400 g. The shape was round or flat, with the top and base uneven, the longitudinal diameter around 8&#x2013;9 cm, the transverse diameter around 9&#x2013;9.5 cm, and the fruit stone diameter of 2&#x2013;3.5 cm. After each batch of samples was collected and brought back to the laboratory, the snow pears were washed, numbered, and stored in a refrigerator to ensure the accuracy of the experiment. It took eight months to collect the NIR diffuse reflectance spectra of the surface of the samples using a microfiber spectrometer and to measure the standard reference values of the LC according to the Klason method (<xref ref-type="bibr" rid="B5">Bunzel et&#xa0;al., 2011</xref>; <xref ref-type="bibr" rid="B7">Cybulska et&#xa0;al., 2012</xref>; <xref ref-type="bibr" rid="B3">Assis et&#xa0;al., 2017</xref>). Among these samples, the NIR spectra and LC reference value of the 160 samples in batch A were completed in December 2020, and the 120 samples in batch B, 104 samples in batch C, and 128 samples in batch D were completed in March 2021, May 2021, and July 2021, respectively. Different batches of samples in this research referred to the different collection time points of NIR diffuse reflection spectrum of the samples. As shown in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>, the batch A samples were divided into a calibration set (60%) and a validation set (40%) using the Kennard&#x2013;Stone (KS) algorithm (<xref ref-type="bibr" rid="B27">Tao et&#xa0;al., 2019</xref>), and the batch B, C, and D samples were divided into a model update calibration set (40%) and a validation set (60%).</p>
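<p>As a minimal sketch of the Kennard&#x2013;Stone split described above, the following Python code builds a calibration subset by starting from the two most distant samples and then repeatedly adding the sample that is farthest from those already chosen. The function name <monospace>kennard_stone</monospace>, the random placeholder spectra, and the use of Euclidean distance on the spectral matrix are assumptions made for illustration and are not taken from the original analysis.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def kennard_stone(X, n_select):
    """Return (selected, remaining) row indices chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all spectra.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select:
        # Distance from each candidate to its nearest already-selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the candidate that is farthest from the selected set.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining


# Placeholder data standing in for the 160 batch A spectra (not measured values).
rng = np.random.default_rng(0)
spectra = rng.normal(size=(160, 8))
cal_idx, val_idx = kennard_stone(spectra, n_select=96)  # 60% of 160 samples
print(len(cal_idx), len(val_idx))
```
]]></preformat>
<p>With 160 samples and a 60% calibration share, the sketch returns 96 calibration indices and 64 validation indices, matching the batch A set sizes listed in Table&#xa0;1.</p>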
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Statistical data of lignin content (mg/g) of snow pear samples of four batches.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Batch</th>
<th valign="middle" align="center">Measurement time</th>
<th valign="middle" align="center">Data set</th>
<th valign="middle" align="center">Number</th>
<th valign="middle" align="center">Range (mg/g)</th>
<th valign="middle" align="center">Mean &#xb1; SD (mg/g)</th>
<th valign="middle" align="center">SEL</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" rowspan="3" align="center">A</td>
<td valign="middle" rowspan="3" align="center">2020.12</td>
<td valign="middle" align="center">All samples</td>
<td valign="middle" align="center">160</td>
<td valign="middle" align="center">75.05&#x2013;81.04</td>
<td valign="middle" align="center">77.87 &#xb1; 1.22</td>
<td valign="middle" align="center">0.096</td>
</tr>
<tr>
<td valign="middle" align="center">Calibration set</td>
<td valign="middle" align="center">96</td>
<td valign="middle" align="center">75.05&#x2013;81.02</td>
<td valign="middle" align="center">77.59 &#xb1; 1.20</td>
<td valign="middle" align="center">0.109</td>
</tr>
<tr>
<td valign="middle" align="center">Prediction set</td>
<td valign="middle" align="center">64</td>
<td valign="middle" align="center">75.85&#x2013;81.04</td>
<td valign="middle" align="center">78.27 &#xb1; 1.14</td>
<td valign="middle" align="center">0.143</td>
</tr>
<tr>
<td valign="middle" rowspan="3" align="center">B</td>
<td valign="middle" rowspan="3" align="center">2021.03</td>
<td valign="middle" align="center">All samples</td>
<td valign="middle" align="center">120</td>
<td valign="middle" align="center">74.78&#x2013;80.80</td>
<td valign="middle" align="center">77.75 &#xb1; 1.18</td>
<td valign="middle" align="center">0.108</td>
</tr>
<tr>
<td valign="middle" align="center">Calibration set</td>
<td valign="middle" align="center">48</td>
<td valign="middle" align="center">74.78&#x2013;79.99</td>
<td valign="middle" align="center">77.14 &#xb1; 1.09</td>
<td valign="middle" align="center">0.157</td>
</tr>
<tr>
<td valign="middle" align="center">Prediction set</td>
<td valign="middle" align="center">72</td>
<td valign="middle" align="center">76.27&#x2013;80.80</td>
<td valign="middle" align="center">78.16 &#xb1; 1.07</td>
<td valign="middle" align="center">0.126</td>
</tr>
<tr>
<td valign="middle" rowspan="3" align="center">C</td>
<td valign="middle" rowspan="3" align="center">2021.05</td>
<td valign="middle" align="center">All samples</td>
<td valign="middle" align="center">104</td>
<td valign="middle" align="center">75.48&#x2013;81.42</td>
<td valign="middle" align="center">78.03 &#xb1; 1.19</td>
<td valign="middle" align="center">0.116</td>
</tr>
<tr>
<td valign="middle" align="center">Calibration set</td>
<td valign="middle" align="center">42</td>
<td valign="middle" align="center">75.68&#x2013;80.25</td>
<td valign="middle" align="center">77.51 &#xb1; 1.10</td>
<td valign="middle" align="center">0.170</td>
</tr>
<tr>
<td valign="middle" align="center">Prediction set</td>
<td valign="middle" align="center">62</td>
<td valign="middle" align="center">75.48&#x2013;81.42</td>
<td valign="middle" align="center">78.39 &#xb1; 1.12</td>
<td valign="middle" align="center">0.142</td>
</tr>
<tr>
<td valign="middle" rowspan="3" align="center">D</td>
<td valign="middle" rowspan="3" align="center">2021.07</td>
<td valign="middle" align="center">All samples</td>
<td valign="middle" align="center">128</td>
<td valign="middle" align="center">76.43&#x2013;79.38</td>
<td valign="middle" align="center">77.93 &#xb1; 0.55</td>
<td valign="middle" align="center">0.048</td>
</tr>
<tr>
<td valign="middle" align="center">Calibration set</td>
<td valign="middle" align="center">51</td>
<td valign="middle" align="center">76.76&#x2013;78.67</td>
<td valign="middle" align="center">77.73 &#xb1; 0.49</td>
<td valign="middle" align="center">0.068</td>
</tr>
<tr>
<td valign="middle" align="center">Prediction set</td>
<td valign="middle" align="center">77</td>
<td valign="middle" align="center">76.43&#x2013;79.38</td>
<td valign="middle" align="center">78.06 &#xb1; 0.54</td>
<td valign="middle" align="center">0.062</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>SD, standard deviation; SEL, standard error of laboratory.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Spectral measurement</title>
<p>Using the NIR diffuse reflectance spectrum acquisition system, we collected NIR spectra at nine measurement positions on the surface of each snow pear in the four batches. The positions were defined by the intersections of three stem-calyx longitudes (spaced at intervals of 120&#xb0;) with three latitudes (stem, equator, and calyx), as shown in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>. The spectra were acquired with a microfiber spectrometer (NIRQuest256-2.5, Ocean Insight, Orlando, FL, USA). The microfiber optic spectrometer covered wavelengths from 900 to 2500 nm, with a resolution of 9.5 nm and 512 data points. We set the integration time of the microfiber optic spectrometer to 70 ms, the number of scans to 5, and the number of smoothing points to 10. We obtained the average NIR spectrum of each sample from three consecutive acquisitions at each measurement position. The noisy spectral data at both ends of the spectral curve were removed, leaving an effective wavelength range of 1033 to 2300 nm with 387 spectral points.</p>
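<p>The averaging and trimming steps in this section can be sketched in Python as follows. This is only an illustration: the function name <monospace>average_and_trim</monospace>, the evenly spaced wavelength grid, and the random reflectance values are assumptions, and the real instrument grid may differ, so the number of retained points is not expected to reproduce the 387 points reported here.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def average_and_trim(scans, wavelengths, lo=1033.0, hi=2300.0):
    """Average repeated scans of one position and keep the lo..hi nm window."""
    scans = np.asarray(scans, dtype=float)            # shape (n_repeats, n_points)
    wavelengths = np.asarray(wavelengths, dtype=float)
    mean_spectrum = scans.mean(axis=0)                # average over the repeats
    keep = (wavelengths >= lo) & (wavelengths <= hi)  # drop the noisy ends
    return wavelengths[keep], mean_spectrum[keep]


# Hypothetical evenly spaced grid and random reflectance values (not measured data).
wavelengths = np.linspace(900.0, 2500.0, 512)
rng = np.random.default_rng(1)
scans = rng.uniform(0.2, 0.8, size=(3, 512))          # three consecutive acquisitions
wl, spec = average_and_trim(scans, wavelengths)
print(wl.min(), wl.max(), spec.shape)
```
]]></preformat>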
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>Diagram of the nine spectral measurement positions of one sample. The first longitude intersects the stem, equator, and calyx latitudes to form three spectral measurement positions: P<sub>I1</sub>, P<sub>II1</sub>, and P<sub>III1</sub>. The second and third longitudes intersect the same latitudes to form six further spectral measurement positions: P<sub>I2</sub>, P<sub>II2</sub>, and P<sub>III2</sub>, and P<sub>I3</sub>, P<sub>II3</sub>, and P<sub>III3</sub>, respectively.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g001.tif"/>
</fig>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Reference LC measurement</title>
<p>To make the spectrum and LC correspond more accurately, the fresh snow pear flesh (between 2&#xa0;cm outside the core and 2&#xa0;mm under the pericarp of an intact pear) was made into a dry powder immediately after the NIR spectrum acquisition. We used the traditional Klason method to measure the LC reference value of snow pears, and the statistical results are shown in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>. The snow pear dry powder (500 mg) was mixed with 72% H<sub>2</sub>SO<sub>4</sub> (30 mL); the solution was stirred evenly, placed in a boiling water bath for 2&#xa0;h, and diluted with deionized water. Then, the solution was poured into a sand core funnel (diameter of 2.5&#xa0;cm, particle retention of 1.6 &#x3bc;m), and the residue was filtered, washed, dried, and weighed to obtain the LC mass ratio (mg/g) of the sample. We performed three repeated chemical measurements and obtained a value with a relative error within 5%.</p>
<p>The LC values of snow pear samples of batches A, B, C, and D ranged from 75.05 to 81.04 mg/g, 74.78 to 80.80 mg/g, 75.48 to 81.42 mg/g, and 76.43 to 79.38 mg/g, respectively. <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref> also shows the lignin distribution of the calibration and prediction sets. For the batch A samples, the LC range of the calibration set was wider than that of the prediction set, which helped to build a better calibration model for detecting the LC of snow pears in batch A.</p>
</sec>
<sec id="s2_4">
<label>2.4</label>
<title>Theory of SS-FPME</title>
<p>For the multivariate calibration model, we assumed that the NIR spectral data set was <bold>X</bold>
<sub>(<italic>m</italic>&#xd7;<italic>n</italic>)</sub>, where <italic>m</italic> was the number of samples and <italic>n</italic> was the number of variables, and that the data set of LC reference values was <bold>y</bold>
<sub>(<italic>m</italic>&#xd7;1)</sub>. The linear relationship between <bold>X</bold> and <bold>y</bold> can be established by the PLSR model, as shown in formula (1). The predicted value <inline-formula>
<mml:math display="inline" id="im1">
<mml:mstyle mathvariant="" mathsize="normal">
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:mstyle>
</mml:math>
</inline-formula> could be calculated, as follows:</p>
<disp-formula>
<label>(1)</label>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>y</mml:mi>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mn>1</mml:mn>
<mml:mi>X</mml:mi>
</mml:mstyle>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>b</mml:mi>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>e</mml:mi>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>+</mml:mo>
<mml:mi>e</mml:mi>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>b</italic>
<sub>0</sub> and <bold>b</bold><sub>(<italic>n</italic>&#xd7;1)</sub> were the intercept and regression coefficients of the PLS model, respectively; <bold>1</bold> was the column vector of length <italic>m</italic> with all elements equal to 1; and <bold>e</bold> was the prediction error between <bold>&#x177;</bold> and <bold>y</bold>.</p>
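<p>Formula (1) can be illustrated with a short Python sketch that appends a column of ones (one entry per sample) to the spectral matrix and multiplies it by the stacked intercept and regression coefficients. The function name <monospace>predict_lc</monospace> and all numbers below are toy values chosen for illustration and are not measured data or fitted coefficients.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def predict_lc(X, b0, b):
    """Apply y_hat = [1 X][b0; b] to a spectral matrix with one row per sample."""
    X = np.asarray(X, dtype=float)
    ones = np.ones((X.shape[0], 1))                    # one entry per sample
    coef = np.concatenate(([b0], np.asarray(b, dtype=float)))
    return np.hstack([ones, X]) @ coef


# Toy values for illustration only.
X = np.array([[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]])
b0, b = 77.0, np.array([2.0, -1.0])
y = np.array([77.1, 77.4, 77.0])                       # reference values
y_hat = predict_lc(X, b0, b)                           # predicted values
e = y - y_hat                                          # prediction error vector
print(y_hat, e)
```
]]></preformat>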
<p>If only the NIR spectra and LC reference values of the new batch of snow pear samples were available, and the NIR spectra of the main batch samples were not, it would be impossible to update the calibration model for the new batch using the standard strategy. In practical applications, an updated calibration model is often necessary to predict the LC of new samples. Therefore, it was necessary to apply the semi-supervision free parameter model enhancement (SS-FPME) method to update the calibration model. This method reduced the influence of sample variability and external variability of the measurement system to obtain accurate and robust prediction results. The objective function of SS-FPME was as follows:</p>
<disp-formula>
<label>(2)</label>
<mml:math display="block" id="M2">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>min</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>b</mml:mi>
</mml:mstyle>
<mml:mi>s</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x2016;</mml:mo>
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>y</mml:mi>
</mml:mstyle>
<mml:mo>&#x2212;</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mn>1</mml:mn>
</mml:mstyle>
<mml:msub>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>X</mml:mi>
</mml:mstyle>
<mml:mi>s</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>b</mml:mi>
</mml:mstyle>
<mml:mi>s</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false" mathsize="6">&#x2016;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>s</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>r</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>b</mml:mi>
</mml:mstyle>
<mml:mi>s</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>b</mml:mi>
</mml:mstyle>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>&gt;</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>where <bold>X</bold><sub><italic>s</italic></sub> is the matrix of NIR spectra of the new-batch samples, which serves as the update set for the calibration model; <bold>y</bold> is the vector of the corresponding LC reference values; <italic>b</italic><sub><italic>0,s</italic></sub> is the intercept; <bold>b</bold><sub><italic>s</italic></sub> is the regression coefficient vector of the calibration model for the new batch; <italic>r<sub>th</sub></italic> is the threshold on the correlation coefficient; and <bold>b</bold><sub><italic>m</italic></sub> is the regression coefficient vector of the calibration model of the original main batch, which was obtained from the PLSR model. We solved the optimization problem in Eq. (2) using the sequential quadratic programming algorithm of the fmincon routine in MATLAB 2016b. Updating a model with SS-FPME required only the regression coefficients of the primary (master) model, the spectra of a few samples from the new batch, and the corresponding reference values. We used the root mean square error of prediction (RMSEP), estimated on an independent test set, to evaluate the performance of the updated calibration model.</p>
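<p>As an illustrative sketch only (it is not the MATLAB code used in this study), the two ingredients that a constrained solver needs for Eq. (2), namely the least-squares objective and the correlation constraint, can be written in Python as follows. The function names and the toy data are hypothetical.</p>
<preformat>
```python
import math

def predict(b0, b, x):
    """Linear prediction b0 + x.b for one spectrum x."""
    return b0 + sum(bi * xi for bi, xi in zip(b, x))

def sse(b0, b, X, y):
    """Objective of Eq. (2): squared norm of y - [1 X_s][b0; b_s]."""
    return sum((yi - predict(b0, b, xi)) ** 2 for xi, yi in zip(X, y))

def pearson(u, v):
    """Pearson correlation between two coefficient vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((c - mv) ** 2 for c in v))
    return cov / (su * sv)

def feasible(b_s, b_m, r_th):
    """Constraint of Eq. (2): corr(b_s, b_m) must exceed r_th."""
    return pearson(b_s, b_m) > r_th

# Toy data (hypothetical): 3 update samples measured at 3 wavelengths.
X_s = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]
b_m = [0.5, 1.0, 1.5]                      # master-model coefficients
y = [predict(0.2, b_m, x) for x in X_s]    # references generated from b_m

candidate = [0.5, 1.0, 1.5]                # candidate b_s to be checked
print(sse(0.2, candidate, X_s, y))         # objective value of the candidate
print(feasible(candidate, b_m, 0.9))       # does it satisfy the constraint?
```
</preformat>
<p>A solver would search over the intercept and the coefficient vector for the smallest objective value among the candidates that pass the constraint check; the sketch only evaluates one candidate and does not perform that search.</p>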
</sec>
<sec id="s2_5">
<label>2.5</label>
<title>Model updating by the SS-FPME method</title>
<p>To comprehensively assess the prediction ability of the updated calibration model for different batches of snow pears based on NIR spectroscopy, we used the SS-FPME method to update the calibration model of the old batch with the update set of the new batch, and then predicted the LC of the new batch. We applied SS-FPME in two ways. In the independent SS-FPME method (<xref ref-type="fig" rid="f2"><bold>Figure&#xa0;2A</bold></xref>), the master calibration model was updated with each new batch of samples independently. In the sequential SS-FPME method (<xref ref-type="fig" rid="f2"><bold>Figure&#xa0;2B</bold></xref>), the master calibration model was updated sequentially with multiple batches of samples.</p>
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>Schematic of calibration model updating method based on SS-FPME: <bold>(A)</bold> independent SS-FPME method and <bold>(B)</bold> sequential SS-FPME method. Cal, calibration; Pre, prediction.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g002.tif"/>
</fig>
<p>For the independent SS-FPME method, <xref ref-type="fig" rid="f2"><bold>Figure&#xa0;2A</bold></xref> shows the updating process for the calibration model used to predict the LC of the four batches of snow pears. We used PLSR to establish the master calibration model from one batch of snow pear samples (batch A), giving model A to predict the LC of batch A. To improve the accuracy of the calibration model, we updated the master model (model A) with the calibration set of a new batch of samples (batch B), giving Ind-model AB to predict the LC of batch B. The calibration set of the new batch contained only a few samples and is called the update set. To accurately detect the LC of batch C and batch D, we built Ind-model AC and Ind-model AD from the calibration sets of batch C and batch D independently using the SS-FPME method.</p>
<p>
<xref ref-type="fig" rid="f2"><bold>Figure&#xa0;2B</bold></xref> shows the updating process of the sequential SS-FPME method for the four batches of snow pears. As in the independent SS-FPME method, we built model A (the master model) from the calibration set of batch A using the PLSR algorithm to predict the LC of batch A, and we updated model A with the calibration set of batch B to form Seq-model AB, which predicted the LC of batch B. We then updated Seq-model AB with the calibration set of batch C to form Seq-model ABC, which predicted the LC of batch C, and updated Seq-model ABC with the calibration set of batch D to form Seq-model ABCD, which predicted the LC of batch D. In this work, we applied both the independent and the sequential SS-FPME methods to update the calibration model for the four batches of snow pears, and compared them in terms of the accuracy and robustness with which the updated models predicted the internal quality of different batches of samples.</p>
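<p>The difference between the two updating schemes lies only in which model acts as the starting point of each update. The following Python sketch illustrates this bookkeeping; the <monospace>update</monospace> function is a hypothetical placeholder that returns a model name and does not perform any SS-FPME computation.</p>
<preformat>
```python
def update(master_name, batch):
    """Placeholder for one SS-FPME update; returns only the new model's name."""
    return master_name + batch

def independent(first, new_batches):
    # Every new batch is updated from the same master model (Figure 2A).
    return ["Ind-model " + update(first, b) for b in new_batches]

def sequential(first, new_batches):
    # Each update starts from the previously updated model (Figure 2B).
    names, current = [], first
    for b in new_batches:
        current = update(current, b)
        names.append("Seq-model " + current)
    return names

print(independent("A", ["B", "C", "D"]))
# ['Ind-model AB', 'Ind-model AC', 'Ind-model AD']
print(sequential("A", ["B", "C", "D"]))
# ['Seq-model AB', 'Seq-model ABC', 'Seq-model ABCD']
```
</preformat>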
</sec>
<sec id="s2_6">
<label>2.6</label>
<title>Evaluation model</title>
<sec id="s2_6_1">
<label>2.6.1</label>
<title>PLSR modeling</title>
<p>The PLSR algorithm is a multivariate linear analysis method first proposed by Wold and Krishnaiah and is widely used in the analysis of spectral data (<xref ref-type="bibr" rid="B13">Haaland and Thomas, 1988</xref>). Its basic principle is to decompose the sample spectral matrix and the sample concentration matrix simultaneously to obtain score matrices, and then to perform linear regression between the scores. The main implementation steps are as follows. First, the spectral matrix X and the concentration matrix Y of the samples are decomposed into scores and loadings:</p>
<disp-formula>
<label>(3)</label>
<mml:math display="block" id="M3">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(4)</label>
<mml:math display="block" id="M4">
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>Q</mml:mi>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>F</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>X<sub>m&#xd7;n</sub></italic> is the spectral matrix of <italic>m</italic> samples at <italic>n</italic> wavelengths; <italic>Y<sub>m&#xd7;l</sub></italic> is the concentration matrix containing the contents of <italic>l</italic> components in the <italic>m</italic> samples; <italic>T<sub>m&#xd7;w</sub></italic> and <italic>U<sub>m&#xd7;w</sub></italic> are the score matrices for <italic>w</italic> latent variables; <italic>P<sub>w&#xd7;n</sub></italic> and <italic>Q<sub>w&#xd7;l</sub></italic> are the loading matrices; and <italic>E<sub>m&#xd7;n</sub></italic> and <italic>F<sub>m&#xd7;l</sub></italic> are the residual matrices.</p>
<p>Then, the linear regression of <italic>U<sub>m&#xd7;w</sub></italic> on <italic>T<sub>m&#xd7;w</sub></italic> is performed as follows:</p>
<disp-formula>
<label>(5)</label>
<mml:math display="block" id="M5">
<mml:mrow>
<mml:msub>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xb7;</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>B<sub>w&#xd7;w</sub></italic> is the regression coefficient matrix:</p>
<disp-formula>
<label>(6)</label>
<mml:math display="block" id="M6">
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msubsup>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:mo>&#xb7;</mml:mo>
<mml:msub>
<mml:mi>U</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
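<p>For a single latent variable (<italic>w</italic> = 1), the least-squares solution of Eq. (5) reduces to a ratio of two inner products of the score vectors. The following Python sketch illustrates this special case; the score values are hypothetical, and a full PLSR implementation would also extract the scores and loadings iteratively.</p>
<preformat>
```python
def inner_coefficient(t, u):
    """Least-squares slope b in u ~ t * b for one pair of score vectors."""
    tt = sum(ti * ti for ti in t)               # t'T t
    tu = sum(ti * ui for ti, ui in zip(t, u))   # t'T u
    return tu / tt

t = [-1.0, 0.0, 1.0]   # hypothetical X-scores
u = [-2.0, 0.0, 2.0]   # hypothetical Y-scores
print(inner_coefficient(t, u))  # 2.0
```
</preformat>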
</sec>
<sec id="s2_6_2">
<label>2.6.2</label>
<title>Model evaluation indexes</title>
<p>Generally, the correlation coefficient and the root mean square error are used as evaluation indexes in NIR spectral data analysis, namely the correlation coefficient of the calibration set (Rc), the root mean square error of cross-validation (RMSECV), the correlation coefficient of the prediction set (Rp), and the root mean square error of the prediction set (RMSEP):</p>
<disp-formula>
<label>(7)</label>
<mml:math display="block" id="im2">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(8)</label>
<mml:math display="block" id="M7">
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>E</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>V</mml:mi>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</disp-formula>
<p>In the calibration set, <italic>n</italic> is the number of samples, <italic>y<sub>i,a</sub></italic> is the reference value of the <italic>i</italic>-th sample, <italic>y<sub>i,p</sub></italic> is the predicted value of the <italic>i</italic>-th sample, and <italic>y<sub>i,cm</sub></italic> is the mean of the reference values of all calibration samples. The corresponding indexes for the prediction set are:</p>
<disp-formula>
<label>(9)</label>
<mml:math display="block" id="im3">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(10)</label>
<mml:math display="block" id="M8">
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>E</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</disp-formula>
<p>In the prediction set, <italic>m</italic> is the number of samples, <italic>y<sub>j,a</sub></italic> is the reference value of the <italic>j</italic>-th sample, <italic>y<sub>j,p</sub></italic> is the predicted value of the <italic>j</italic>-th sample, and <italic>y<sub>j,pm</sub></italic> is the mean of the reference values of all prediction samples. A prediction model is more accurate and robust when Rc and Rp are higher (closer to 1) and when RMSECV and RMSEP are smaller and closer to each other.</p>
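<p>The indexes in Eqs. (7) to (10) share two forms, which the following Python sketch illustrates on hypothetical toy values. The function names are our own, and the clamp to zero inside the square root is an added safeguard that is not part of the equations.</p>
<preformat>
```python
import math

def correlation_index(actual, predicted):
    """R as in Eqs. (7) and (9): sqrt(1 - SS_res / SS_tot)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    # Clamp at zero (an assumption) so a very poor model cannot yield
    # the square root of a negative number.
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))

def rmse(actual, predicted):
    """Root mean square error as in Eqs. (8) and (10)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

actual = [1.0, 2.0, 3.0, 4.0]      # hypothetical reference values
predicted = [1.0, 2.0, 3.0, 5.0]   # hypothetical predicted values
print(rmse(actual, predicted))               # 0.5
print(correlation_index(actual, predicted))  # about 0.894
```
</preformat>
<p>Applied to cross-validated predictions of the calibration set, the two functions give Rc and RMSECV; applied to the independent prediction set, they give Rp and RMSEP.</p>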
</sec>
</sec>
</sec>
<sec id="s3" sec-type="results">
<label>3</label>
<title>Results and discussion</title>
<sec id="s3_1">
<label>3.1</label>
<title>Master calibration model to predict LC</title>
<p>Based on NIR spectroscopy, we established the prediction model for the LC of the snow pear samples in batch A, which served as the master model for detecting the LC of the four batches of samples in this study. To remove the influence of instrument background or drift on the signal, eliminate random noise, and improve the signal-to-noise ratio, we compared four methods for pretreating the raw NIR spectrum of each sample (the average over nine measurement positions): the first derivative (1-Der, polynomial order = 1, smoothing points = 11), second derivative (2-Der, polynomial order = 2, smoothing points = 11), standard normal variate transformation (SNV), and multiplicative scatter correction (MSC). We carried out the preprocessing using the software Unscrambler X 10.4 (CAMO PROCESS AS, Oslo, Norway). The results in <xref ref-type="table" rid="T2"><bold>Table&#xa0;2</bold></xref> indicate that the prediction model using SNV preprocessing achieved the best performance. Compared with no preprocessing, SNV improved the Rc and Rp from 0.807 and 0.850 to 0.822 and 0.857, respectively, and reduced the RMSECV and RMSEP from 0.710 and 0.603 to 0.679 and 0.602, respectively (<xref ref-type="fig" rid="f3"><bold>Figure&#xa0;3</bold></xref>). Therefore, we further analyzed the LC detection model based on the NIR data after SNV preprocessing.</p>
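<p>For readers unfamiliar with SNV, the transformation centers each spectrum on its own mean and scales it by its own standard deviation. The following Python sketch is illustrative only and is not the Unscrambler implementation; the use of the sample standard deviation (divisor <italic>n</italic> &#x2212; 1) is an assumption, and the input values are hypothetical.</p>
<preformat>
```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale by its own SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)   # sample SD (n - 1); assumed convention
    return [(v - mean) / sd for v in spectrum]

raw = [1.0, 2.0, 3.0]   # hypothetical absorbance values at three wavelengths
print(snv(raw))         # [-1.0, 0.0, 1.0]
```
</preformat>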
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>Performance of model based on the different preprocessing methods for measuring LC of batch A of samples.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Preprocessing method</th>
<th valign="middle" align="center">Number of wavelengths</th>
<th valign="middle" align="center">LVs</th>
<th valign="middle" align="center">Rc</th>
<th valign="middle" align="center">RMSECV</th>
<th valign="middle" align="center">Rp</th>
<th valign="middle" align="center">RMSEP</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="center">NONE</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">10</td>
<td valign="middle" align="center">0.807</td>
<td valign="middle" align="center">0.710</td>
<td valign="middle" align="center">0.850</td>
<td valign="middle" align="center">0.603</td>
</tr>
<tr>
<td valign="middle" align="center">1-Der (11)</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.812</td>
<td valign="middle" align="center">0.699</td>
<td valign="middle" align="center">0.847</td>
<td valign="middle" align="center">0.624</td>
</tr>
<tr>
<td valign="middle" align="center">2-Der (11)</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">8</td>
<td valign="middle" align="center">0.747</td>
<td valign="middle" align="center">0.805</td>
<td valign="middle" align="center">0.787</td>
<td valign="middle" align="center">0.716</td>
</tr>
<tr>
<td valign="middle" align="center">SNV</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.822</td>
<td valign="middle" align="center">0.679</td>
<td valign="middle" align="center">0.857</td>
<td valign="middle" align="center">0.602</td>
</tr>
<tr>
<td valign="middle" align="center">MSC</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">10</td>
<td valign="middle" align="center">0.821</td>
<td valign="middle" align="center">0.683</td>
<td valign="middle" align="center">0.848</td>
<td valign="middle" align="center">0.618</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>
<bold>(A)</bold> Average spectra of each sample after SNV preprocessing, <bold>(B)</bold> the RMSECV versus the number of PLS components, <bold>(C)</bold> the performance of the PLSR model for measuring LC in the calibration set, and <bold>(D)</bold> the prediction set.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g003.tif"/>
</fig>
<p>The hundreds or thousands of wavelengths in the full spectra of samples may introduce collinearity and redundancy and contain useless or irrelevant information. This makes the calibration process more time-consuming, is poorly suited to high-speed spectroscopic measurement, and reduces the prediction accuracy of the calibration model for measuring the LC of snow pears. To eliminate the uninformative wavelengths, simplify the calibration model, and improve the accuracy and robustness of the prediction results, we selected 19, 76, 80, 20, and 24 effective wavelengths (as shown in <xref ref-type="fig" rid="f4"><bold>Figure&#xa0;4</bold></xref>) using the successive projections algorithm (SPA), synergy interval partial least squares (SiPLS), genetic algorithm (GA), bootstrapping soft shrinkage (BOSS), and competitive adaptive reweighted sampling (CARS) methods, respectively, and compared the resulting models for predicting the LC of snow pears.</p>
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>Distribution of effective wavelengths selected by SPA, SiPLS, GA, BOSS, and CARS methods.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g004.tif"/>
</fig>
<p>In the SiPLS method, we divided the full spectra into 20 subintervals and selected the 1st, 8th, 15th, and 18th subintervals as the effective regions. For the CARS wavelength selection, we set the number of Monte Carlo sampling runs, the maximum number of principal components, and the number of cross-validation folds to 100, 10, and 10, respectively. For the BOSS algorithm, the number of iterations and the number of cross-validation folds were set to 2000 and 5, and the maximum number of latent variables was set to 20. The statistical data in <xref ref-type="table" rid="T3"><bold>Table&#xa0;3</bold></xref> show that the model established on the effective wavelengths selected by the CARS method (SNV-CARS-PLSR) had the lowest number of latent variables (LVs), namely eight. The Rc of the model obtained by the GA method (SNV-GA-PLSR) was the highest (0.846), and the Rp of the model obtained by the SPA method (SNV-SPA-PLSR) was the highest (0.863). Among the variable selection methods, the SNV-GA-PLSR model had the lowest RMSECV and RMSEP values (0.637 and 0.624).</p>
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Performance of the model based on different variables selection methods to measure the LC of batch A samples.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Variables selection method</th>
<th valign="middle" align="center">Number of effective wavelengths</th>
<th valign="middle" align="center">LVs</th>
<th valign="middle" align="center">Rc</th>
<th valign="middle" align="center">RMSECV</th>
<th valign="middle" align="center">Rp</th>
<th valign="middle" align="center">RMSEP</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="center">NONE</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.822</td>
<td valign="middle" align="center">0.679</td>
<td valign="middle" align="center">0.857</td>
<td valign="middle" align="center">0.602</td>
</tr>
<tr>
<td valign="middle" align="center">SPA</td>
<td valign="middle" align="center">19</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.828</td>
<td valign="middle" align="center">0.670</td>
<td valign="middle" align="center">0.863</td>
<td valign="middle" align="center">0.639</td>
</tr>
<tr>
<td valign="middle" align="center">SiPLS</td>
<td valign="middle" align="center">76</td>
<td valign="middle" align="center">11</td>
<td valign="middle" align="center">0.816</td>
<td valign="middle" align="center">0.692</td>
<td valign="middle" align="center">0.782</td>
<td valign="middle" align="center">0.898</td>
</tr>
<tr>
<td valign="middle" align="center">GA</td>
<td valign="middle" align="center">80</td>
<td valign="middle" align="center">10</td>
<td valign="middle" align="center">0.846</td>
<td valign="middle" align="center">0.637</td>
<td valign="middle" align="center">0.854</td>
<td valign="middle" align="center">0.624</td>
</tr>
<tr>
<td valign="middle" align="center">CARS</td>
<td valign="middle" align="center">24</td>
<td valign="middle" align="center">8</td>
<td valign="middle" align="center">0.840</td>
<td valign="middle" align="center">0.647</td>
<td valign="middle" align="center">0.859</td>
<td valign="middle" align="center">0.645</td>
</tr>
<tr>
<td valign="middle" align="center">BOSS</td>
<td valign="middle" align="center">20</td>
<td valign="middle" align="center">12</td>
<td valign="middle" align="center">0.809</td>
<td valign="middle" align="center">0.704</td>
<td valign="middle" align="center">0.806</td>
<td valign="middle" align="center">0.691</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>According to the results, the SNV-GA-PLSR model (master model A) had high Rc and Rp values (0.846 and 0.854, respectively) and low RMSECV and RMSEP values (0.637 and 0.624, respectively), as shown in <xref ref-type="fig" rid="f5"><bold>Figure&#xa0;5</bold></xref>. Moreover, the differences between the Rc and Rp and between the RMSECV and RMSEP were small. Therefore, the SNV-GA-PLSR model demonstrated better prediction performance for measuring the LC of snow pears, and we used it as the prediction model for the four batches of snow pear samples in this study.</p>
<fig id="f5" position="float">
<label>Figure&#xa0;5</label>
<caption>
<p>Performance of the SNV-GA-PLSR model for measuring the LC of batch A samples. <bold>(A)</bold> the calibration set, and <bold>(B)</bold> the prediction set.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g005.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="f6"><bold>Figure&#xa0;6</bold></xref>, the 80 selected effective wavelengths were distributed mainly at 1160 nm, 1198 nm, 1328&#x2013;1344 nm, 1420&#x2013;1430 nm, 1552&#x2013;1575 nm, 1670&#x2013;1693 nm, 1798 nm, 1821&#x2013;1831 nm, 1844&#x2013;1854 nm, 1929&#x2013;1952 nm, 2063&#x2013;2086 nm, 2128&#x2013;2138 nm, 2183&#x2013;2212 nm, 2264&#x2013;2277 nm, and 2290&#x2013;2300 nm. The NIR spectral region primarily contains overtone and combination band information for C-H, N-H, and O-H bonds, which is sensitive to the concentrations of organic materials. Lignin is an organic molecule, and C-H, N-H, and O-H were the most important groups of the main active ingredients. Thus, it is possible to use NIR methods to determine the LC of snow pears. Among the selected wavelengths, 1160 nm and 1198 nm were associated with the third overtone of C-H; 1420&#x2013;1430 nm was associated with the second overtone of the H<sub>2</sub>O, O-H, N-H, and C-H combination; 1552&#x2013;1575 nm was associated with the first overtone of N-H; 1670&#x2013;1693 nm and 1798 nm were associated with the first overtone of C-H; 1821&#x2013;1831 nm was associated with the second overtone of the C=O stretch; 2063&#x2013;2086 nm was associated with the H<sub>2</sub>O and O-H combinations; 2128&#x2013;2138 nm was associated with the N-H combinations; 2183&#x2013;2212 nm was associated with the N-H+C-C combinations; and 2264&#x2013;2277 nm and 2290&#x2013;2300 nm were associated with the C-H+C-H combinations.</p>
<fig id="f6" position="float">
<label>Figure&#xa0;6</label>
<caption>
<p>Distribution of the 80 effective wavelengths selected by the GA method.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g006.tif"/>
</fig>
<p><xref ref-type="table" rid="T4"><bold>Table&#xa0;4</bold></xref> shows that the SNV-GA-PLSR model also simplified the calibration model and improved the prediction performance for measuring the LC of the batch B, C, and D snow pears.</p>
<table-wrap id="T4" position="float">
<label>Table&#xa0;4</label>
<caption>
<p>Performance of the model based on GA method to measure the LC of batch B, C and D samples.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Batch</th>
<th valign="middle" align="center">Variables selection method</th>
<th valign="middle" align="center">Number of effective wavelengths</th>
<th valign="middle" align="center">LVs</th>
<th valign="middle" align="center">Rc</th>
<th valign="middle" align="center">RMSECV</th>
<th valign="middle" align="center">Rp</th>
<th valign="middle" align="center">RMSEP</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" rowspan="2" align="center">B</td>
<td valign="middle" align="center">NONE</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.760</td>
<td valign="middle" align="center">0.757</td>
<td valign="middle" align="center">0.780</td>
<td valign="middle" align="center">0.717</td>
</tr>
<tr>
<td valign="middle" align="center">GA</td>
<td valign="middle" align="center">80</td>
<td valign="middle" align="center">10</td>
<td valign="middle" align="center">0.798</td>
<td valign="middle" align="center">0.698</td>
<td valign="middle" align="center">0.807</td>
<td valign="middle" align="center">0.674</td>
</tr>
<tr>
<td valign="middle" rowspan="2" align="center">C</td>
<td valign="middle" align="center">NONE</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.846</td>
<td valign="middle" align="center">0.600</td>
<td valign="middle" align="center">0.857</td>
<td valign="middle" align="center">0.694</td>
</tr>
<tr>
<td valign="middle" align="center">GA</td>
<td valign="middle" align="center">80</td>
<td valign="middle" align="center">10</td>
<td valign="middle" align="center">0.867</td>
<td valign="middle" align="center">0.550</td>
<td valign="middle" align="center">0.937</td>
<td valign="middle" align="center">0.411</td>
</tr>
<tr>
<td valign="middle" rowspan="2" align="center">D</td>
<td valign="middle" align="center">NONE</td>
<td valign="middle" align="center">387</td>
<td valign="middle" align="center">9</td>
<td valign="middle" align="center">0.693</td>
<td valign="middle" align="center">0.039</td>
<td valign="middle" align="center">0.657</td>
<td valign="middle" align="center">0.412</td>
</tr>
<tr>
<td valign="middle" align="center">GA</td>
<td valign="middle" align="center">80</td>
<td valign="middle" align="center">10</td>
<td valign="middle" align="center">0.717</td>
<td valign="middle" align="center">0.373</td>
<td valign="middle" align="center">0.756</td>
<td valign="middle" align="center">0.353</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Robustness of the updated model by SS-FPME method</title>
<p>For the batch B samples of snow pears, we used master model A to directly predict the LC of the prediction data set of the batch B samples (Bpre), obtaining an Rp of 0.823 and an RMSEP of 0.641, as shown in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7A</bold>
</xref>. Using the independent SS-FPME method, we obtained a new regression coefficient matrix (bs_AB) by using the regression coefficient matrix of master model A (bm_A) to supervise learning on the calibration data set of the batch B samples (Bcal). Ind-model AB was then established to predict the LC of Bpre, and its predictive ability was slightly better than that of master model A. <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7B</bold>
</xref> shows that the Rp value increased from 0.823 to 0.837, and the RMSEP value decreased from 0.641 to 0.614.</p>
<fig id="f7" position="float">
<label>Figure&#xa0;7</label>
<caption>
<p>Performance in predicting the LC of the batch B samples by <bold>(A)</bold> master model A and <bold>(B)</bold> Ind-model AB.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g007.tif"/>
</fig>
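<p>As a reading aid for the Rp and RMSEP values reported in this section, the following minimal Python sketch computes both statistics. It assumes, following the usual convention, that Rp is the Pearson correlation coefficient between the reference and predicted LC values of a prediction set and that RMSEP is the root mean square of the prediction residuals. The function name rp_and_rmsep and the sample numbers are hypothetical and are not taken from this study.</p>
<preformat>
```python
import numpy as np


def rp_and_rmsep(y_ref, y_pred):
    """Return (Rp, RMSEP) for reference and predicted values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Rp: Pearson correlation between reference and predicted values.
    rp = float(np.corrcoef(y_ref, y_pred)[0, 1])
    # RMSEP: root mean square of the prediction residuals.
    rmsep = float(np.sqrt(np.mean((y_ref - y_pred) ** 2)))
    return rp, rmsep


# Hypothetical values for illustration only (not data from this study).
y_ref = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.5, 3.5, 3.5]
rp, rmsep = rp_and_rmsep(y_ref, y_pred)
print(f"Rp = {rp:.3f}, RMSEP = {rmsep:.3f}")
```
</preformat>
<p>For these illustrative numbers every residual has a magnitude of 0.5, so the RMSEP is 0.5 and the Rp is approximately 0.894. A higher Rp together with a lower RMSEP therefore indicates predictions that both track and closely match the reference values.</p>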
<p>For the batch C samples of snow pears, <xref ref-type="fig" rid="f8">
<bold>Figure&#xa0;8A</bold>
</xref> shows that master model A performed poorly when used to directly predict the LC of the prediction data set of the batch C samples (Cpre), with an Rp of 0.602 and an RMSEP of 1.703. Using the independent SS-FPME method, we obtained the regression coefficient matrix (bs_AC) and the Ind-model AC by using bm_A to supervise learning on the calibration data set of the batch C samples (Ccal). The prediction performance improved greatly, with an Rp of 0.940 and an RMSEP of 0.433, as shown in <xref ref-type="fig" rid="f8">
<bold>Figure&#xa0;8B</bold>
</xref>. Using the sequential SS-FPME method, we first used the regression coefficient matrix of master model A (bm_A) to supervise learning on Bcal and construct bs_AB; we then used bs_AB to supervise learning on Ccal and construct bs_ABC, and established the Seq-model ABC to predict the LC of Cpre. Compared with the Ind-model AC, the prediction performance improved further: the Rp value increased from 0.940 to 0.952 and the RMSEP value decreased from 0.433 to 0.383, as shown in <xref ref-type="fig" rid="f8">
<bold>Figure&#xa0;8C</bold>
</xref>.</p>
<fig id="f8" position="float">
<label>Figure&#xa0;8</label>
<caption>
<p>Performance in predicting the LC of the batch C samples by <bold>(A)</bold> master model A, <bold>(B)</bold> Ind-model AC, and <bold>(C)</bold> Seq-model ABC.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g008.tif"/>
</fig>
<p>The analysis process for the batch D samples was the same as that for the batch C samples, and the experimental results are shown in <xref ref-type="fig" rid="f9">
<bold>Figure&#xa0;9</bold>
</xref>. First, master model A was used directly to predict the LC of the batch D samples, and the performance was poor, with an Rp of 0.413 and an RMSEP of 0.916 (<xref ref-type="fig" rid="f9">
<bold>Figure&#xa0;9A</bold>
</xref>). Then, we built the Ind-model AD from the calibration data set of the batch D samples (Dcal) and bm_A using the independent SS-FPME method. The Rp and RMSEP of the Ind-model AD in predicting the LC of the prediction data set of the batch D samples (Dpre) were 0.806 and 0.322 (<xref ref-type="fig" rid="f9">
<bold>Figure&#xa0;9B</bold>
</xref>), respectively. For the sequential SS-FPME method, we built the bs_ABCD and Seq-model ABCD by updating the Seq-model ABC based on the Dcal and bs_ABC. The Rp and RMSEP of Seq-model ABCD were 0.831 and 0.309 (<xref ref-type="fig" rid="f9">
<bold>Figure&#xa0;9C</bold>
</xref>), respectively. Therefore, updating the master model by sequential SS-FPME supervised learning on each new batch of samples further increased the Rp and reduced the RMSEP for the batch C and D samples; that is, the model updated by the sequential SS-FPME method outperformed the one updated by the independent SS-FPME method. This result indicated that the sequential update reinforced the model features learned from previous batches.</p>
<fig id="f9" position="float">
<label>Figure&#xa0;9</label>
<caption>
<p>Performance in predicting the LC of the batch D samples by <bold>(A)</bold> master model A, <bold>(B)</bold> Ind-model AD, and <bold>(C)</bold> Seq-model ABCD.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g009.tif"/>
</fig>
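<p>The difference between the independent and the sequential updating strategies lies only in which coefficient vector supervises each update. The Python sketch below illustrates that chaining order. It is not an implementation of SS-FPME: the function update_coefficients is a hypothetical stand-in (a least-squares fit on the new calibration set, penalized for departing from the prior coefficients), the penalty weight lam is an assumed parameter, and all data are random numbers rather than spectra from this study. Only the names bm_A, bs_AB, bs_AC, bs_AD, bs_ABC, and bs_ABCD follow the text.</p>
<preformat>
```python
import numpy as np


def update_coefficients(b_prior, X_cal, y_cal, lam=1.0):
    """Stand-in update (NOT SS-FPME): least squares on the new batch,
    penalized for moving away from the prior coefficients b_prior."""
    n_vars = X_cal.shape[1]
    lhs = X_cal.T @ X_cal + lam * np.eye(n_vars)
    rhs = X_cal.T @ y_cal + lam * b_prior
    return np.linalg.solve(lhs, rhs)


rng = np.random.default_rng(0)
n_vars = 5
bm_A = rng.normal(size=n_vars)  # hypothetical master coefficients

# Hypothetical calibration sets for batches B, C, and D (random numbers).
cal = {}
for name in ("B", "C", "D"):
    X = rng.normal(size=(20, n_vars))
    y = X @ (bm_A + 0.1 * rng.normal(size=n_vars))
    cal[name] = (X, y)

# Independent strategy: every batch is updated starting from bm_A.
bs_AB = update_coefficients(bm_A, *cal["B"])
bs_AC = update_coefficients(bm_A, *cal["C"])
bs_AD = update_coefficients(bm_A, *cal["D"])

# Sequential strategy: each update starts from the previous result.
bs_ABC = update_coefficients(bs_AB, *cal["C"])
bs_ABCD = update_coefficients(bs_ABC, *cal["D"])
```
</preformat>
<p>In this sketch, bs_AC and bs_ABC are fitted to the same batch C calibration data and differ only in their starting point (bm_A versus bs_AB), which mirrors the comparison between the Ind-model AC and the Seq-model ABC described above.</p>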
<p>In updating the master model with the independent and the sequential SS-FPME methods, the constraint on the regression coefficients had to be adjusted; the adjusted coefficients reflect the variation in the NIR spectra between the current batch and the new batch of snow pear samples. <xref ref-type="fig" rid="f10">
<bold>Figures&#xa0;10A, B</bold>
</xref> show how the regression coefficients of master model A evolved under the independent SS-FPME method and the sequential SS-FPME method, respectively, which helps in understanding the batch-wise adjustment of the regression coefficients. The regression coefficients of the updated batch B model were essentially the same as those of master model A, whereas those of the updated batch C and D models varied greatly, thereby improving the prediction performance of the model. The differences in regression coefficients were unique to each batch of samples. It was difficult, however, to extract information related to chemical composition with which to analyze the causes of these spectral changes.</p>
<fig id="f10" position="float">
<label>Figure&#xa0;10</label>
<caption>
<p>Evolution process of regression coefficients of master model A in <bold>(A)</bold> the independent SS-FPME method and <bold>(B)</bold> the sequential SS-FPME method.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1128993-g010.tif"/>
</fig>
<p>Although we used the same microfiber optic spectrometer to collect the NIR spectra and followed the same standard procedures to measure the LC reference values for each batch of samples, master model A performed poorly in predicting the LC of the batch B, C, and D samples, with lower Rp values and higher RMSEP values. Variations in the NIR spectra of the samples could have several causes, including changes in the detector light source, the temperature of the acquisition environment, the operation of spectral collection and reference value determination, and the process and equipment used for sample pretreatment. In this study, <xref ref-type="table" rid="T5">
<bold>Table&#xa0;5</bold>
</xref> shows that updating the model built on the batch A samples with the SS-FPME method improved the performance of predicting the LC of the batch B, C, and D samples. Compared with the independent SS-FPME method, the sequential SS-FPME method reinforced the model features obtained from previous supervised learning and gave better prediction performance. Therefore, updating the model by supervised learning on a new batch of samples with the sequential SS-FPME method could improve the robustness and migration ability of the model for detecting the LC of snow pears and could provide technical support for the development of a portable detection device.</p>
<table-wrap id="T5" position="float">
<label>Table&#xa0;5</label>
<caption>
<p>Robustness of the different updated models based on NIR spectroscopy for detecting LC in snow pears.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Batch of samples</th>
<th valign="middle" align="center">Method</th>
<th valign="middle" align="center">Rp</th>
<th valign="middle" align="center">RMSEP</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" rowspan="2" align="center">B</td>
<td valign="middle" align="left">Model A directly predicts batch B</td>
<td valign="middle" align="center">0.824</td>
<td valign="middle" align="center">0.641</td>
</tr>
<tr>
<td valign="middle" align="left">Ind-model AB predicts batch B</td>
<td valign="middle" align="center">0.837</td>
<td valign="middle" align="center">0.614</td>
</tr>
<tr>
<td valign="middle" rowspan="3" align="center">C</td>
<td valign="middle" align="left">Model A directly predicts batch C</td>
<td valign="middle" align="center">0.602</td>
<td valign="middle" align="center">1.703</td>
</tr>
<tr>
<td valign="middle" align="left">Ind-model AC predicts batch C</td>
<td valign="middle" align="center">0.940</td>
<td valign="middle" align="center">0.433</td>
</tr>
<tr>
<td valign="middle" align="left">Seq-model ABC predicts batch C</td>
<td valign="middle" align="center">0.952</td>
<td valign="middle" align="center">0.383</td>
</tr>
<tr>
<td valign="middle" rowspan="3" align="center">D</td>
<td valign="middle" align="left">Model A directly predicts batch D</td>
<td valign="middle" align="center">0.413</td>
<td valign="middle" align="center">0.917</td>
</tr>
<tr>
<td valign="middle" align="left">Ind-model AD predicts batch D</td>
<td valign="middle" align="center">0.806</td>
<td valign="middle" align="center">0.332</td>
</tr>
<tr>
<td valign="middle" align="left">Seq-model ABCD predicts batch D</td>
<td valign="middle" align="center">0.831</td>
<td valign="middle" align="center">0.309</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s4" sec-type="conclusions">
<label>4</label>
<title>Conclusion</title>
<p>We examined the robustness of the calibration model used to predict the LC of different batches of snow pears based on NIR spectroscopy. The results showed that updating the calibration model with the SS-FPME method, using a small number of samples from a new batch of snow pears, improved its performance. NIR spectra were collected with a microfiber optic spectrometer at nine measurement positions on snow pear samples purchased at four different periods. The average NIR spectrum of each sample in batch A was then processed by the 1-Der (11), 2-Der (11), SNV, and MSC pretreatment methods. Next, we selected 19, 76, 80, 24, and 20 effective wavelengths using the SPA, SiPLS, GA, BOSS, and CARS variable selection methods, respectively, and compared the resulting models for predicting the LC of snow pears. The SNV-GA-PLSR model (master model A) had higher Rc and Rp values (0.846 and 0.854) and lower RMSECV and RMSEP values (0.637 and 0.624), and the differences between Rc and Rp and between RMSECV and RMSEP were also smaller. This model was therefore used as the prediction model for detecting the LC in the other three batches of snow pear samples. Although we used the same microfiber optic spectrometer to collect the NIR spectra and followed the same standard procedures to measure the LC reference values for each batch of samples, master model A performed poorly in predicting the LC of the batch B, C, and D samples, with lower Rp values and higher RMSEP values. We therefore compared the independent SS-FPME method and the sequential SS-FPME method for updating master model A to predict the LC of snow pears.</p>
<p>For the batch B samples, the predictive ability of the updated model (Ind-model AB) improved: the Rp value increased from 0.823 to 0.837, and the RMSEP value decreased from 0.641 to 0.614. For the batch C samples, the performance of the Seq-model ABC improved greatly: the Rp value increased from 0.602 to 0.952, and the RMSEP value decreased from 1.703 to 0.383. For the batch D samples, the performance of the Seq-model ABCD also improved: the Rp value increased from 0.413 to 0.831, and the RMSEP value decreased from 0.916 to 0.309. Moreover, the model updated by the sequential SS-FPME method predicted better than the one updated by the independent SS-FPME method, which indicated that the sequential update reinforced the model features learned from previous batches. Therefore, updating the model by supervised learning on new batch samples with the sequential SS-FPME method improved the robustness and migration ability of the model for detecting the LC of snow pears and provides technical support for the development of a portable detection device.</p>
</sec>
<sec id="s5" sec-type="data-availability">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SM1"><bold>Supplementary Material</bold></xref>. Further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6" sec-type="author-contributions">
<title>Author contributions</title>
<p>Conceptualization, XW and GL; methodology, XW; software, XW; validation, XW, XF and WW; formal analysis, XW; investigation, XW; resources, XW; data curation, XW; writing&#x2014;original draft preparation, XW; writing&#x2014;review and editing, XW; visualization, XF; supervision, GL; project administration, GL; funding acquisition, XW and WW. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec id="s7" sec-type="funding-information">
<title>Funding</title>
<p>The authors are grateful for the support of the Natural Science Foundation of Chongqing (Grant No. CSTB2022NSCQ-MSX1140) and the Science and Technology Research Program of Chongqing Municipal Commission (Grant Nos. KJQN201903114 and KJQN202103105).</p>
</sec>
<sec id="s8" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s9" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s10" sec-type="supplementary-material">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpls.2023.1128993/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpls.2023.1128993/full#supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Table_1.xlsx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Table_2.xlsx" id="SM2" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Table_3.xlsx" id="SM3" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Table_4.xlsx" id="SM4" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Anderson</surname> <given-names>N. T.</given-names>
</name>
<name>
<surname>Walsh</surname> <given-names>K. B.</given-names>
</name>
<name>
<surname>Flynn</surname> <given-names>J. R.</given-names>
</name>
<name>
<surname>Walsh</surname> <given-names>J. P.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Achieving robustness across season, location and cultivar for a NIRS model for intact mango fruit dry matter content. II. local PLS and nonlinear models</article-title>. <source>Postharvest Biol. Technol.</source> <volume>171</volume>, <fpage>111358</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.postharvbio.2020.111358</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Anderson</surname> <given-names>N. T.</given-names>
</name>
<name>
<surname>Walsh</surname> <given-names>K. B.</given-names>
</name>
<name>
<surname>Subedi</surname> <given-names>P. P.</given-names>
</name>
<name>
<surname>Hayes</surname> <given-names>C. H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Achieving robustness across season, location and cultivar for a NIRS model for intact mango fruit dry matter content</article-title>. <source>Postharvest Biol. Technol.</source> <volume>168</volume>, <fpage>111202</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.postharvbio.2020.111202</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Assis</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Ramos</surname> <given-names>R. S.</given-names>
</name>
<name>
<surname>Silva</surname> <given-names>L. A.</given-names>
</name>
<name>
<surname>Kist</surname> <given-names>V.</given-names>
</name>
<name>
<surname>Barbosa</surname> <given-names>M. H. P.</given-names>
</name>
<name>
<surname>Teofilo</surname> <given-names>R. F.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Prediction of lignin content in different parts of sugarcane using near-infrared spectroscopy (NIR), ordered predictors selection (OPS), and partial least squares (PLS)</article-title>. <source>Appl. Spectrosc</source> <volume>71</volume>, <fpage>2001</fpage>&#x2013;<lpage>2012</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1177/0003702817704147</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Balabin</surname> <given-names>R. M.</given-names>
</name>
<name>
<surname>Smirnov</surname> <given-names>S. V.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Variable selection in near-infrared spectroscopy: benchmarking of feature selection methods on biodiesel data</article-title>. <source>Anal. Chim. Acta</source> <volume>692</volume>, <fpage>63</fpage>&#x2013;<lpage>72</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.aca.2011.03.006</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bunzel</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Schussler</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Tchetseubu Saha</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Chemical characterization of klason lignin preparations from plant-based foods</article-title>. <source>J. Agric. Food Chem.</source> <volume>59</volume>, <fpage>12506</fpage>&#x2013;<lpage>12513</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1021/jf2031378</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cai</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Nie</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Nie</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<etal/>
</person-group>. (<year>2010</year>). <article-title>Study of the structure and biosynthetic pathway of lignin in stone cells of pear</article-title>. <source>Scientia Hortic.</source> <volume>125</volume>, <fpage>374</fpage>&#x2013;<lpage>379</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.scienta.2010.04.029</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cybulska</surname> <given-names>I.</given-names>
</name>
<name>
<surname>Brudecki</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Rosentrater</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Julson</surname> <given-names>J. L.</given-names>
</name>
<name>
<surname>Lei</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Comparative study of organosolv lignin extracted from prairie cordgrass, switchgrass and corn stover</article-title>. <source>Bioresour Technol.</source> <volume>118</volume>, <fpage>30</fpage>&#x2013;<lpage>36</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.biortech.2012.05.073</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>de Assis</surname> <given-names>M. W.</given-names>
</name>
<name>
<surname>De Fusco</surname> <given-names>D. O.</given-names>
</name>
<name>
<surname>Costa</surname> <given-names>R. C.</given-names>
</name>
<name>
<surname>de Lima</surname> <given-names>K. M.</given-names>
</name>
<name>
<surname>Cunha Junior</surname> <given-names>L. C.</given-names>
</name>
<name>
<surname>de Almeida Teixeira</surname> <given-names>G. H.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>PLS, iPLS, GA-PLS models for soluble solids content, pH and acidity determination in intact dovyalis fruit using near-infrared spectroscopy</article-title>. <source>J. Sci. Food Agric.</source> <volume>98</volume>, <fpage>5750</fpage>&#x2013;<lpage>5755</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/jsfa.9123</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname> <given-names>B. C.</given-names>
</name>
<name>
<surname>Yun</surname> <given-names>Y. H.</given-names>
</name>
<name>
<surname>Cao</surname> <given-names>D. S.</given-names>
</name>
<name>
<surname>Yin</surname> <given-names>Y. L.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>W. T.</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>H. M.</given-names>
</name>
<etal/>
</person-group>. (<year>2016</year>). <article-title>A bootstrapping soft shrinkage approach for variable selection in chemical modeling</article-title>. <source>Anal. Chim. Acta</source> <volume>908</volume>, <fpage>63</fpage>&#x2013;<lpage>74</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.aca.2016.01.001</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname> <given-names>B. C.</given-names>
</name>
<name>
<surname>Yun</surname> <given-names>Y. H.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>Y. Z.</given-names>
</name>
<name>
<surname>Yi</surname> <given-names>L. Z.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>A novel variable selection approach that iteratively optimizes variable space using weighted binary matrix sampling</article-title>. <source>Analyst</source> <volume>139</volume>, <fpage>4836</fpage>&#x2013;<lpage>4845</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1039/c4an00730a</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Douglas</surname> <given-names>R. K.</given-names>
</name>
<name>
<surname>Nawar</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Alamar</surname> <given-names>M. C.</given-names>
</name>
<name>
<surname>Mouazen</surname> <given-names>A. M.</given-names>
</name>
<name>
<surname>Coulon</surname> <given-names>F.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Rapid prediction of total petroleum hydrocarbons concentration in contaminated soil using vis-NIR spectroscopy and regression techniques</article-title>. <source>Sci. Total Environ.</source> <volume>616-617</volume>, <fpage>147</fpage>&#x2013;<lpage>155</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.scitotenv.2017.10.323</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Du</surname> <given-names>X.-l.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X.-y.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>W.-h.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>J.-l.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Genetic algorithm optimized non-destructive prediction on property of mechanically injured peaches during postharvest storage by portable visible/shortwave near-infrared spectroscopy</article-title>. <source>Scientia Hortic.</source> <volume>249</volume>, <fpage>240</fpage>&#x2013;<lpage>249</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.scienta.2019.01.057</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Haaland</surname> <given-names>D. M.</given-names>
</name>
<name>
<surname>Thomas</surname> <given-names>E. V.</given-names>
</name>
</person-group> (<year>1988</year>). <article-title>Partial least-squares methods for spectral analyses. 1. Relation to other quantitative calibration methods and the extraction of qualitative information</article-title>. <source>Analytical Chem.</source> <volume>60</volume> (<issue>11</issue>), <fpage>1193</fpage>&#x2013;<lpage>1202</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1021/ac00162a020</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Mei</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Identification of solid state fermentation degree with FT-NIR spectroscopy: Comparison of wavelength variable selection methods of CARS and SCARS</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>149</volume>, <fpage>1</fpage>&#x2013;<lpage>7</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2015.04.024</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jie</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Xie</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Fu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Rao</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Ying</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Variable selection for partial least squares analysis of soluble solids content in watermelon using near-infrared diffuse transmission technique</article-title>. <source>J. Food Eng.</source> <volume>118</volume>, <fpage>387</fpage>&#x2013;<lpage>392</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.jfoodeng.2013.04.027</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liang</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Wei</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Fang</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Deng</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Shen</surname> <given-names>K.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>Prediction of holocellulose and lignin content of pulp wood feedstock using near infrared spectroscopy and variable selection</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>225</volume>, <elocation-id>117515</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2019.117515</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Fan</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>Z.</given-names>
</name>
<etal/>
</person-group>. (<year>2014</year>). <article-title>Variable selection in visible and near-infrared spectral analysis for noninvasive determination of soluble solids content of &#x2018;Ya&#x2019; pear</article-title>. <source>Food Analytical Methods</source> <volume>7</volume>, <fpage>1891</fpage>&#x2013;<lpage>1902</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s12161-014-9832-8</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Cao</surname> <given-names>D.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration</article-title>. <source>Anal. Chim. Acta</source> <volume>648</volume>, <fpage>77</fpage>&#x2013;<lpage>84</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.aca.2009.06.046</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Z.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Quantitative analysis of glycated albumin in serum based on ATR-FTIR spectrum combined with SiPLS and SVM</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>201</volume>, <fpage>249</fpage>&#x2013;<lpage>257</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2018.05.022</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mercader</surname> <given-names>M. B.</given-names>
</name>
<name>
<surname>Puigdomenech</surname> <given-names>A. R.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Near infrared multivariate model maintenance: the cornerstone of success</article-title>. <source>NIR News</source> <volume>25</volume>, <fpage>7</fpage>&#x2013;<lpage>9</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1255/nirn.1480</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nicola&#xef;</surname> <given-names>B. M.</given-names>
</name>
<name>
<surname>Verlinden</surname> <given-names>B. E.</given-names>
</name>
<name>
<surname>Desmet</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Saevels</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Saeys</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Theron</surname> <given-names>K.</given-names>
</name>
<etal/>
</person-group>. (<year>2008</year>). <article-title>Time-resolved and continuous wave NIR reflectance spectroscopy to predict soluble solids content and firmness of pear</article-title>. <source>Postharvest Biol. Technol.</source> <volume>47</volume>, <fpage>68</fpage>&#x2013;<lpage>74</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.postharvbio.2007.06.001</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rahman</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Faqeerzada</surname> <given-names>M. A.</given-names>
</name>
<name>
<surname>Cho</surname> <given-names>B. K.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Hyperspectral imaging for predicting the allicin and soluble solid content of garlic with variable selection algorithms and chemometric models</article-title>. <source>J. Sci. Food Agric.</source> <volume>98</volume>, <fpage>4715</fpage>&#x2013;<lpage>4725</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/jsfa.9006</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rahman</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Kandpal</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Lohumi</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Kim</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Mo</surname> <given-names>C.</given-names>
</name>
<etal/>
</person-group>. (<year>2017</year>). <article-title>Nondestructive estimation of moisture content, pH and soluble solid contents in intact tomatoes using hyperspectral imaging</article-title>. <source>Appl. Sci.</source> <volume>7</volume>, <page-range>1&#x2013;13</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/app7010109</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sheng</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Dong</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Yin</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Nondestructive determination of lignin content in Korla fragrant pear based on near-infrared spectroscopy</article-title>. <source>Spectrosc. Lett.</source> <volume>53</volume>, <fpage>306</fpage>&#x2013;<lpage>314</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1080/00387010.2020.1740276</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Xie</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Rapid analysis of soluble solid content in navel orange based on visible-near infrared spectroscopy combined with a swarm intelligence optimization method</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>228</volume>, <elocation-id>117815</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2019.117815</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tao</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Khanizadeh</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Anatomy, ultrastructure and lignin distribution of stone cells in two Pyrus species</article-title>. <source>Plant Sci.</source> <volume>176</volume>, <fpage>413</fpage>&#x2013;<lpage>419</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.plantsci.2008.12.011</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tao</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Xie</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Sex determination of silkworm pupae using VIS-NIR hyperspectral imaging combined with chemometrics</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>208</volume>, <fpage>7</fpage>&#x2013;<lpage>12</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2018.09.049</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Fan</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>W. A.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Bi-layer model for nondestructive prediction of soluble solids content in apple based on reflectance spectra and peel pigments</article-title>. <source>Food Chem.</source> <volume>239</volume>, <fpage>1055</fpage>&#x2013;<lpage>1063</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.foodchem.2017.07.045</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Development of a non-destructive method for detection of the juiciness of pear <italic>via</italic> VIS/NIR spectroscopy combined with chemometric methods</article-title>. <source>Foods</source> <volume>9</volume>, <page-range>1&#x2013;15</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/foods9121778</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
<name>
<surname>He</surname> <given-names>F.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Nondestructive analysis of internal quality in pears with a self-made near-infrared spectrum detector combined with multivariate data processing</article-title>. <source>Foods</source> <volume>10</volume>, <page-range>1&#x2013;24</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/foods10061315</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>He</surname> <given-names>F.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Rapid non-destructive analysis of lignin using NIR spectroscopy and chemo-metrics</article-title>. <source>Food Energy Secur.</source> <volume>10</volume>, <page-range>1&#x2013;15</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/fes3.289</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xiaobo</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Jiewen</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Povey</surname> <given-names>M. J.</given-names>
</name>
<name>
<surname>Holmes</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Hanpin</surname> <given-names>M.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Variables selection methods in near-infrared spectroscopy</article-title>. <source>Anal. Chim. Acta</source> <volume>667</volume>, <fpage>14</fpage>&#x2013;<lpage>32</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.aca.2010.03.048</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xue</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Yao</surname> <given-names>J. L.</given-names>
</name>
<name>
<surname>Qin</surname> <given-names>M. F.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>M. Y.</given-names>
</name>
<name>
<surname>Allan</surname> <given-names>A. C.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>D. F.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>PbrmiR397a regulates lignification during stone cell development in pear fruit</article-title>. <source>Plant Biotechnol. J.</source> <volume>17</volume>, <fpage>103</fpage>&#x2013;<lpage>117</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1111/pbi.12950</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Qi</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Fu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Ying</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Variable selection in visible and near-infrared spectra: Application to on-line determination of sugar content in pears</article-title>. <source>J. Food Eng.</source> <volume>109</volume>, <fpage>142</fpage>&#x2013;<lpage>147</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.jfoodeng.2011.09.022</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Xie</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Identification of unhealthy Panax notoginseng from different geographical origins by means of multi-label classification</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>222</volume>, <elocation-id>117243</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2019.117243</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Tian</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Gao</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Xiong</surname> <given-names>Y.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>A modification of the bootstrapping soft shrinkage approach for spectral variable selection in the issue of over-fitting, model accuracy and variable selection credibility</article-title>. <source>Spectrochim Acta A Mol. Biomol Spectrosc</source> <volume>210</volume>, <fpage>362</fpage>&#x2013;<lpage>371</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.saa.2018.10.034</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Yin</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Jin</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Fang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>Y.</given-names>
</name>
<etal/>
</person-group>. (<year>2014</year>). <article-title>Stone cell distribution and lignin structure in various pear varieties</article-title>. <source>Scientia Hortic.</source> <volume>174</volume>, <fpage>142</fpage>&#x2013;<lpage>150</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.scienta.2014.05.018</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zeaiter</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Roger</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Bellon-Maurel</surname> <given-names>V.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Dynamic orthogonal projection. A new method to maintain the on-line robustness of multivariate calibrations. Application to NIR-based monitoring of wine fermentations</article-title>. <source>Chemometr. Intell. Lab. Syst.</source> <volume>80</volume>, <fpage>227</fpage>&#x2013;<lpage>235</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.chemolab.2005.06.011</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Prediction of soluble solids content, firmness and pH of pear by signals of electronic nose sensors</article-title>. <source>Anal. Chim. Acta</source> <volume>606</volume>, <fpage>112</fpage>&#x2013;<lpage>118</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.aca.2007.11.003</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zou</surname> <given-names>P.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Traditional Chinese medicine, food therapy, and hypertension control: A narrative review of Chinese literature</article-title>. <source>Am. J. Chin. Med.</source> <volume>44</volume>, <fpage>1579</fpage>&#x2013;<lpage>1594</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1142/S0192415X16500889</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>