<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Anal. Sci.</journal-id>
<journal-title>Frontiers in Analytical Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Anal. Sci.</abbrev-journal-title>
<issn pub-type="epub">2673-9283</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">754447</article-id>
<article-id pub-id-type="doi">10.3389/frans.2021.754447</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Analytical Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Different Methods for Determining the Dimensionality of Multivariate Models</article-title>
<alt-title alt-title-type="left-running-head">Rutledge et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Dimensionality of Multivariate Models</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Rutledge</surname>
<given-names>Douglas N.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1259545/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Roger</surname>
<given-names>Jean-Michel</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1271768/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lesnoff</surname>
<given-names>Matthieu</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>ChemHouse Research Group, <addr-line>Montpellier</addr-line>, <country>France</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>INRAe, AgroParisTech, UMR SayFood, Universit&#xe9; Paris-Saclay, <addr-line>Paris</addr-line>, <country>France</country>
</aff>
<aff id="aff3">
<label>
<sup>3</sup>
</label>National Wine and Grape Industry Centre, Charles Sturt University, <addr-line>Wagga Wagga</addr-line>, <addr-line>NSW</addr-line>, <country>Australia</country>
</aff>
<aff id="aff4">
<label>
<sup>4</sup>
</label>UMR ITAP, INRAe, Montpellier Institut Agro, Univ Montpellier, <addr-line>Montpellier</addr-line>, <country>France</country>
</aff>
<aff id="aff5">
<label>
<sup>5</sup>
</label>SELMET, CIRAD, INRAe, Institut Agro, Univ Montpellier, <addr-line>Montpellier</addr-line>, <country>France</country>
</aff>
<aff id="aff6">
<label>
<sup>6</sup>
</label>CIRAD, UMR SELMET, <addr-line>Montpellier</addr-line>, <country>France</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/442161/overview">Hoang Vu Dang</ext-link>, Hanoi University of Pharmacy, Vietnam</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1005782/overview">Jahan B Ghasemi</ext-link>, University of Tehran,&#x20;Iran</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1331908/overview">Ludovic Duponchel</ext-link>, Universit&#xe9; de Lille, France</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Douglas N. Rutledge, <email>rutledge@agroparistech.fr</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Chemometrics, a section of the journal Frontiers in Analytical Science</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>18</day>
<month>10</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>1</volume>
<elocation-id>754447</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>08</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>06</day>
<month>10</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Rutledge, Roger and Lesnoff.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Rutledge, Roger and Lesnoff</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>A tricky aspect in the use of all multivariate analysis methods is the choice of the number of Latent Variables to use in the model, whether in the case of exploratory methods such as Principal Components Analysis (PCA) or predictive methods such as Principal Components Regression (PCR) or Partial Least Squares regression (PLS). For exploratory methods, we want to know which Latent Variables deserve to be selected for interpretation and which contain only noise. For predictive methods, we want to ensure that we include all the variability of interest for the prediction, without introducing variability that would lead to a reduction in the quality of the predictions for samples other than those used to create the multivariate&#x20;model.</p>
</abstract>
<kwd-group>
<kwd>multivariate models</kwd>
<kwd>dimensionality</kwd>
<kwd>latent variables</kwd>
<kwd>regression</kwd>
<kwd>cross validation</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<p>In the case of predictive methods such as PLS, the most common procedure to determine the number of Latent Variables for use in the model is Cross Validation, which is based on the difference between the vector of observed values, <bold>y</bold>, and the vector of predicted values, <bold>&#x177;</bold>.</p>
<p>In this article, we will first present this procedure and its extensions, and then other methods based on entirely different principles. Many of these methods may also apply to exploratory methods.</p>
<p>These alternatives to Cross Validation include methods based on the characteristics of the regression coefficient vectors, such as the Durbin-Watson Criterion, the Morphological Factor, the Variance or Norm, and the repeatability of the vectors calculated on random subsets of the individuals. Another group of methods is based on characterizing the structure of the <bold>X</bold> matrices after each successive deflation.</p>
<p>The user is often baffled by the multitude of indicators that are available, since no single criterion (even the classical Cross-Validation) works perfectly in all cases. We propose an empirical method to facilitate the final choice of the number of Latent Variables. A set of indicators is chosen and their evolution as a function of the number of Latent Variables extracted is synthesized by a Principal Components Analysis. The set of criteria chosen here is not exhaustive, and the efficacy of the method could be improved by including others.</p>
<sec id="s1">
<title>Introduction</title>
<p>A tricky aspect in the use of all multivariate analysis methods is the determination of the number of Latent Variables, both for exploratory methods such as Principal Components Analysis (PCA) and Independent Components Analysis (ICA), and predictive methods such as Principal Components Regression (PCR), Partial Least Squares regression (PLS) or PLS Discriminant Analysis (PLS-DA). For exploratory methods, we want to know which Latent Variables deserve to be selected for interpretation and which contain only noise. For predictive methods, we want to ensure that we include all the variability of interest for the prediction, without introducing variability that would lead to a reduction in the quality of the predictions for samples other than those used to create the multivariate&#x20;model.</p>
<p>Whatever the type of method (exploratory or predictive), the most common procedure consists in examining the evolution of a criterion, as a function of the number of Latent Variables calculated. In the case of predictive methods such as PLS, the most common criterion is the Cross Validation error, which is based on the difference between the vector of observed values, <bold>y</bold>, and the vector of predicted values, <bold>&#x177;</bold>. But many other criteria can be used. In this article, we will first present the cross-validation procedure and its extensions, and then other methods based on entirely different principles. The objective of this article is not to make an exhaustive review of these criteria, but to present some of those of most interest for chemometrics.</p>
<p>Principal Components Analysis is based on the mathematical transformation of the original variables in the matrix <bold>X</bold> into a smaller number of uncorrelated variables, <bold>T</bold>:<disp-formula id="e1">
<mml:math id="m1">
<mml:mrow>
<mml:mi mathvariant="bold">X&#x3d;T</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msup>
<mml:mi mathvariant="bold">P</mml:mi>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="bold">R</mml:mi>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>where the matrices <bold>T</bold> and <bold>P</bold> represent, respectively, the vectors of factorial coordinates (&#x201c;scores&#x201d;) and factorial contributions (&#x201c;loadings&#x201d;) derived from&#x20;<bold>X</bold>.</p>
<p>This method is interesting because, by construction, the PCs are uncorrelated and it is not possible to have more PCs than the rank of <bold>X</bold>, i.e.,&#x20;min (N<sub>individuals</sub>, N<sub>variables</sub>) if the data are not centered and min (N<sub>individuals</sub>-1, N<sub>variables</sub>) otherwise. In addition, since the first PCs correspond to the directions of greatest dispersion of the individuals, it is possible to retain only a small number of PCs, <bold>T&#x2a;</bold>, in the calculation of the coefficients of a PCR regression model.<disp-formula id="equ1">
<mml:math id="m2">
<mml:mrow>
<mml:mi mathvariant="bold">B&#x3d;</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold">T</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold">&#x2a;T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mi mathvariant="bold">T</mml:mi>
<mml:mi mathvariant="bold">&#x2a;</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mi mathvariant="bold">T</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold">&#x2a;T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="bold">Y</mml:mi>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
<list list-type="simple">
<list-item>
<p>The values of new objects can then be predicted by the classical equation:</p>
</list-item>
</list>
<disp-formula id="e3">
<mml:math id="m3">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi mathvariant="bold">Y</mml:mi>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi mathvariant="bold">&#x3d;TB&#x3d;XPB</mml:mi>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
<p>PLS regression (Partial Least Squares regression) also makes it possible to link a set of dependent variables, <bold>Y</bold>, to a set of independent variables, <bold>X</bold>, when the number of variables (independent and dependent) is high.<list list-type="simple">
<list-item>
<p>The independent variables, <bold>X</bold>, and dependent variables, <bold>Y</bold>, are decomposed as follows:</p>
</list-item>
</list>
<disp-formula id="e4">
<mml:math id="m4">
<mml:mrow>
<mml:mi mathvariant="bold">X&#x3d;T</mml:mi>
<mml:msup>
<mml:mi mathvariant="bold">P</mml:mi>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="bold">E</mml:mi>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
<disp-formula id="e5">
<mml:math id="m5">
<mml:mrow>
<mml:mi mathvariant="bold">Y&#x3d;U</mml:mi>
<mml:msup>
<mml:mi mathvariant="bold">R</mml:mi>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="bold">F</mml:mi>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>where <bold>P</bold> and <bold>R</bold> represent the vectors of the factorial contributions (&#x201c;loadings&#x201d;) and <bold>T</bold> and <bold>U</bold> are the factorial coordinates (&#x201c;scores&#x201d;) of <bold>X</bold> and <bold>Y</bold>, respectively.<list list-type="simple">
<list-item>
<p>PLS is based on two principles:</p>
</list-item>
<list-item>
<p>1) the <bold>X</bold> factor coordinates, <bold>T</bold>, are good predictors of&#x20;<bold>Y</bold>;</p>
</list-item>
<list-item>
<p>2) there is a linear relationship between the scores <bold>T</bold> and&#x20;<bold>U.</bold>
</p>
</list-item>
</list>
</p>
<p>In the case of PLS, the model&#x2019;s regression coefficient matrix is given by:<disp-formula id="e6">
<mml:math id="m6">
<mml:mrow>
<mml:mi mathvariant="bold">B&#x3d;</mml:mi>
<mml:msup>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:msup>
<mml:mi mathvariant="bold">U</mml:mi>
<mml:mi mathvariant="bold">&#x2a;</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold">T</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold">&#x2a;T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:msup>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:msup>
<mml:mi mathvariant="bold">U</mml:mi>
<mml:mi mathvariant="bold">&#x2a;</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mi mathvariant="bold">T</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold">&#x2a;T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="bold">Y</mml:mi>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>
</p>
<p>In the case of PCR and PLS, successive scores and loadings are calculated after removing the contribution of each vector of scores from the <bold>X</bold> matrix, a process called deflation.</p>
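<p>As an illustration only, a single deflation step can be sketched as follows. This is a minimal sketch, assuming that the loading vector is obtained by projecting <bold>X</bold> onto the score vector; the function name <italic>deflate</italic> and the small data values are hypothetical and are not taken from the olive oil dataset.</p>
<preformat>
```python
def deflate(X, t):
    """Remove the contribution of the score vector t from X (one deflation step).

    X is a list of rows (n individuals x m variables); t has n entries.
    Returns the loading vector p and the deflated matrix X - t p^T.
    """
    n_rows, n_vars = len(X), len(X[0])
    tt = sum(ti * ti for ti in t)
    # Loading vector: p = X^T t / (t^T t)
    p = [sum(X[i][j] * t[i] for i in range(n_rows)) / tt for j in range(n_vars)]
    # Deflation: subtract the rank-one product t p^T from X
    X_new = [[X[i][j] - t[i] * p[j] for j in range(n_vars)] for i in range(n_rows)]
    return p, X_new


# Hypothetical toy data: 3 individuals, 2 variables
X = [[2.0, 1.0], [0.0, 1.0], [-2.0, -2.0]]
t = [1.0, 0.0, -1.0]  # illustrative score vector, not from a fitted model
p, X_deflated = deflate(X, t)

# After deflation, every column of X_deflated is orthogonal to t
residual = [sum(X_deflated[i][j] * t[i] for i in range(3)) for j in range(2)]
```
</preformat>
<p>In this sketch, the next vector of scores would be extracted from the deflated matrix rather than from the original <bold>X</bold>.</p>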
<p>To present the different methods of determining the number of Latent Variables to use in the regression models, we use a dataset consisting of the near-infrared (NIR) spectra of 106 different olive oils (<xref ref-type="sec" rid="s10">Supplementary Figure S1A</xref>). The variable to be predicted is the concentration of oleic acid (<xref ref-type="sec" rid="s10">Supplementary Figure S1B</xref>), determined by the classical method (gas chromatography) (<xref ref-type="bibr" rid="B14">Galtier et&#x20;al., 2007</xref>).</p>
<p>It should be stressed that this article is not an exhaustive review of the possible methods that can be used to determine the dimensionality of multivariate models, as was for example the article by <xref ref-type="bibr" rid="B29">Meloun et&#x20;al. (2000)</xref>. Here, a limited number of criteria have been chosen, but they are based on very different characteristics of the multivariate models. Since these criteria may not always indicate the same dimensionality, rather than just examining them all and deciding on a value somewhat subjectively, we propose here the idea of applying a Principal Components Analysis (PCA) to the various criteria so as to obtain a consensus&#x20;value.</p>
</sec>
<sec id="s2">
<title>Dimensionality</title>
<p>The problem of optimizing model dimensionality comes down to introducing as many as possible of the Latent Variables containing variability of interest, and none that contain &#x201c;detrimental variability&#x201d;, which is often due to contributions from outliers or just different types of noise (Gaussian, spike, &#x2026;).</p>
<p>A PCA on the spectra already shows that the loadings of the later components are noisier than those of the earlier ones (<xref ref-type="sec" rid="s10">Supplementary Figure S2</xref>). It is clear that including more than a certain number of Latent Variables in a PLS regression model carries a risk of including more noise than information.</p>
<p>When establishing a prediction model based on Latent Variables extracted from a multivariate data table, we must ensure that we have extracted neither too many nor too&#x20;few.</p>
<p>Determining the number of Latent Variables can be done using a number of criteria that could be classified into two categories: prediction error or model characteristics.</p>
</sec>
<sec id="s3">
<title>Criteria Based on Prediction Error</title>
<p>The methods most often used are based on the quality of the predictions for individuals that were not used to create the model, either individuals from an independent dataset (test-set validation) or individuals temporarily removed from the dataset (cross validation).</p>
<p>The term &#x201c;validation&#x201d; as it is used in &#x201c;cross validation&#x201d; is incorrect, because the objective here is not to validate the model, but to adjust its parameters optimally. In <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>, the &#x201c;Calibration&#x201d; branch contains the &#x201c;Cross Validation&#x201d; step that does this model tuning, while the &#x201c;Test&#x201d; branch is for the true validation of the final&#x20;model.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>The process of calibration (creating and adjusting the model) by cross validation, followed by its validation with a separate test set. &#x201c;<bold>b</bold>&#x201d; is the model calculated on the calibration set.</p>
</caption>
<graphic xlink:href="frans-01-754447-g001.tif"/>
</fig>
<p>The model is adjusted by creating models with an increasing number of Latent Variables extracted from one set of individuals and observing the evolution of the differences between observed and predicted values for another set of individuals. This evolution can be followed by plotting the sum of squared residuals (RESS, Residual Error Sum of Squares) or the square root of the mean of the squared residuals (RMSE, Root Mean Square Error). When this tuning is done with a single separate set of individuals (test-set validation), we have the SEV and RMSEV; when it is done by temporarily removing a few individuals at a time from the data set and then putting them back (cross validation), we have the SECV and the RMSECV.<disp-formula id="equ2">
<mml:math id="m7">
<mml:mrow>
<mml:mi mathvariant="normal">RESS</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
<label>(7a)</label>
</disp-formula>
<disp-formula id="equ3">
<mml:math id="m8">
<mml:mrow>
<mml:mtext>RMSEV or RMSECV</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mi mathvariant="normal">n</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(7b)</label>
</disp-formula>
</p>
<p>Calculating the model and applying it to the entire dataset provides an estimate of <bold>Y</bold> (<bold>&#x176;</bold>), which is used to calculate the RMSEC:<disp-formula id="equ4">
<mml:math id="m9">
<mml:mrow>
<mml:mi mathvariant="normal">RMSEC</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">nLVs</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(7c)</label>
</disp-formula>
</p>
<p>The RMSEC is intended to estimate the standard deviation of the fitting error, &#x3c3;. The division by [n &#x2212; (<italic>k</italic> &#x2b; 1)] instead of n (the number of individuals), where <italic>k</italic> is the number of Latent Variables (nLVs in the equation above), is intended to take into account the fact that the number of degrees of freedom for the estimate of &#x3c3; is decreased by the inclusion of <italic>k</italic> Latent Variables plus the intercept. This correction is valid for PCR, but is subject to much criticism in the case of PLS, where the <bold>Y</bold> matrix influences the calculation of the Latent Variables (<xref ref-type="bibr" rid="B22">Kr&#xe4;mer and Sugiyama, 2011</xref>; <xref ref-type="bibr" rid="B23">Lesnoff et&#x20;al., 2021</xref>). It is nevertheless sometimes used as a &#x201c;na&#xef;ve estimate of the RMSEC&#x201d;.</p>
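<p>The three error measures defined above can be written compactly in code. The following is a minimal sketch in pure Python; the function names and the five observed and predicted values are hypothetical and serve only to make the arithmetic explicit.</p>
<preformat>
```python
import math


def ress(y_obs, y_pred):
    """Residual Error Sum of Squares: sum of squared (predicted - observed)."""
    return sum((yp - yo) ** 2 for yo, yp in zip(y_obs, y_pred))


def rmse(y_obs, y_pred):
    """RMSEV or RMSECV: square root of the RESS divided by n."""
    return math.sqrt(ress(y_obs, y_pred) / len(y_obs))


def naive_rmsec(y_obs, y_fit, n_lvs):
    """Naive RMSEC: the RESS is divided by n - (n_lvs + 1) instead of n."""
    dof = len(y_obs) - (n_lvs + 1)
    if not dof > 0:
        raise ValueError("too many Latent Variables for the number of individuals")
    return math.sqrt(ress(y_obs, y_fit) / dof)


# Hypothetical values: only the last individual is badly predicted
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.0, 2.0, 3.0, 4.0, 7.0]

total = ress(y_obs, y_hat)              # (7 - 5)^2 = 4
error = rmse(y_obs, y_hat)              # sqrt(4 / 5)
fit_error = naive_rmsec(y_obs, y_hat, 3)  # sqrt(4 / (5 - 4)) = 2
```
</preformat>
<p>With the same residuals, the na&#xef;ve RMSEC is larger than the RMSE because the denominator shrinks as Latent Variables are added.</p>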
<p>The principle of cross-validation is presented in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>. Blocks of individuals are removed from the dataset and are used as a test set while the remaining individuals form the calibration dataset to create models which are used to predict the values (<bold>&#x176;</bold>) for the test set individuals. The differences between the observed values (<bold>Y</bold>) and predicted values (<bold>&#x176;</bold>) are calculated for the different models. The test set individuals are then put back in the calibration dataset and another block of individuals is moved to be the test set. This process is repeated until all individuals have been used in the test set. If the size of the blocks is small (large number of blocks), the number of individuals tested each time is low and the number used to create the models is high. The limiting case is called Leave-One-Out Cross Validation (LOO-CV), where the number of blocks is equal to the total number of individuals. In this case, the result tends to be optimistic (small RMSECV) but simulates the final model well, because each prediction is made using a model calculated with a collection of samples close to that in the final&#x20;model.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The principle of cross-validation. Blocks of individuals are removed from the dataset to be used as tests to measure the differences between their observed values and values predicted by the models created using the remaining individuals.</p>
</caption>
<graphic xlink:href="frans-01-754447-g002.tif"/>
</fig>
<p>On the other hand, using large blocks allows us to better assess the predictive power of the model. In all cases, in order not to distort the results, it is necessary to ensure that replicate measurements of a sample (e.g., triplicates) are kept together in the same&#x20;block.</p>
<p>A fundamental hypothesis of theories on machine learning from empirical data is that the training and future datasets are generated from the same probability distribution (e.g., <xref ref-type="bibr" rid="B12">Faber, 1999</xref>; <xref ref-type="bibr" rid="B6">Denham, 2000</xref>; <xref ref-type="bibr" rid="B34">Vapnik, 2006</xref>; <xref ref-type="bibr" rid="B23">Lesnoff et&#x20;al., 2021</xref>). Under this hypothesis, it is known that leave-one-out cross-validation has low bias but can have high variance for the prediction errors (i.e.,&#x20;the predictions would vary if the training set were replicated) (<xref ref-type="bibr" rid="B16">Hastie et&#x20;al., 2009</xref>). On the other hand, when the number of blocks, K, is smaller, cross-validation has lower variance but higher bias. Overall, five- or tenfold cross-validations are recommended as a good compromise between bias and variance (Hastie et&#x20;al., 2009, p.&#x20;284).</p>
<p>There are many ways to build blocks, the choice being based on the organization of individuals in the matrix.</p>
<p>Consecutive Blocks: (1, 2, &#x2026;, 10) (11, 12, &#x2026;, 20) (21, 22, &#x2026;,&#x20;30).</p>
<p>Venetian Blinds: (1, 4, 7, &#x2026;, 28) (2, 5, 8, &#x2026;, 29) (3, 6, 9, &#x2026;,&#x20;30).</p>
<p>Random Blocks</p>
<p>Predefined Blocks: for example, to manage measurement repetitions.</p>
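<p>The four ways of building blocks listed above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, individuals are numbered from 1 as in the examples above, and the predefined blocks are assumed to be given by one label per individual (for example a sample identifier shared by its replicates).</p>
<preformat>
```python
import random


def consecutive_blocks(n, n_blocks):
    """Split individuals 1..n into runs of neighbouring individuals."""
    base, extra = divmod(n, n_blocks)
    blocks, start = [], 1
    for b in range(n_blocks):
        size = base + (1 if extra > b else 0)  # spread any remainder
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


def venetian_blinds(n, n_blocks):
    """Assign every n_blocks-th individual to the same block."""
    return [list(range(b, n + 1, n_blocks)) for b in range(1, n_blocks + 1)]


def random_blocks(n, n_blocks, seed=0):
    """Shuffle the individuals, then deal them out into blocks."""
    order = list(range(1, n + 1))
    random.Random(seed).shuffle(order)
    return [sorted(order[b::n_blocks]) for b in range(n_blocks)]


def predefined_blocks(labels):
    """Keep individuals sharing a label (e.g. replicates) in one block."""
    groups = {}
    for index, label in enumerate(labels, start=1):
        groups.setdefault(label, []).append(index)
    return list(groups.values())


consecutive = consecutive_blocks(30, 3)
venetian = venetian_blinds(30, 3)
shuffled = random_blocks(30, 3)
# Hypothetical labels: two oils measured in duplicate and triplicate
replicates = predefined_blocks(["oil_A", "oil_A", "oil_B", "oil_B", "oil_B"])
```
</preformat>
<p>For 30 individuals and 3 blocks, the first two functions reproduce the index patterns given in the text.</p>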
<p>
<xref ref-type="fig" rid="F3">Figure&#x20;3</xref> presents the evolution of the RMSECV (red circles) and the &#x201c;na&#xef;ve&#x201d; RMSEC (blue squares) as a function of the number of Latent Variables used to create the prediction model. The &#x201c;na&#xef;ve&#x201d; RMSEC, which quantifies the residual errors for the samples used to create the models, tends to zero. On the other hand, the RMSECV often has a minimum, more or less marked depending on the amount of noise in the data, which corresponds to the balance between information and noise, indicating the optimal number of Latent Variables.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Evolution of the RMSECV (red circles) and the na&#xef;ve RMSEC (blue squares) based on the number of Latent Variables used to create the prediction model. The minimum for 6 Latent Variables is clearly visible.</p>
</caption>
<graphic xlink:href="frans-01-754447-g003.tif"/>
</fig>
<p>Although the minimum in the RMSECV curve is for 6 LVs, this value is not much lower than that for 3 LVs. Parsimony could imply retaining only 3 LVs. To visualize more clearly the point corresponding to the minimum of RMSECV, one can use a rule that says that, on the one hand, the prediction error (here estimated by RMSECV) should be close to the fitting error (here estimated by RMSEC) and on the other hand, the RMSEC curve may present a break. A way of implementing that rule is to plot the RMSECV against the RMSEC (<xref ref-type="bibr" rid="B3">Bissett, 2015</xref>).</p>
<p>In <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> and many subsequent figures, a vertical line indicates the number of LVs resulting from a consensus found by the procedure we propose, i.e.,&#x20;by applying a PCA to the various very different criteria presented&#x20;here.</p>
<p>To get a better indication of variability in the estimation of the optimal number of Latent Variables, repeated cross-validation is often used. In this case, several cross-validations are made with few blocks (here 2 blocks) containing randomly selected individuals each time. It is thus possible to calculate an average RMSECV and its variability (<xref ref-type="fig" rid="F4">Figure&#x20;4</xref>).</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Evolution of the RMSECV (red circles) and na&#xef;ve RMSEC (blue crosses) as a function of the number of Latent Variables in the model for 25 repetitions of a 2 random blocks cross validation.</p>
</caption>
<graphic xlink:href="frans-01-754447-g004.tif"/>
</fig>
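<p>The repeated cross-validation described above can be sketched as follows. This is a minimal sketch under strong assumptions: a univariate least-squares line (<italic>fit_line</italic>) is used as a hypothetical stand-in for the PLS model, the data are synthetic, and only the resampling logic (25 repetitions of a 2 random blocks cross validation, then the mean and standard deviation of the RMSECV) reflects the procedure in the text.</p>
<preformat>
```python
import math
import random
import statistics


def fit_line(x, y):
    """Least-squares fit of y = a + b x (stand-in for a PLS model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def rmsecv(x, y, blocks):
    """Predict each block with a model fitted on the other blocks."""
    squared_errors = []
    for test in blocks:
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors += [(a + b * x[i] - y[i]) ** 2 for i in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def repeated_cv(x, y, n_blocks=2, n_repeats=25, seed=0):
    """Mean and standard deviation of the RMSECV over random splits."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_repeats):
        order = list(range(len(y)))
        rng.shuffle(order)
        blocks = [order[b::n_blocks] for b in range(n_blocks)]
        values.append(rmsecv(x, y, blocks))
    return statistics.fmean(values), statistics.stdev(values)


# Synthetic data: a straight line plus an alternating error of 0.5
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + (0.5 if i % 2 else -0.5) for i, xi in enumerate(x)]
mean_rmsecv, sd_rmsecv = repeated_cv(x, y)
```
</preformat>
<p>In practice the same loop would be run for each number of Latent Variables, giving an average RMSECV curve together with its variability.</p>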
<p>Another related procedure is to plot the proportions of variability extracted from the <bold>Y</bold> vectors, <italic>R</italic>
<sup>2</sup>, for the calibration samples, and Q<sup>2</sup>, for the samples removed during the cross validation, as a function of the number of Latent Variables. In <xref ref-type="fig" rid="F5">Figure&#x20;5</xref> one can see that the difference between <italic>R</italic>
<sup>2</sup> and Q<sup>2</sup> is close to zero for 4 to 6&#x20;LVs.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Evolution of <italic>R</italic>
<sup>2</sup> (blue squares) and Q<sup>2</sup> (red circles), for &#x201c;calibration&#x201d; samples and &#x201c;test&#x201d; samples, respectively, as a function of the number of Latent Variables in the model; Evolution of the difference between <italic>R</italic>
<sup>2</sup> and Q<sup>2</sup>.</p>
</caption>
<graphic xlink:href="frans-01-754447-g005.tif"/>
</fig>
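<p>The two proportions compared in Figure&#x20;5 can be computed with the same function, applied either to fitted values or to cross-validated predictions. The sketch below assumes the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares about the mean; the function name and the numerical values are hypothetical.</p>
<preformat>
```python
def explained_fraction(y_obs, y_pred):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    mean_y = sum(y_obs) / len(y_obs)
    total = sum((yo - mean_y) ** 2 for yo in y_obs)
    residual = sum((yo - yp) ** 2 for yo, yp in zip(y_obs, y_pred))
    return 1.0 - residual / total


# Hypothetical values for five individuals
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fit = [1.1, 1.9, 3.0, 4.1, 4.9]  # fitted values (calibration)
y_cv = [1.5, 1.5, 3.0, 4.5, 4.5]   # predictions made during cross validation

r2 = explained_fraction(y_obs, y_fit)  # R2, calibration samples
q2 = explained_fraction(y_obs, y_cv)   # Q2, samples left out during CV
gap = r2 - q2                          # the difference followed in the text
```
</preformat>
<p>Repeating this calculation for each number of Latent Variables gives the two curves and their difference.</p>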
<p>Other criteria can be calculated based on the values predicted by cross-validation.</p>
<p>Wold&#x2019;s R criterion (<xref ref-type="bibr" rid="B37">Wold, 1978</xref>; <xref ref-type="bibr" rid="B24">Li et&#x20;al., 2002</xref>) is given by:<disp-formula id="equ5">
<mml:math id="m10">
<mml:mrow>
<mml:mi mathvariant="normal">PRESS</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
<label>(8a)</label>
</disp-formula>
<disp-formula id="equ6">
<mml:math id="m11">
<mml:mrow>
<mml:mtext>Wold&#x2019;s</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi mathvariant="normal">R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="normal">PRESS</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">PRESS</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(8b)</label>
</disp-formula>where PRESS(k) is the predicted residual sum of squares for <italic>k</italic> LVs; and Wold&#x2019;s R is a vector of the ratios of successive PRESS values. The usual cutoff for Wold&#x2019;s R criterion is when R is greater than unity. In <xref ref-type="fig" rid="F6">Figure&#x20;6</xref> it can be seen that the maximum R is at 6 LVs but the value is already greater than 1 for 3&#x20;LVs.</p>
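<p>As a minimal sketch of Equations 8a and 8b, the Python fragment below computes PRESS from cross-validated predictions and then the ratios of successive PRESS values. The function names, the example PRESS values and the convention that the <italic>k</italic>th ratio is PRESS(k&#x2b;1)/PRESS(k) are assumptions made for illustration; they are not taken from the software used in this study.</p>
<preformat><![CDATA[
```python
def press(y_true, y_cv):
    """Equation 8a: sum of squared differences between CV predictions and observations."""
    return sum((yh - y) ** 2 for y, yh in zip(y_true, y_cv))


def wolds_r(press_values):
    """Equation 8b: ratios PRESS(k+1) / PRESS(k); press_values[0] is the 1-LV model."""
    return [nxt / cur for cur, nxt in zip(press_values, press_values[1:])]


def first_rank_exceeding(ratios, cutoff=1.0):
    """Smallest k for which PRESS(k+1) / PRESS(k) exceeds the cutoff, or None."""
    for k, ratio in enumerate(ratios, start=1):
        if ratio > cutoff:
            return k
    return None


# Hypothetical PRESS values for models with 1 to 6 LVs (illustrative only).
press_values = [10.0, 6.0, 4.0, 4.4, 3.0, 3.3]
ratios = wolds_r(press_values)
print(ratios)
print(first_rank_exceeding(ratios))
```
]]></preformat>
<p>With these illustrative values, the ratio first exceeds unity when going from 3 to 4 LVs, so the last function returns 3. A stricter cutoff can be passed through the hypothetical <italic>cutoff</italic> argument.</p>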
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Evolution of Wold&#x2019;s R; Osten&#x2019;s criterion and Cattell&#x2019;s Residual Percent Variance (RPV) criterion, as a function of the number of Latent Variables in the&#x20;model.</p>
</caption>
<graphic xlink:href="frans-01-754447-g006.tif"/>
</fig>
<p>More recently, Osten proposed the criterion (<xref ref-type="bibr" rid="B30">Osten, 1988</xref>), given by:<disp-formula id="equ7">
<mml:math id="m12">
<mml:mrow>
<mml:mi mathvariant="normal">Oste</mml:mi>
<mml:msup>
<mml:mi mathvariant="normal">n</mml:mi>
<mml:mtext>&#x27;</mml:mtext>
</mml:msup>
<mml:mi mathvariant="normal">s</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi mathvariant="normal">F</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="normal">PRESS</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">PRESS</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">PRESS</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="normal">k</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">N</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">k</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(9)</label>
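</disp-formula>where N is the number of samples, written <italic>n</italic> in <xref ref-type="disp-formula" rid="equ5">Equation 8a</xref>.</p>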
<p>
<xref ref-type="fig" rid="F6">Figure&#x20;6</xref> also shows that Osten&#x2019;s F confirms the results for Wold&#x2019;s R: F is less than 0 at 3 LVs but reaches a minimum at 6&#x20;LVs.</p>
<p>When doing a PCA, Cattell&#x2019;s Residual Percent Variance (RPV) criterion (<xref ref-type="bibr" rid="B4">Cattell, 1966</xref>) assumes that the residual variance should level off, as in <xref ref-type="fig" rid="F6">Figure&#x20;6</xref>, after a suitable number of factors have been extracted. RPV for the model with <italic>k</italic> LVs is given by:<disp-formula id="equ8">
<mml:math id="m13">
<mml:mrow>
<mml:mi mathvariant="normal">RPV</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="italic">K</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="italic">K</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(10)</label>
</disp-formula>where <inline-formula id="inf1">
<mml:math id="m14">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the eigenvalue for the <italic>i</italic>th PC. Here, in the case of PLS, we have replaced the eigenvalues by the variances of the scores for each&#x20;LV.</p>
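<p>A short sketch of Equation 10 follows. It returns RPV as a fraction for k &#x3d; 0 to K; the function name and the example values are assumptions, and for PLS the input would be the variances of the scores rather than eigenvalues, as described above.</p>
<preformat><![CDATA[
```python
def rpv(eigenvalues):
    """Equation 10: residual fraction of variance after k components, k = 0..K."""
    total = sum(eigenvalues)
    return [sum(eigenvalues[k:]) / total for k in range(len(eigenvalues) + 1)]


# Hypothetical eigenvalues (or score variances) in decreasing order.
print(rpv([6.0, 3.0, 1.0]))
```
]]></preformat>
<p>Multiplying the result by 100 gives the value as a percentage. The curve is then inspected for the point at which it levels off.</p>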
<p>There are other methods, such as Mallows&#x2019;s Cp (<xref ref-type="bibr" rid="B27">Mallows, 1973</xref>) and Akaike&#x2019;s Information Criterion (AIC) (<xref ref-type="bibr" rid="B1">Akaike, 1969</xref>), that are commonly used to select the dimensionality of regression models, as an alternative to cross-validation (CV). However, the calculation of Cp and AIC requires the determination of the effective number of degrees of freedom of the model, which, as mentioned above, is not straightforward in the case of PLS (<xref ref-type="bibr" rid="B23">Lesnoff et&#x20;al., 2021</xref>). For that reason, these criteria are not considered&#x20;here.</p>
</sec>
<sec id="s4">
<title>Criteria based on other properties of the models</title>
<p>Cross-validation is sometimes difficult to perform, for example when there are so many individuals and/or variables that the calculation time becomes excessive. Even when the calculation is feasible, one does not always observe a clear minimum in the RMSECV curve (as in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>) or a clear maximum in the Q<sup>2</sup> curve (as in <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>), which makes it difficult to choose the number of&#x20;LVs.</p>
<p>Moreover, as indicated by <xref ref-type="bibr" rid="B36">Wiklund et&#x20;al. (2007)</xref>, CV handles &#x201c;the available data economically, but like any data-based statistical test gives an interval of results and hence sometimes gives either an under-fit or an over-fit, that is they reach the minimum RMSEV for a lower or higher model rank than would be achieved using an infinitely large independent validation set&#x201d;. They also stressed that &#x201c;One area where CV works poorly both for PLS and PCR is design of experiments, where exclusion of data has large consequences for modeling&#x201d;. To solve these problems, they proposed carrying out permutation tests on the <bold>Y</bold> vector and then comparing the correlations between the scores of each latent variable and the true <bold>Y</bold> vector with the correlations between the scores obtained for the permuted <bold>Y</bold>s and the corresponding true&#x20;<bold>Y</bold>s.</p>
<p>It should be noted that all these criteria are based on comparing the observed and predicted <bold>Y</bold> vectors. It could therefore be helpful to use other criteria based on entirely different characteristics of the models to facilitate the choice of the number of latent variables.</p>
<p>We now present a set of such complementary methods, based on the characteristics of the regression coefficients vectors, <bold>b</bold>, and on the characteristics of the <bold>X</bold> matrix after each deflation.</p>
<sec id="s4-1">
<title>Characteristics of the Regression Coefficients Vectors, b</title>
<p>As the number of Latent Variables used to calculate the regression coefficients vector, <bold>b</bold>, increases, more and more noise is included. When the <bold>X</bold> matrix contains <italic>structured signals</italic>, such as the near infrared spectra in <xref ref-type="sec" rid="s11">Supplementary Figure S1</xref>, <bold>b</bold> coefficients are initially structured and gradually become random, as can be seen in <xref ref-type="fig" rid="F7">Figure&#x20;7</xref>.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>PLS regression coefficient vectors, <bold>b</bold>, based on 1, 3, 6 and 9 Latent Variables plotted with a constant ordinate scale (abscissa: variable numbers).</p>
</caption>
<graphic xlink:href="frans-01-754447-g007.tif"/>
</fig>
<p>In the case of <bold>b</bold>-vectors calculated from structured signals in the rows of the <bold>X</bold> matrix, a &#x201c;signal-to-noise ratio&#x201d; can be calculated using the Durbin-Watson (DW) criterion (<xref ref-type="bibr" rid="B9">Durbin and Watson, 1971</xref>; <xref ref-type="bibr" rid="B33">Rutledge and Barros, 2002</xref>). This criterion is given by:<disp-formula id="e11">
<mml:math id="m15">
<mml:mrow>
<mml:mi mathvariant="bold">DW</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mi mathvariant="normal">2</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(11)</label>
</disp-formula>where b<sub>i</sub> and b<sub>(i-1)</sub> are the values for successive points in a series of <bold>b</bold>-coefficients values. DW is close to zero if there is a strong correlation between successive values. On the other hand, if there is a low correlation (i.e.,&#x20;a random distribution), the value of DW tends to 2.0. DW can therefore be used to characterize the degree of correlation between successive points, and thus give an objective measure of the non-random behavior of the <bold>b</bold> coefficients vectors. However, if the noise in the data has been reduced by smoothing, the transition will not be as clear and DW will not increase as&#x20;much.</p>
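<p>The behavior of Equation 11 at the two extremes can be checked with a few lines of Python. This is only a sketch: the function name, the sine-shaped &#x201c;structured&#x201d; vector and the seeded Gaussian noise vector are assumptions chosen for illustration.</p>
<preformat><![CDATA[
```python
import math
import random


def durbin_watson(values):
    """Equation 11: squared successive differences over the sum of squares."""
    numerator = sum((values[i] - values[i - 1]) ** 2 for i in range(1, len(values)))
    denominator = sum(v ** 2 for v in values)
    return numerator / denominator


# Structured vector: one period of a sine wave sampled at 200 points.
smooth = [math.sin(2 * math.pi * i / 200) for i in range(200)]

# Unstructured vector: zero-mean Gaussian noise from a seeded generator.
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(2000)]

print(durbin_watson(smooth))  # expected to be close to 0
print(durbin_watson(noisy))   # expected to be close to 2
```
]]></preformat>
<p>Applied to each column of a matrix of <bold>b</bold>-vectors obtained with increasing numbers of LVs, the same function gives one DW value per model.</p>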
<p>
<xref ref-type="fig" rid="F8">Figure&#x20;8</xref> shows the evolution of DW calculated for a succession of regression coefficients vectors, as a function of the increasing number of LVs used in the PLS model. It is clear that there is a very sudden increase in DW after 6&#x20;LVs.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Evolution of the Durbin-Watson (DW) criterion; the log of the Morphological Factor; the Variance; the Variance Inflation Factor (VIF) calculated on the regression vectors, <bold>b</bold>, as a function of the number of Latent Variables in the PLS models.</p>
</caption>
<graphic xlink:href="frans-01-754447-g008.tif"/>
</fig>
<p>The Morphological Factor (MF) (<xref ref-type="bibr" rid="B35">Wang et&#x20;al., 1996</xref>) is based on the same phenomenon as the DW criterion: noisy vectors are less structured than non-noisy vectors. The mathematical principle, however, is different:<disp-formula id="e12a">
<mml:math id="m16">
<mml:mrow>
<mml:mi mathvariant="normal">MF</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>&#x2016;</mml:mo>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mo>&#x2016;</mml:mo>
</mml:mrow>
<mml:mi mathvariant="normal">/</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x2016;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold">MO</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x2016;</mml:mo>
</mml:mrow>
<mml:mo>.</mml:mo>
<mml:mi mathvariant="normal">ZCP</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold">MO</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(12a)</label>
</disp-formula>
<disp-formula id="e12b">
<mml:math id="m17">
<mml:mrow>
<mml:mi mathvariant="bold">MO</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold">i&#x2b;1</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold">b</mml:mi>
<mml:mi mathvariant="bold">i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">for</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi mathvariant="normal">i&#x3d;1,2,</mml:mi>
<mml:mo>&#x2026;</mml:mo>
<mml:mi mathvariant="normal">n-1</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(12b)</label>
</disp-formula>where <bold>b</bold> is a vector of regression coefficients; <bold>MO(b)</bold> is the vector of differences in intensity between successive points in <bold>b</bold>; ZCP(<bold>MO(b)</bold>) is the number of times <bold>MO(b)</bold> changes sign; and the operator &#x2016;&#x22c5;&#x2016; is the Euclidean&#x20;norm.</p>
<p>In the case of a noisy vector, <bold>MO(b)</bold> will contain larger values and change sign more often than in the case of a smooth vector, resulting in lower MF values. <xref ref-type="fig" rid="F8">Figure&#x20;8</xref> shows the evolution of MF as a function of the number of Latent Variables extracted. The log of MF mirrors the DW criterion: it decreases after 6 Latent Variables, where DW increases.</p>
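<p>Equations 12a and 12b can be sketched as follows. The function name is hypothetical, and two conventions are assumptions of this sketch rather than part of the definition given above: zero differences are not counted as sign changes, and a vector whose differences never change sign is given an infinite MF to avoid division by zero.</p>
<preformat><![CDATA[
```python
import math


def morphological_factor(b):
    """Equations 12a and 12b: ||b|| / (||MO(b)|| * ZCP(MO(b)))."""
    # MO(b): differences between successive points of b.
    mo = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    # ZCP: number of sign changes between successive elements of MO(b).
    zcp = sum(1 for d0, d1 in zip(mo, mo[1:]) if d0 * d1 < 0)
    norm_b = math.sqrt(sum(x * x for x in b))
    norm_mo = math.sqrt(sum(d * d for d in mo))
    if zcp == 0 or norm_mo == 0.0:
        return math.inf  # convention assumed for this sketch
    return norm_b / (norm_mo * zcp)


print(morphological_factor([0, 1, 2, 1, 0]))  # single smooth peak
print(morphological_factor([0, 1, 0, 1, 0]))  # oscillating vector
```
]]></preformat>
<p>The single smooth peak gives the higher MF of the two, in line with the expectation that noisy vectors have lower MF values.</p>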
<p>In the case of an <bold>X</bold> matrix that does not contain structured signals (e.g., physical-chemical data or mass spectra) DW or MF should not be used. But other characteristics of the regression vectors can be used instead.</p>
<p>It can be seen that the range of <bold>b</bold> vector values initially remains relatively stable, but beyond a certain number of LVs, the b-coefficient values increase enormously (<xref ref-type="fig" rid="F7">Figure&#x20;7</xref>). By plotting the variance of the regression vectors it is possible to see the point at which this phenomenon appears (<xref ref-type="fig" rid="F8">Figure&#x20;8</xref>) for both structured and non-structured data matrices. This is also true for the standard deviation or the norm of the vectors.</p>
<p>The Variance Inflation Factor of a variable <italic>i</italic> in a matrix <bold>X</bold> (VIFi) (<xref ref-type="bibr" rid="B28">Marquardt, 1970</xref>; <xref ref-type="bibr" rid="B13">Ferr&#xe9;, 2009</xref>) is equal to the inverse of (1-Ri<sup>2</sup>), where Ri<sup>2</sup> is the coefficient of determination of the regression between all the other predictor variables in the matrix and the variable <italic>i</italic>. VIFi quantifies the degree to which that variable can be predicted by all the others. The closer the Ri<sup>2</sup> value to 1, the higher the multicollinearity with independent variable <italic>i</italic> and the higher the value of&#x20;VIFi.</p>
<p>As the number of LVs included in a regression model increases, the structure of the b-coefficients vectors changes due to the inclusion of more sources of variability, initially corresponding to information, and later to noise. There are initially significant changes in the b-coefficients vectors, because the loadings are very different, reflecting different sources of information. Subsequent loadings correspond more and more to noise and change the shape of the <bold>b</bold>-vectors less.</p>
<p>It can therefore be interesting to quantify the correlations between the columns of a matrix <bold>B</bold> containing vectors of b-coefficients calculated with increasing numbers of&#x20;LVs.</p>
<p>To detect the number of LVs at which point the multi-collinearities increase, we can plot the VIF values of the b-coefficient vectors as a function of the number of LVs. In <xref ref-type="fig" rid="F8">Figure&#x20;8</xref>, we see that the VIF values remain low up to 6 LVs, and then increase.</p>
<p>In a way similar to the Random_ICA method (<xref ref-type="bibr" rid="B19">Kassouf et&#x20;al., 2018</xref>), one can study whether similar b-coefficients vectors are extracted from two random subsets of the <bold>X</bold> and <bold>Y</bold> matrices. PLS regressions are performed with increasing numbers of LVs on the two subsets. Too many LVs have been extracted when there is no longer a strong correlation between the pair of b-coefficients vectors. To avoid the possibility of a bias being introduced by a particular distribution of the rows into the two blocks, the whole procedure is repeated <italic>k</italic> times resulting in different sets of blocks, producing a broader perspective for the selection of the number of LVs (<xref ref-type="fig" rid="F9">Figure&#x20;9</xref>).</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Evolution of the correlations between b-coefficients calculated for 25 randomly selected pairs of subsets of samples for increasing numbers of Latent Variables.</p>
</caption>
<graphic xlink:href="frans-01-754447-g009.tif"/>
</fig>
</sec>
<sec id="s4-2">
<title>Structure of the X Matrix After Each Deflation Step</title>
<p>Most multivariate analysis methods contain a deflation step in which the contribution of each Latent Variable is removed from the matrix before the next Latent Variable is extracted. This is true for PCA, PCR and PLS. As a result of this deflation, the rows of the deflated matrices contain less and less information and more and more noise. Moreover, since the remaining variability corresponds more and more to Gaussian noise, the distribution of individuals in the space of the variables gradually approaches that of a hypersphere.</p>
<p>Several criteria can be used to characterize the evolution of the signal/noise ratios in the rows and the sphericity of the deflated matrices so as to determine when all the interesting information has been removed.</p>
<p>Again, the DW criterion can be used, this time to measure the signal-to-noise ratio in each row of the matrix following the successive deflations. <xref ref-type="fig" rid="F10">Figure&#x20;10</xref> shows the evolution of the distribution of DW values calculated as in <xref ref-type="disp-formula" rid="e13">Equation 13</xref>, for each row of the <bold>X</bold> matrix, as a function of the number of Latent Variables extracted.<disp-formula id="e13">
<mml:math id="m18">
<mml:mrow>
<mml:mi mathvariant="bold">DW</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(13)</label>
</disp-formula>
</p>
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption>
<p>Evolution of the Durbin-Watson (DW) criterion; the Morphological Factor; the Variance calculated for each row of the X matrix during deflation and the log of the VIF for all X-matrix variables after each deflation.</p>
</caption>
<graphic xlink:href="frans-01-754447-g010.tif"/>
</fig>
<p>There is a sharp increase in the median value and interquartile interval when 5 latent variables are extracted. The heatmap and boxplot show that not all rows (samples) evolve in the same way, some becoming noisy later than most. This is reflected in the size of the boxplots of the DW values and also in the standard deviation of the values.</p>
<p>As with the DW criterion, the Morphological Factor can be calculated for each row of the matrix after deflation. <xref ref-type="fig" rid="F10">Figure&#x20;10</xref> also shows the evolution of the distribution of the MF values, as a function of the number of Latent Variables extracted. The values stabilize with the elimination of 6 Latent Variables.</p>
<p>For non-structured data, the variance (or the standard deviation or the norm) of the matrix rows can be used (<xref ref-type="fig" rid="F10">Figure&#x20;10</xref>).</p>
<p>As the <bold>X</bold>-matrix is deflated, the sources of variability corresponding to information are eliminated, leaving behind only random noise, so that there are fewer and fewer correlations between the variables in the deflated <bold>X</bold>-matrix. To detect the moment when there are no more multi-collinearities between the variables, we can perform linear regressions between each variable and all the others and then examine the corresponding <italic>R</italic><sup>2</sup> for all successive models. If the <italic>R</italic><sup>2</sup> of a variable is close to 1, there is still a linear relationship between this variable and the others.</p>
<p>The VIF is equal to the inverse of (1-R<sup>2</sup>). If the VIF of a variable is greater than 4, there may be multi-collinearities; if the VIF is greater than 10, there are significant multi-collinearities.</p>
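<p>The relation between <italic>R</italic><sup>2</sup> and the VIF, together with the two thresholds quoted above, can be written as a short sketch. The function names and the returned labels are hypothetical, and the <italic>R</italic><sup>2</sup> values are assumed to have been obtained beforehand from the regressions of each variable on all the others.</p>
<preformat><![CDATA[
```python
def vif_from_r2(r_squared):
    """VIF = 1 / (1 - R^2); returns infinity for a perfect fit."""
    if r_squared >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - r_squared)


def collinearity_flag(vif, possible=4.0, significant=10.0):
    """Label a VIF value using the thresholds of 4 and 10 quoted in the text."""
    if vif > significant:
        return "significant"
    if vif > possible:
        return "possible"
    return "none detected"


# Hypothetical R^2 values for three variables of a deflated matrix.
for r2 in (0.50, 0.80, 0.95):
    vif = vif_from_r2(r2)
    print(r2, round(vif, 2), collinearity_flag(vif))
```
]]></preformat>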
<p>To determine whether all information has been eliminated from the <bold>X</bold>-matrix, the VIFs of all the variables can be plotted as a function of the number of LVs extracted, as in <xref ref-type="fig" rid="F10">Figure&#x20;10</xref>, where only a few variables still have high VIFs after eliminating 6&#x20;LVs.</p>
<p>As the <bold>X</bold>-matrix is deflated, the dispersion of the samples in the reduced multivariate space tends to become spherical, as all the directions of non-random dispersion are progressively removed. Sphericity tests can therefore be applied to the deflated matrices to determine how many LVs are required to remove all interesting dispersions.</p>
<p>Bartlett&#x2019;s test for Sphericity (<xref ref-type="bibr" rid="B2">Bartlett, 1951</xref>) compares a matrix of Pearson correlations with the identity matrix. The null hypothesis is that the variables are not correlated. If there is redundancy between variables, it can be interesting to proceed with the multivariate analysis. The formula is given by:<disp-formula id="e14">
<mml:math id="m19">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">&#x3c7;</mml:mi>
<mml:mi mathvariant="bold">2</mml:mi>
</mml:msup>
<mml:mo mathvariant="bold">&#x3d;</mml:mo>
<mml:mo mathvariant="bold">&#x2212;</mml:mo>
<mml:mrow>
<mml:mo mathvariant="bold">[</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo mathvariant="bold">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
<mml:mo mathvariant="bold">&#x2212;</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
<mml:mo mathvariant="bold">)</mml:mo>
</mml:mrow>
<mml:mo mathvariant="bold">&#x2212;</mml:mo>
<mml:mrow>
<mml:mo mathvariant="bold">(</mml:mo>
<mml:mrow>
<mml:mn mathvariant="bold">2</mml:mn>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn mathvariant="bold">5</mml:mn>
</mml:mrow>
<mml:mo mathvariant="bold">)</mml:mo>
</mml:mrow>
<mml:mo mathvariant="bold">/</mml:mo>
<mml:mn mathvariant="bold">6</mml:mn>
</mml:mrow>
<mml:mo mathvariant="bold">]</mml:mo>
</mml:mrow>
<mml:mo mathvariant="bold">&#x2a;</mml:mo>
<mml:mi mathvariant="bold">log</mml:mi>
<mml:mo mathvariant="bold">&#x7c;</mml:mo>
<mml:mi mathvariant="bold">R</mml:mi>
<mml:mo mathvariant="bold">&#x7c;</mml:mo>
</mml:mrow>
</mml:math>
<label>(14)</label>
</disp-formula>where <italic>n</italic> is the number of observations, <italic>k</italic> is the number of variables (here <italic>k</italic> does not denote a number of LVs), <bold>R</bold> is the correlation matrix of the data in <bold>X</bold>, &#x7c;<bold>R</bold>&#x7c; is the determinant of&#x20;<bold>R</bold>, and log denotes the natural logarithm.</p>
<p>Bartlett&#x2019;s test in <xref ref-type="fig" rid="F11">Figure&#x20;11</xref> shows that the deflated matrices are very non-spherical until after 6 LVs have been removed.</p>
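<p>A minimal pure-Python sketch of Bartlett&#x2019;s statistic is given below, using the standard form in which (2<italic>k</italic>&#x2b;5) is divided by 6. All function names are hypothetical, and the sketch assumes that there are more observations than variables, so that the correlation matrix is not singular and its determinant is strictly positive.</p>
<preformat><![CDATA[
```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of a list of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    ss = [sum(c[j] ** 2 for c in centred) for j in range(p)]
    return [[sum(c[i] * c[j] for c in centred) / math.sqrt(ss[i] * ss[j])
             for j in range(p)] for i in range(p)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_chi2(rows):
    """Chi-square statistic: -[(n - 1) - (2k + 5) / 6] * ln|R|."""
    n, k = len(rows), len(rows[0])
    det_r = determinant(correlation_matrix(rows))
    return -((n - 1) - (2 * k + 5) / 6) * math.log(det_r)


# Hypothetical data: two strongly correlated variables, four observations.
print(bartlett_chi2([[1, 1.1], [2, 1.9], [3, 3.2], [4, 3.8]]))
```
]]></preformat>
<p>The statistic is zero when the variables are exactly uncorrelated and grows as the determinant of <bold>R</bold> approaches zero. It is usually compared with a chi-square distribution with <italic>k</italic>(<italic>k</italic>&#x2212;1)/2 degrees of freedom, a detail that is not used in the sketch.</p>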
<fig id="F11" position="float">
<label>FIGURE 11</label>
<caption>
<p>Evolution of Bartlett&#x2019;s test for Sphericity; Hartley&#x2019;s F-test; the Log of Exner&#x2019;s &#x3c8; criterion and the KMO criterion for <bold>X</bold>-matrix rows and columns, as a function of the number of Latent Variables in the&#x20;model.</p>
</caption>
<graphic xlink:href="frans-01-754447-g011.tif"/>
</fig>
<p>Similarly, Hartley and Cochran proposed F-tests based on the ratio of the maximum variance/minimum variance (<xref ref-type="bibr" rid="B15">Hartley, 1950</xref>) and the maximum variance/mean variance (<xref ref-type="bibr" rid="B5">Cochran, 1941</xref>), respectively. The Hartley criterion in <xref ref-type="fig" rid="F11">Figure&#x20;11</xref> shows that the deflated matrices are very spherical once 5 LVs are removed.</p>
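<p>The two variance ratios can be sketched as follows. The function names are hypothetical, and the sketch assumes that each inner list holds the values of one variable (or one row) of a deflated matrix; which of the two is used is a choice left to the reader.</p>
<preformat><![CDATA[
```python
from statistics import variance


def hartley_ratio(groups):
    """Ratio of the largest to the smallest sample variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


def max_to_mean_ratio(groups):
    """Ratio of the largest sample variance to the mean of the variances."""
    variances = [variance(g) for g in groups]
    return max(variances) / (sum(variances) / len(variances))


# Hypothetical data: two variables with sample variances 2 and 8.
data = [[0.0, 2.0], [0.0, 4.0]]
print(hartley_ratio(data))
print(max_to_mean_ratio(data))
```
]]></preformat>
<p>Both ratios equal 1 when all the variances are identical, which is the situation expected for a spherical dispersion. Cochran&#x2019;s statistic is often written with the sum of the variances in the denominator; it differs from the maximum/mean ratio only by a factor equal to the number of variances.</p>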
<p>Exner proposed the &#x3c8; criterion (<xref ref-type="bibr" rid="B11">Exner, 1966</xref>; <xref ref-type="bibr" rid="B21">Kindsvater et&#x20;al., 1974</xref>) as a measure of fit of a set of predicted data to a set of experimental data, given by the equation:<disp-formula id="e15">
<mml:math id="m20">
<mml:mrow>
<mml:mi mathvariant="bold-italic">&#x3c8;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">nc</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi mathvariant="bold-italic">&#xa0;</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">nc</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">nc</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">nc</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(15)</label>
</disp-formula>where <inline-formula id="inf2">
<mml:math id="m21">
<mml:mrow>
<mml:msub>
<mml:mtext>X</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> is a data point in the matrix, <inline-formula id="inf3">
<mml:math id="m22">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mtext>X</mml:mtext>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mtext>i</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is that data point reproduced using <italic>k</italic> LVs, <italic>n</italic> and <italic>c</italic> are the number of rows and columns in the data matrix and <inline-formula id="inf4">
<mml:math id="m23">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mtext>X</mml:mtext>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> is the grand mean of&#x20;<bold>X</bold>.</p>
<p>Here Exner&#x2019;s criterion (<xref ref-type="fig" rid="F11">Figure&#x20;11</xref>) is calculated between the original <bold>X</bold> matrix and each successive deflated matrix to determine at what point there is no longer any similarity between&#x20;them.</p>
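<p>Equation 15 can be sketched as below for matrices stored as lists of rows. The function name and the small example matrix are hypothetical, and the sketch assumes that <italic>nc</italic> is larger than <italic>k</italic>.</p>
<preformat><![CDATA[
```python
import math


def exner_psi(x, x_hat, k):
    """Equation 15: sqrt( SS_residual / SS_total * nc / (nc - k) )."""
    flat = [v for row in x for v in row]
    flat_hat = [v for row in x_hat for v in row]
    nc = len(flat)                      # rows times columns
    grand_mean = sum(flat) / nc
    ss_residual = sum((a - b) ** 2 for a, b in zip(flat, flat_hat))
    ss_total = sum((a - grand_mean) ** 2 for a in flat)
    return math.sqrt(ss_residual / ss_total * nc / (nc - k))


x = [[1.0, 2.0], [3.0, 4.0]]
print(exner_psi(x, x, 1))                            # perfect reproduction
print(exner_psi(x, [[2.5, 2.5], [2.5, 2.5]], 0))     # grand mean only
```
]]></preformat>
<p>A perfect reproduction gives 0, whereas reproducing every point by the grand mean with <italic>k</italic> &#x3d; 0 gives 1.</p>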
<p>The KMO (Kaiser-Meyer-Olkin Measure of Sampling Adequacy) criterion (<xref ref-type="bibr" rid="B17">Kaiser, 1970</xref>; <xref ref-type="bibr" rid="B18">Kaiser, 1974</xref>) was developed to determine whether it is worthwhile to conduct a multivariate analysis of a data matrix. For example, if the variables are uncorrelated, there is no point in performing a&#x20;PCA.</p>
<p>The KMO index is given by:<disp-formula id="e16a">
<mml:math id="m24">
<mml:mrow>
<mml:mi mathvariant="bold">KMO&#x3d;</mml:mi>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mi>r</mml:mi>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mi>r</mml:mi>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="italic">ij</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mi mathvariant="italic">a</mml:mi>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="italic">ij</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(16a)</label>
</disp-formula>where r<sub>ij</sub> is the correlation between variables <italic>i</italic> and <italic>j</italic>, and a<sub>ij</sub> is the partial correlation, defined as:<disp-formula id="e16b">
<mml:math id="m25">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">ij</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">&#x3bd;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">ij</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">&#x3bd;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">ii</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi mathvariant="italic">&#x3bd;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">jj</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(16b)</label>
</disp-formula>
<list list-type="simple">
<list-item>
<p>&#x3bd;<sub>ij</sub> being element (<italic>i</italic>, <italic>j</italic>) of the inverse of the correlation matrix <bold>R</bold>, i.e., &#x3bd;<sub>ij</sub> &#x3d; [<bold>R</bold><sup>&#x2212;1</sup>]<sub>ij</sub>.</p>
</list-item>
</list>
</p>
<p>The value of the KMO index varies between 0 (no correlation between variables, so a multivariate analysis is pointless) and 1 (correlated variables, so a multivariate analysis is useful). A KMO value of 0.5 is usually considered the cutoff below which a multivariate analysis is of no interest. Here, this index was calculated for the variables (columns) and for the individuals (rows) in each matrix. <xref ref-type="fig" rid="F11">Figure&#x20;11</xref> shows that the values are close to 1 until 6 LVs are removed from the matrix and that there is a second decrease after removing 11 LVs. This means that much of the information shared by the original variables and individuals has been removed by 6 LVs, but some remains, to a lesser extent, up to 11&#x20;LVs.</p>
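<p>As an illustration, the KMO index can be sketched in a few lines of Python with NumPy. This is a minimal sketch and not the authors&#x2019; Matlab implementation: the function name <monospace>kmo_index</monospace> is hypothetical, the partial correlations are taken as the negated elements of the inverse correlation matrix rescaled by the square roots of its diagonal elements, and the correlation matrix is assumed to be invertible (which may not hold for heavily deflated matrices). The example data are synthetic.</p>
<preformat><![CDATA[
```python
import numpy as np


def kmo_index(X):
    """Overall KMO index for the columns of X (rows are observations)."""
    R = np.corrcoef(X, rowvar=False)             # correlations r_ij between columns
    V = np.linalg.inv(R)                         # inverse correlation matrix (assumed to exist)
    d = np.sqrt(np.diag(V))
    A = -V / np.outer(d, d)                      # partial correlations a_ij
    off_diag = ~np.eye(R.shape[0], dtype=bool)   # keep only the terms with j != i
    r2 = np.sum(R[off_diag] ** 2)
    a2 = np.sum(A[off_diag] ** 2)
    return r2 / (r2 + a2)


# Synthetic example: five variables sharing one common factor plus noise
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 1))
X_corr = common + 0.3 * rng.normal(size=(200, 5))
print(kmo_index(X_corr))
```
]]></preformat>
<p>To obtain the index for the individuals (rows), the same function could in principle be applied to the transposed matrix, provided that the resulting correlation matrix remains invertible.</p>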
<p>In 1977, <xref ref-type="bibr" rid="B26">Malinowski (1977a)</xref> developed the idea that there are two types of factors (or Latent Variables): &#x201c;a primary set which contains the true factors together with a mixture of error and a secondary set which consists of pure error&#x201d;. He also showed that there are three types of error: RE, real error; XE, extracted error; and IE, imbedded error, which can be calculated &#x201c;from a knowledge of the secondary eigenvalues, the size of the data matrix, and the number of factors involved&#x201d;, the secondary eigenvalues being those associated with pure&#x20;noise.</p>
<p>He considered that if <italic>k</italic>, the number of LVs associated with the &#x201c;pure data&#x201d;, is known, the real error is the difference between the pure data and the raw data, that is, the Residual Standard Deviation (RSD), given by:<disp-formula id="e17">
<mml:math id="m26">
<mml:mrow>
<mml:mi mathvariant="bold-italic">RE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">RSD</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi mathvariant="bold-italic">i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:msubsup>
<mml:mspace width="0.25em"/>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3bb;</mml:mi>
<mml:mi mathvariant="bold-italic">i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(17)</label>
</disp-formula>where <italic>n</italic> and <italic>c</italic> are, respectively, the numbers of rows and columns in the data matrix; <italic>k</italic> is the number of factors used to reproduce the data; and &#x3bb;<sub>i</sub> is the <italic>i</italic>th eigenvalue.</p>
<p>He stressed that &#x201c;it was assumed that <italic>n</italic>&#x20;&#x3e; <italic>c</italic>. If the reverse is true, i.e.,&#x20;<italic>n</italic>&#x20;&#x3c; <italic>c</italic>, then <italic>n</italic> and <italic>c</italic> must be interchanged in these equations&#x201d;.</p>
<p>He also proposed that the imbedded error (IE) is the difference between the pure data and the data approximated by the multivariate decomposition:<disp-formula id="e18">
<mml:math id="m27">
<mml:mrow>
<mml:mi mathvariant="bold-italic">IE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
<mml:mi mathvariant="bold-italic">RSD</mml:mi>
</mml:mrow>
</mml:math>
<label>(18)</label>
</disp-formula>and that the extracted error (XE) is the difference between the data approximated by the multivariate decomposition and the raw data:<disp-formula id="e19">
<mml:math id="m28">
<mml:mrow>
<mml:mi mathvariant="bold-italic">XE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
<mml:mi mathvariant="bold-italic">RSD</mml:mi>
</mml:mrow>
</mml:math>
<label>(19)</label>
</disp-formula>
</p>
<p>Malinowski then proposed another empirical criterion to determine the number of Latent Variables in a data matrix (<xref ref-type="bibr" rid="B25">Malinowski, 1977b</xref>). This indicator function (IND) is closely related to the error functions described above:<disp-formula id="e20">
<mml:math id="m29">
<mml:mrow>
<mml:mi mathvariant="bold-italic">IND</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">RSD</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(20)</label>
</disp-formula>
</p>
<p>As can be seen in <xref ref-type="fig" rid="F12">Figure&#x20;12</xref>, a plot of these criteria as a function of <italic>k</italic>, the number of LVs, can help to distinguish &#x201c;pure data&#x201d; from &#x201c;error data&#x201d;.</p>
<fig id="F12" position="float">
<label>FIGURE 12</label>
<caption>
<p>Evolution of the 4 criteria proposed by Malinowski after each deflation.</p>
</caption>
<graphic xlink:href="frans-01-754447-g012.tif"/>
</fig>
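<p>The four Malinowski functions (RE, IE, XE and IND) can be computed directly from the eigenvalues. The following Python/NumPy sketch is illustrative only: the function name <monospace>malinowski_criteria</monospace> is hypothetical, and it assumes that the &#x3bb;<sub>i</sub> are the eigenvalues of <bold>X</bold><sup>T</sup><bold>X</bold> (the squared singular values of <bold>X</bold>), which is a common convention but an assumption here. The example data are synthetic (a rank-2 matrix plus a small amount of noise).</p>
<preformat><![CDATA[
```python
import numpy as np


def malinowski_criteria(X):
    """Return k, RE (= RSD), IE, XE and IND for k = 1 .. c-1."""
    n, c = X.shape
    if n < c:                        # interchange n and c when n < c
        n, c = c, n
    # Assumption: lambda_i are the eigenvalues of X^T X (squared singular values)
    lam = np.linalg.svd(X, compute_uv=False) ** 2
    k = np.arange(1, c)
    tail = np.array([lam[i:].sum() for i in k])   # sum of lambda_(k+1) .. lambda_c
    rsd = np.sqrt(tail / (n * (c - k)))           # real error, RE = RSD
    ie = rsd * np.sqrt(k / c)                     # imbedded error
    xe = rsd * np.sqrt((c - k) / c)               # extracted error
    ind = rsd / (c - k) ** 2                      # indicator function
    return k, rsd, ie, xe, ind


# Synthetic example: rank-2 "pure data" plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 10))
X = X + 0.01 * rng.normal(size=X.shape)
k, rsd, ie, xe, ind = malinowski_criteria(X)
print(k[np.argmin(ind)])   # number of factors at which IND is minimal
```
]]></preformat>
<p>Because <italic>k</italic>/<italic>c</italic> and (<italic>c</italic>&#x20;&#x2212;&#x20;<italic>k</italic>)/<italic>c</italic> sum to 1, the three errors returned by this sketch satisfy RE<sup>2</sup> &#x3d; IE<sup>2</sup> &#x2b; XE<sup>2</sup>, which provides a simple consistency check.</p>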
<p>Several criteria have been proposed to estimate the correlation between matrices. Here 3 of them (<xref ref-type="bibr" rid="B8">Dray, 2008</xref>) will be used to compare the original <bold>X</bold> matrix with each deflated matrix, the assumption being that these correlations will decrease as the information is being removed.</p>
<p>The RV coefficient (<xref ref-type="bibr" rid="B10">Escoufier, 1973</xref>; <xref ref-type="bibr" rid="B32">Robert and Escoufier, 1976</xref>) is a measure of the closeness between two matrices and is defined by:<disp-formula id="e21">
<mml:math id="m30">
<mml:mrow>
<mml:mi mathvariant="bold-italic">RV</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(21)</label>
</disp-formula>
</p>
<p>In our case, <bold>X</bold>
<sub>
<bold>1</bold>
</sub> is the original matrix and <bold>X</bold>
<sub>
<bold>k</bold>
</sub> is the deflated matrix after removing <italic>k</italic>&#x20;LVs.</p>
<p>The numerator of the RV coefficient is the co-inertia criterion (COI) (<xref ref-type="bibr" rid="B7">Dray et&#x20;al., 2003</xref>) which is also a measurement of the link between the two matrices:<disp-formula id="e22">
<mml:math id="m31">
<mml:mrow>
<mml:mi mathvariant="bold-italic">COI</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(22)</label>
</disp-formula>
</p>
<p>According to <xref ref-type="bibr" rid="B31">Ramsay et&#x20;al. (1984)</xref> and <xref ref-type="bibr" rid="B20">Kiers et&#x20;al. (1994)</xref>, the most common matrix correlation coefficient is the &#x2018;inner product&#x2019; matrix correlation coefficient, which we will call RIP, defined as:<disp-formula id="e23">
<mml:math id="m32">
<mml:mrow>
<mml:mi mathvariant="bold-italic">RIP</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi mathvariant="bold-italic">trace</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
<mml:mi mathvariant="bold-italic">T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mi mathvariant="bold-italic">k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(23)</label>
</disp-formula>
</p>
<p>
<xref ref-type="fig" rid="F13">Figure&#x20;13</xref> shows the evolution of these 3 measures of the correlation between the original <bold>X</bold> matrix and the matrices after deflation.</p>
<fig id="F13" position="float">
<label>FIGURE 13</label>
<caption>
<p>Evolution of RV, COI and RIP of the <bold>X</bold>-matrix after each deflation.</p>
</caption>
<graphic xlink:href="frans-01-754447-g013.tif"/>
</fig>
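<p>The three matrix correlation measures can be sketched as follows in Python with NumPy. This is an illustrative sketch rather than the authors&#x2019; implementation: the function name <monospace>matrix_correlations</monospace> is hypothetical, RIP is computed here as trace(<bold>X</bold><sub>1</sub><sup>T</sup><bold>X</bold><sub>k</sub>) normalized by the Frobenius norms of the two matrices (the usual form of the inner-product matrix correlation, taken as an assumption), and the deflation in the example removes singular components of synthetic data as a simple stand-in for the PLS deflation.</p>
<preformat><![CDATA[
```python
import numpy as np


def matrix_correlations(X1, Xk):
    """Return RV, COI and RIP between two matrices with the same shape."""
    S1 = X1 @ X1.T
    Sk = Xk @ Xk.T
    coi = np.trace(S1 @ Sk)                                    # co-inertia criterion
    rv = coi / np.sqrt(np.trace(S1 @ S1) * np.trace(Sk @ Sk))  # RV coefficient
    rip = np.trace(X1.T @ Xk) / np.sqrt(
        np.trace(X1.T @ X1) * np.trace(Xk.T @ Xk)
    )                                                          # inner-product correlation
    return rv, coi, rip


# Synthetic example: remove k singular components (stand-in for PLS deflation)
rng = np.random.default_rng(0)
X1 = rng.normal(size=(20, 8))
U, s, Vt = np.linalg.svd(X1, full_matrices=False)
for k in range(1, 4):
    Xk = X1 - (U[:, :k] * s[:k]) @ Vt[:k]
    rv, coi, rip = matrix_correlations(X1, Xk)
    print(k, rv, coi, rip)
```
]]></preformat>
<p>In this sketch RV and RIP are bounded by 1 and equal 1 when the two matrices are identical, whereas COI is not normalized and therefore depends on the scale of the data.</p>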
</sec>
</sec>
<sec id="s5">
<title>Consensus Number of Latent Variables</title>
<p>Given all the criteria that can be calculated, one needs to find a consensus value for the number of LVs to retain in the PLS regression model. Some criteria (RMSEC and RMSECV in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>; <italic>R</italic><sup>2</sup> and <italic>Q</italic><sup>2</sup> in <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>; Wold&#x2019;s R, Osten&#x2019;s F and Cattell&#x2019;s RPV in <xref ref-type="fig" rid="F6">Figure&#x20;6</xref>) characterize the proximity of the predicted values to the observed values, but they can be subject to errors due to the particular choice of the calibration and test sets. Others characterize the regression coefficients (B-DW, B_Morph, B_VIF in <xref ref-type="fig" rid="F8">Figure&#x20;8</xref>), which should be neither excessively noisy nor of too high a magnitude (B_Var in <xref ref-type="fig" rid="F8">Figure&#x20;8</xref>). In addition, similar B-coefficient vectors should be extracted from subsets of the data matrix (mean of the correlations between regression coefficient vectors in <xref ref-type="fig" rid="F9">Figure&#x20;9</xref>). Still others characterize the noisy structure of the residual variability in the deflated matrices (mean and standard deviations of DW_X, Morph_X, Var_X and VIF_X in <xref ref-type="fig" rid="F10">Figure&#x20;10</xref>, as well as Malinowski&#x2019;s RE, IE, XE and IND in <xref ref-type="fig" rid="F12">Figure&#x20;12</xref>).</p>
<p>These deflated matrices should also tend towards a spherical structure (Bartlett_X, Hartley_X, Exner_X, KMO_X_rows, KMO_X_columns in <xref ref-type="fig" rid="F11">Figure&#x20;11</xref>). Likewise, as successive components are removed, the correlations between the original matrix and the deflated matrices should decrease (RV, COI and RIP in <xref ref-type="fig" rid="F13">Figure&#x20;13</xref>).</p>
<p>To create a consensus of all these different types of information, we propose applying a Principal Components Analysis (PCA) to the various criteria.</p>
<p>All the criteria were concatenated into a matrix in which each row corresponds to a number of Latent Variables and each column to a criterion. Criteria such as DW were used <italic>as is</italic>, while for criteria such as RMSECV the inverse was used, so that in all cases models with fewer LVs are associated with lower values.</p>
<p>The matrix was then z-transformed by subtracting the column means and dividing by the column standard deviations.</p>
<p>The resulting PC1-PC2 Scores plot and Loadings plot are presented in <xref ref-type="fig" rid="F14">Figure&#x20;14</xref>.</p>
<fig id="F14" position="float">
<label>FIGURE 14</label>
<caption>
<p>Plot of PC1-PC2 scores and loadings after applying a PCA to the standardized criteria.</p>
</caption>
<graphic xlink:href="frans-01-754447-g014.tif"/>
</fig>
<p>The scores plot shows a clear evolution from low-dimensionality to high-dimensionality models along PC1, reflecting the increase in all values as the number of LVs increases. The evolution along PC2 corresponds to another phenomenon, since the scores are highly positive for both small and large numbers of LVs, with a very clear negative minimum for the model with 6 LVs. The loadings plot shows an opposition between RMSEC, COI, std_Var_X, std_Morph_X and most of the criteria based on the B-coefficient vectors on the positive side, and mean_VIF_X, IE, RMSECV, Wold&#x2019;s R, Cattell&#x2019;s RPV, R2-Q2 and most of the criteria based on the deflated X matrices on the negative side. This contrast between the criteria based on the B-coefficient vectors and those based on the deflated X matrices shows their complementary nature.</p>
<p>Only the first 2&#xa0;PCs are presented because the scores on the subsequent PCs (corresponding to models with increasing numbers of LVs) did not show any interpretable structure.</p>
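<p>The consensus procedure described in this section (orienting the criteria, z-transforming the columns and applying a PCA) can be sketched as follows in Python with NumPy. This is a minimal sketch under stated assumptions and not the authors&#x2019; code: the function name <monospace>consensus_pca</monospace> and its parameters are hypothetical, &#x201c;inverse&#x201d; is interpreted as the reciprocal 1/<italic>x</italic>, the standard deviation uses <italic>n</italic>&#x20;&#x2212;&#x20;1 in the denominator, and the criteria matrix in the example is filled with random numbers purely to show the calling convention.</p>
<preformat><![CDATA[
```python
import numpy as np


def consensus_pca(C, invert_cols=(), n_pc=2):
    """PCA of a criteria matrix C (rows: numbers of LVs, columns: criteria)."""
    C = np.asarray(C, dtype=float).copy()
    for j in invert_cols:
        C[:, j] = 1.0 / C[:, j]          # assumption: "inverse" means the reciprocal
    # z-transform: subtract column means, divide by column standard deviations
    Z = (C - C.mean(axis=0)) / C.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_pc] * s[:n_pc]      # one row of scores per number of LVs
    loadings = Vt[:n_pc].T               # one row of loadings per criterion
    return scores, loadings


# Placeholder data: 15 models (1 to 15 LVs) described by 6 hypothetical criteria
rng = np.random.default_rng(0)
criteria = rng.uniform(0.5, 2.0, size=(15, 6))
scores, loadings = consensus_pca(criteria, invert_cols=[0, 3])
print(scores.shape, loadings.shape)
```
]]></preformat>
<p>With real criteria, the rows of <monospace>scores</monospace> would be plotted against each other (PC1 versus PC2) to look for the number of LVs at which the trajectory changes direction, and the rows of <monospace>loadings</monospace> would indicate which criteria drive each component.</p>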
</sec>
<sec sec-type="conclusion" id="s6">
<title>Conclusion</title>
<p>PLS regression is a high-performance calibration and prediction method to link predictive <bold>X</bold>-variables to the <bold>Y</bold>-variables to be predicted, even when variables are highly correlated and in very large numbers.</p>
<p>However, adjusting the number of latent variables in the model is crucial. This adjustment should be done on the basis of several criteria.</p>
<p>To do this, various methods can be used:</p>
<list list-type="simple">
<list-item>
<p>The most common method is to observe the evolution of the calibration errors (RMSEC) and the validation or cross-validation errors (RMSEV or RMSECV);</p>
</list-item>
<list-item>
<p>One can also examine the evolution of the vectors of regression coefficients, which also provides information on the role of the variables or spectral components in the model;</p>
</list-item>
<list-item>
<p>Finally, the evolution of the structure of the rows and columns, as well as the sphericity of the <bold>X</bold>-matrix after each deflation step, can be examined.</p>
</list-item>
</list>
<p>To combine these sources of information, we have proposed applying a Principal Components Analysis to a collection of criteria characterizing the different aspects of models obtained with increasing numbers of Latent Variables. The set of criteria used in the present study is far from exhaustive, and the efficacy of the method might be further improved by including others.</p>
<p>Matlab functions to calculate most of the non-trivial criteria are available at: <ext-link ext-link-type="uri" xlink:href="https://github.com/DNRutledge/LV_Criteria">https://github.com/DNRutledge/LV_Criteria</ext-link>.</p>
</sec>
</body>
<back>
<sec id="s7">
<title>Data Availability Statement</title>
<p>The data analyzed in this study is subject to the following licenses/restrictions: Interested readers should contact the authors of the article cited as producers of the data. Requests to access these datasets should be directed to nathalie.dupuy@imbe.fr.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>DR: Conception, calculations, writing; J-MR: Corrections, calculations, writing; ML: Corrections, calculations, writing.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/frans.2021.754447/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/frans.2021.754447/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Image2.TIF" id="SM1" mimetype="application/TIF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image1.TIF" id="SM2" mimetype="application/TIF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akaike</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>1969</year>). <article-title>Fitting Autoregressive Models for Prediction</article-title>. <source>Ann. Inst. Stat. Math.</source> <volume>21</volume>, <fpage>243</fpage>&#x2013;<lpage>247</lpage>. <pub-id pub-id-type="doi">10.1007/BF02532251</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bartlett</surname>
<given-names>M. S.</given-names>
</name>
</person-group> (<year>1951</year>). <article-title>The Effect of Standardization on a &#x3c7;<sup>2</sup> Approximation in Factor Analysis</article-title>. <source>Biometrika</source> <volume>38</volume> (<issue>3/4</issue>), <fpage>337</fpage>&#x2013;<lpage>344</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/38.3-4.337</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Bissett</surname>
<given-names>A. C.</given-names>
</name>
</person-group> (<year>2015</year>). <source>Improvements to PLS Methodology, PhD</source>. <publisher-loc>Manchester</publisher-loc>: <publisher-name>University of Manchester</publisher-name>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="http://www.manchester.ac.uk/escholar/uk-ac-man-scw:261814">http://www.manchester.ac.uk/escholar/uk-ac-man-scw:261814</ext-link>
</comment>. </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cattell</surname>
<given-names>R. B.</given-names>
</name>
</person-group> (<year>1966</year>). <article-title>The Scree Test for the Number of Factors</article-title>. <source>Multivariate Behav. Res.</source> <volume>1</volume> (<issue>2</issue>), <fpage>245</fpage>&#x2013;<lpage>276</lpage>. <pub-id pub-id-type="doi">10.1207/s15327906mbr0102_10</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cochran</surname>
<given-names>W. G.</given-names>
</name>
</person-group> (<year>1941</year>). <article-title>The Distribution of the Largest of a Set of Estimated Variances as a Fraction of Their Total</article-title>. <source>Ann. Eugenics</source> <volume>11</volume> (<issue>1</issue>), <fpage>47</fpage>&#x2013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-1809.1941.tb02271.x</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Denham</surname>
<given-names>M. C.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Choosing the Number of Factors in Partial Least Squares Regression: Estimating and Minimizing the Mean Squared Error of Prediction</article-title>. <source>J.&#x20;Chemometrics</source> <volume>14</volume> (<issue>4</issue>), <fpage>351</fpage>&#x2013;<lpage>361</lpage>. <pub-id pub-id-type="doi">10.1002/1099-128X(200007/08)14:4&#x3c;351::AID-CEM598&#x3e;3.0.CO;2-Q</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dray</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chessel</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Thioulouse</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Co-inertia Analysis and the Linking of Ecological Data Tables</article-title>. <source>Ecology</source> <volume>84</volume>, <fpage>3078</fpage>&#x2013;<lpage>3089</lpage>. <pub-id pub-id-type="doi">10.1890/03-0178</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dray</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>On the Number of Principal Components: A Test of Dimensionality Based on Measurements of Similarity between Matrices</article-title>. <source>Comput. Stat. Data Anal.</source> <volume>52</volume>, <fpage>2228</fpage>&#x2013;<lpage>2237</lpage>. <pub-id pub-id-type="doi">10.1016/j.csda.2007.07.015</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Durbin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Watson</surname>
<given-names>G. S.</given-names>
</name>
</person-group> (<year>1971</year>). <article-title>Testing for Serial Correlation in Least Squares Regression. III</article-title>. <source>Biometrika</source> <volume>58</volume>, <fpage>1</fpage>&#x2013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.2307/2334313</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Escoufier</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>1973</year>). <article-title>Le traitement des variables vectorielles</article-title>. <source>Biometrics</source> <volume>29</volume>, <fpage>751</fpage>&#x2013;<lpage>760</lpage>. <pub-id pub-id-type="doi">10.2307/2529140</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Exner</surname>
<given-names>O.</given-names>
</name>
</person-group> (<year>1966</year>). <article-title>Additive Physical Properties. I. General Relationships and Problems of Statistical Nature</article-title>. <source>Collect. Czech. Chem. Commun.</source> <volume>31</volume>, <fpage>3222</fpage>&#x2013;<lpage>3251</lpage>. <pub-id pub-id-type="doi">10.1135/cccc19663222</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Faber</surname>
<given-names>N. M.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Estimating the Uncertainty in Estimates of Root Mean Square Error of Prediction: Application to Determining the Size of an Adequate Test Set in Multivariate Calibration</article-title>. <source>Chemometrics Intell. Lab. Syst.</source> <volume>49</volume> (<issue>1</issue>), <fpage>79</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1016/S0169-7439(99)00027-1</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ferr&#xe9;</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2009</year>). &#x201c;<article-title>Regression Diagnostics</article-title>,&#x201d; in <source>Comprehensive Chemometrics</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Brown</surname>
<given-names>S. D.</given-names>
</name>
<name>
<surname>Tauler</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Walczak</surname>
<given-names>B.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>Elsevier</publisher-name>), <fpage>33</fpage>&#x2013;<lpage>89</lpage>. <comment>9780444527011</comment>. <pub-id pub-id-type="doi">10.1016/B978-044452701-1.00076-4</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Galtier</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Dupuy</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Le Dr&#xe9;au</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ollivier</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Pinatel</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Kister</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2007</year>). <article-title>Geographic Origins and Compositions of virgin Olive Oils Determinated by Chemometric Analysis of NIR Spectra</article-title>. <source>Analytica Chim. Acta</source> <volume>595</volume>, <fpage>136</fpage>&#x2013;<lpage>144</lpage>. <pub-id pub-id-type="doi">10.1016/j.aca.2007.02.033</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hartley</surname>
<given-names>H. O.</given-names>
</name>
</person-group> (<year>1950</year>). <article-title>The Maximum F-Ratio as a Short-Cut Test for Heterogeneity of Variance</article-title>. <source>Biometrika</source> <volume>37</volume>, <fpage>308</fpage>&#x2013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/37.3-4.308</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hastie</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Tibshirani</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2009</year>). <source>The Elements of Statistical Learning: Data Mining, Inference, and Prediction</source>. <edition>2nd ed.</edition> <publisher-loc>New York</publisher-loc>: <publisher-name>Springer</publisher-name>. </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kaiser</surname>
<given-names>H. F.</given-names>
</name>
</person-group> (<year>1970</year>). <article-title>A Second Generation Little Jiffy</article-title>. <source>Psychometrika</source> <volume>35</volume>, <fpage>401</fpage>&#x2013;<lpage>415</lpage>. <pub-id pub-id-type="doi">10.1007/BF02291817</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kaiser</surname>
<given-names>H. F.</given-names>
</name>
</person-group> (<year>1974</year>). <article-title>An Index of Factorial Simplicity</article-title>. <source>Psychometrika</source> <volume>39</volume>, <fpage>31</fpage>&#x2013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1007/BF02291575</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kassouf</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Jouan-Rimbaud Bouveresse</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Rutledge</surname>
<given-names>D. N.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Determination of the Optimal Number of Components in Independent Components Analysis</article-title>. <source>Talanta</source> <volume>179</volume>, <fpage>538</fpage>&#x2013;<lpage>545</lpage>. <pub-id pub-id-type="doi">10.1016/j.talanta.2017.11.051</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kiers</surname>
<given-names>H. A. L.</given-names>
</name>
<name>
<surname>Cl&#xe9;roux</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Ten Berge</surname>
<given-names>J.&#x20;M. F.</given-names>
</name>
</person-group> (<year>1994</year>). <article-title>Generalized Canonical Analysis Based on Optimizing Matrix Correlations and a Relation with IDIOSCAL</article-title>. <source>Comput. Stat. Data Anal.</source> <volume>18</volume>, <fpage>331</fpage>&#x2013;<lpage>340</lpage>. <pub-id pub-id-type="doi">10.1016/0167-9473(94)90067-1</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kindsvater</surname>
<given-names>J.&#x20;H.</given-names>
</name>
<name>
<surname>Weiner</surname>
<given-names>P. H.</given-names>
</name>
<name>
<surname>Klingen</surname>
<given-names>T. J.</given-names>
</name>
</person-group> (<year>1974</year>). <article-title>Correlation of Retention Volumes of Substituted Carboranes with Molecular Properties in High Pressure Liquid Chromatography Using Factor Analysis</article-title>. <source>Anal. Chem.</source> <volume>46</volume>, <fpage>982</fpage>&#x2013;<lpage>988</lpage>. <pub-id pub-id-type="doi">10.1021/ac60344a032</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kr&#xe4;mer</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Sugiyama</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>The Degrees of Freedom of Partial Least Squares Regression</article-title>. <source>J.&#x20;Am. Stat. Assoc.</source> <volume>106</volume> (<issue>494</issue>), <fpage>697</fpage>&#x2013;<lpage>705</lpage>. <pub-id pub-id-type="doi">10.1198/jasa.2011.tm10107</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lesnoff</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Roger</surname>
<given-names>J.&#x20;M.</given-names>
</name>
<name>
<surname>Rutledge</surname>
<given-names>D. N.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Monte Carlo Methods for Estimating Mallows&#x27;s Cp and AIC Criteria for PLSR Models. Illustration on Agronomic Spectroscopic NIR Data</article-title>. <source>J.&#x20;Chemometrics</source>. <pub-id pub-id-type="doi">10.1002/cem.3369</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>E. B.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>Model Selection for Partial Least Squares Regression</article-title>. <source>Chemometrics Intell. Lab. Syst.</source> <volume>64</volume>, <fpage>79</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1016/S0169-7439(02)00051-5</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Malinowski</surname>
<given-names>E. R.</given-names>
</name>
</person-group> (<year>1977b</year>). <article-title>Determination of the Number of Factors and the Experimental Error in a Data Matrix</article-title>. <source>Anal. Chem.</source> <volume>49</volume> (<issue>4</issue>), <fpage>612</fpage>&#x2013;<lpage>617</lpage>. <pub-id pub-id-type="doi">10.1021/ac50012a027</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Malinowski</surname>
<given-names>E. R.</given-names>
</name>
</person-group> (<year>1977a</year>). <article-title>Theory of Error in Factor Analysis</article-title>. <source>Anal. Chem.</source> <volume>49</volume> (<issue>4</issue>), <fpage>606</fpage>&#x2013;<lpage>612</lpage>. <pub-id pub-id-type="doi">10.1021/ac50012a026</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mallows</surname>
<given-names>C. L.</given-names>
</name>
</person-group> (<year>1973</year>). <article-title>Some Comments on Cp</article-title>. <source>Technometrics</source> <volume>15</volume> (<issue>4</issue>), <fpage>661</fpage>&#x2013;<lpage>675</lpage>. <pub-id pub-id-type="doi">10.1080/00401706.1973.10489103</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marquardt</surname>
<given-names>D. W.</given-names>
</name>
</person-group> (<year>1970</year>). <article-title>Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation</article-title>. <source>Technometrics</source> <volume>12</volume> (<issue>3</issue>), <fpage>591</fpage>&#x2013;<lpage>612</lpage>. <pub-id pub-id-type="doi">10.1080/00401706.1970.10488699</pub-id> <pub-id pub-id-type="doi">10.2307/1267205</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Meloun</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>&#x10c;apek</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Mik&#x161;&#xed;k</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Brereton</surname>
<given-names>R. G.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Critical Comparison of Methods Predicting the Number of Components in Spectroscopic Data</article-title>. <source>Analytica Chim. Acta</source> <volume>423</volume>, <fpage>51</fpage>&#x2013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1016/S0003-2670(00)01100-4</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Osten</surname>
<given-names>D. W.</given-names>
</name>
</person-group> (<year>1988</year>). <article-title>Selection of Optimal Regression Models via Cross-Validation</article-title>. <source>J.&#x20;Chemometrics</source> <volume>2</volume>, <fpage>39</fpage>&#x2013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.1002/cem.1180020106</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ramsay</surname>
<given-names>J.&#x20;O.</given-names>
</name>
<name>
<surname>Ten Berge</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Styan</surname>
<given-names>G. P. H.</given-names>
</name>
</person-group> (<year>1984</year>). <article-title>Matrix Correlation</article-title>. <source>Psychometrika</source> <volume>49</volume>, <fpage>403</fpage>&#x2013;<lpage>423</lpage>. <pub-id pub-id-type="doi">10.1007/BF02306029</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Robert</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Escoufier</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>1976</year>). <article-title>A Unifying Tool for Linear Multivariate Statistical Methods: The RV-Coefficient</article-title>. <source>Appl. Stat.</source> <volume>25</volume>, <fpage>257</fpage>&#x2013;<lpage>265</lpage>. <pub-id pub-id-type="doi">10.2307/2347233</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rutledge</surname>
<given-names>D. N.</given-names>
</name>
<name>
<surname>Barros</surname>
<given-names>A. S.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>The Durbin-Watson Statistic as a Morphological Estimator of Information Content</article-title>. <source>Analytica Chim. Acta</source> <volume>446</volume>, <fpage>279</fpage>&#x2013;<lpage>294</lpage>. <pub-id pub-id-type="doi">10.1016/S0003-2670(01)01555-0</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Vapnik</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2006</year>). <source>Estimation of Dependences Based on Empirical Data</source>. <edition>2nd ed.</edition> <publisher-loc>New York</publisher-loc>: <publisher-name>Springer</publisher-name>. </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J.-H.</given-names>
</name>
<name>
<surname>Liang</surname>
<given-names>Y.-Z.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>J.-H.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>R.-Q.</given-names>
</name>
</person-group> (<year>1996</year>). <article-title>Local Chemical Rank Estimation of Two-Way Data in the Presence of Heteroscedastic Noise: A Morphological Approach</article-title>. <source>Chemometrics Intell. Lab. Syst.</source> <volume>32</volume>, <fpage>265</fpage>&#x2013;<lpage>272</lpage>. <pub-id pub-id-type="doi">10.1016/0169-7439(95)00072-0</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wiklund</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Nilsson</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Eriksson</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Sj&#xf6;str&#xf6;m</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wold</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Faber</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>A Randomization Test for PLS Component Selection</article-title>. <source>J.&#x20;Chemometrics</source> <volume>21</volume>, <fpage>427</fpage>&#x2013;<lpage>439</lpage>. <pub-id pub-id-type="doi">10.1002/cem.1086</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wold</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>1978</year>). <article-title>Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models</article-title>. <source>Technometrics</source> <volume>20</volume> (<issue>4</issue>), <fpage>397</fpage>&#x2013;<lpage>405</lpage>. <pub-id pub-id-type="doi">10.1080/00401706.1978.10489693</pub-id> <pub-id pub-id-type="doi">10.2307/1267639</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>