<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2023.1247181</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Improving power of genome-wide association studies via transforming ordinal phenotypes into continuous phenotypes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" equal-contrib="yes">
<name>
<surname>Yang</surname>
<given-names>Ming</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn003">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2556157"/>
</contrib>
<contrib contrib-type="author" equal-contrib="yes">
<name>
<surname>Wen</surname>
<given-names>Yangjun</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn003">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/576120"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zheng</surname>
<given-names>Jinchang</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2556024"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>Jin</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/570781"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhao</surname>
<given-names>Tuanjie</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/345840"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Feng</surname>
<given-names>Jianying</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/788544"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Key Laboratory of Biology and Genetics Improvement of Soybean, Ministry of Agriculture/Zhongshan Biological Breeding Laboratory (ZSBBL)/National Innovation Platform for Soybean Breeding and Industry-Education Integration/State Key Laboratory of Crop Genetics &amp; Germplasm Enhancement and Utilization/College of Agriculture, Nanjing Agricultural University</institution>, <addr-line>Nanjing</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>College of Science, Nanjing Agricultural University</institution>, <addr-line>Nanjing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Shang-Qian Xie, University of Idaho, United States</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Jia Wen, University of North Carolina at Chapel Hill, United States; Suhong Bu, South China Agricultural University, China; Shibo Wang, University of California, Riverside, United States</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Jianying Feng, <email xlink:href="mailto:fengjianying@njau.edu.cn">fengjianying@njau.edu.cn</email>
</p>
</fn>
<fn fn-type="equal" id="fn003">
<p>&#x2020;These authors have contributed equally to this work</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>11</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1247181</elocation-id>
<history>
<date date-type="received">
<day>25</day>
<month>06</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>18</day>
<month>10</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Yang, Wen, Zheng, Zhang, Zhao and Feng</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Yang, Wen, Zheng, Zhang, Zhao and Feng</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>Ordinal traits are important complex traits in crops, and genome-wide association studies (GWAS) are widely used to mine the genes underlying them. At present, ordinal traits are mainly analyzed with GWAS methods for continuous quantitative traits (C-GWAS) or with single-locus association methods for ordinal traits. However, the detection power of both approaches is low.</p>
</sec>
<sec>
<title>Methods</title>
<p>To address this issue, we proposed a new method, named MTOTC, in which hierarchical data of ordinal traits are transformed into continuous phenotypic data (CPData).</p>
</sec>
<sec>
<title>Results</title>
<p>Then, FASTmrMLM, one C-GWAS method, was used to conduct GWAS for CPData. The simulation studies showed that MTOTC+FASTmrMLM outperformed the classical methods for ordinal traits with four or fewer hierarchical levels. In addition, when MTOTC was combined with FASTmrEMMA, mrMLM, ISIS EM-BLASSO, pLARmEB, or pKWmEB, relatively high power and a low false-positive rate in QTN detection were also observed. Subsequently, MTOTC was applied to the hierarchical data of soybean salt-alkali tolerance. More significant QTNs were detected when MTOTC was combined with any of the above six C-GWAS methods.</p>
</sec>
<sec>
<title>Discussion</title>
<p>Accordingly, the new method increases the choices of the GWAS methods for ordinal traits and helps to mine the genes for ordinal traits in resource populations.</p>
</sec>
</abstract>
<kwd-group>
<kwd>ordinal trait</kwd>
<kwd>genome-wide association study</kwd>
<kwd>salt-alkali tolerance</kwd>
<kwd>soybean</kwd>
<kwd>hierarchical data</kwd>
</kwd-group>
<counts>
<fig-count count="6"/>
<table-count count="3"/>
<equation-count count="11"/>
<ref-count count="31"/>
<page-count count="14"/>
<word-count count="7468"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-in-acceptance</meta-name>
<meta-value>Technical Advances in Plant Science</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<label>1</label>
<title>Introduction</title>
<p>Hierarchical data (HData), the phenotypic data of ordinal traits, are commonly used to describe many important traits in crop germplasm resources. They include count data for quantitative traits and graded data for resistance traits, such as the number of main stem nodes (<xref ref-type="bibr" rid="B3">Chang et&#xa0;al., 2018</xref>), the number of branches (<xref ref-type="bibr" rid="B15">Shim et&#xa0;al., 2019</xref>), and disease resistance (<xref ref-type="bibr" rid="B10">Megerssa et&#xa0;al., 2020</xref>). Ordinal traits are important in crop breeding and have a considerable impact on crop yield and quality. Genome-wide association studies (GWAS) for ordinal traits can promote the mining of favorable genes, which plays a key role in molecular design breeding and gene cloning. <xref ref-type="bibr" rid="B4">Cuevas et&#xa0;al. (2018)</xref> divided the degree of infection of anthracnose-inoculated sorghum leaves into five levels and identified three loci for anthracnose resistance on chromosome 5 using GWAS. <xref ref-type="bibr" rid="B3">Chang et&#xa0;al. (2018)</xref> detected three loci significantly associated with &#x201c;the number of nodes on the main stem&#x201d; in 368 soybean cultivars with 62,423 SNPs. Meanwhile, <xref ref-type="bibr" rid="B15">Shim et&#xa0;al. (2019)</xref> identified five quantitative trait nucleotides (QTNs) for soybean branch number via GWAS and linkage analysis and mined a candidate gene, <italic>Glyma.06g210600</italic>.</p>
<p>Ordinal traits are discrete traits that are controlled by multiple genes. Because their phenotypic data are hierarchical, non-continuous, and carry relatively limited information, GWAS for ordinal traits is more complex than that for continuous quantitative traits. The threshold model represents a reasonable method for the genetic analysis of ordinal traits, and most association mapping methods are developed under this framework (<xref ref-type="bibr" rid="B27">Xu et&#xa0;al., 2005</xref>; <xref ref-type="bibr" rid="B11">Osval et&#xa0;al., 2015</xref>). Generalized linear models are based on the threshold model and link phenotypic data with latent variables through a link function. They are widely used for the genetic analysis of ordinal traits and can deal with non-normal data (<xref ref-type="bibr" rid="B5">Feng et&#xa0;al., 2013</xref>; <xref ref-type="bibr" rid="B16">Song et&#xa0;al., 2016</xref>; <xref ref-type="bibr" rid="B22">Wang et&#xa0;al., 2018</xref>). The logistic regression model is another classical approach to association studies of ordinal traits (<xref ref-type="bibr" rid="B20">Tan et&#xa0;al., 2007</xref>; <xref ref-type="bibr" rid="B7">Hoggart et&#xa0;al., 2008</xref>; <xref ref-type="bibr" rid="B25">Wu et&#xa0;al., 2009</xref>; <xref ref-type="bibr" rid="B8">Jiang et&#xa0;al., 2021</xref>). When the sample size is limited, the application of a set-valued (SV) system model can improve the statistical power and the accuracy of parameter estimation (<xref ref-type="bibr" rid="B2">Bi et&#xa0;al., 2015</xref>).
Bayesian and maximum likelihood methods are both widely used for parameter estimation in GWAS (<xref ref-type="bibr" rid="B27">Xu et&#xa0;al., 2005</xref>; <xref ref-type="bibr" rid="B7">Hoggart et&#xa0;al., 2008</xref>; <xref ref-type="bibr" rid="B22">Wang et&#xa0;al., 2018</xref>), while several studies have also employed non-parametric methods for association analysis of ordinal traits (<xref ref-type="bibr" rid="B17">Sun et&#xa0;al., 2016</xref>; <xref ref-type="bibr" rid="B23">Wang et&#xa0;al., 2017</xref>; <xref ref-type="bibr" rid="B6">He and Kulminski, 2020</xref>). However, most of these methods are either single-locus or suitable only for binary traits, and they have rarely been applied in crops. GWAS methods for continuous quantitative traits and single-locus methods are currently the main approaches used for association analysis of ordinal traits; however, both have low power in QTN detection.</p>
<p>Accordingly, in this study, we proposed a method for transforming ordinal phenotypes into continuous phenotypes (MTOTC). First, the hierarchical phenotypic data for ordinal traits (HData) were transformed into continuous phenotypic data (CPData). Subsequently, FASTmrMLM (<xref ref-type="bibr" rid="B19">Tamba and Zhang, 2018</xref>), one GWAS method suitable for continuous quantitative traits, was used to perform GWAS for CPData. In Monte Carlo simulation studies, we assessed the feasibility of the new method in terms of statistical power, false-positive rate in QTN detection, and the accuracy of the estimated QTN effects and positions, and determined the number of hierarchical levels for which MTOTC+FASTmrMLM is suitable. The new method was further validated by re-analyzing the salt-alkali tolerance traits of the soybean germplasm resource population of <xref ref-type="bibr" rid="B29">Zhang et&#xa0;al. (2014)</xref> and <xref ref-type="bibr" rid="B31">Zhou et&#xa0;al. (2015)</xref>. This study provides more choices for association analysis of ordinal traits and helps to identify favorable genes for important complex traits in crops.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Theory and methods</title>
<p>Here, we propose a method, named MTOTC, that transforms the discrete hierarchical data (HData) of ordinal traits into continuous phenotypic data (CPData). GWAS methods for continuous quantitative traits (C-GWAS) are then used to analyze the transformed data. The new method is described below.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Genetic mapping population</title>
<p>In the Monte Carlo simulation studies, 199 <italic>Arabidopsis thaliana</italic> lines harboring 10,000 SNPs with a minor allele frequency &gt;0.1 (<xref ref-type="bibr" rid="B1">Atwell et&#xa0;al., 2010</xref>) were selected as the genetic mapping population. For the real data analysis, the population comprised 286 soybean cultivars assessed for salt-alkali tolerance. The phenotypic data consisted of the main root length index measured in 2009 and 2010 (<xref ref-type="bibr" rid="B29">Zhang et&#xa0;al., 2014</xref>), and the marker data were the 54,296 high-quality SNP markers reported by <xref ref-type="bibr" rid="B31">Zhou et&#xa0;al. (2015)</xref>.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Method for transforming ordinal phenotypes into continuous phenotypes</title>
<p>To transform ordinal phenotypes into continuous phenotypes, we proposed the MTOTC method, which has two steps. First, the Chi-square test and logistic regression were used to select the SNPs that were significantly related to the trait. Second, these SNPs and the ordinal phenotypes were used to construct a multi-locus model, an empirical Bayesian method was used to estimate the SNP effects, and the effect estimates were used to predict the continuous phenotypic data (CPData). The predicted CPData are then analyzed by a C-GWAS method, such as FASTmrMLM (<xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>).</p>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>Technology framework of the MTOTC method in this study.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1247181-g001.tif"/>
</fig>
<sec id="s2_2_1">
<label>2.2.1</label>
<title>The Chi-square test and logistic regression</title>
<p>The Chi-square test in R 4.0.5 (function &#x201c;chisq.test&#x201d;) was used to scan the SNPs across the whole genome one marker at a time (<italic>P</italic>-value &#x2264;0.05). To reduce interference from falsely retained SNPs and improve detection accuracy, logistic regression was used as a secondary screening step. Logistic regression was performed using the functions &#x201c;glm&#x201d; (two hierarchical levels) and &#x201c;polr&#x201d; (more than two hierarchical levels), again with a <italic>P</italic>-value &#x2264;0.05. The aim of this step was to further eliminate SNPs that were not associated with the trait, thereby simplifying the iterations in the subsequent multi-locus genetic model.</p>
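<p>As a minimal illustration of the first screening stage, the following Python sketch computes a Pearson Chi-square test of independence between SNP genotypes and ordinal levels and keeps the SNPs with a <italic>P</italic>-value &#x2264;0.05. It is not the R code used in this study: the function names (chi2_sf, chi_square_test, first_stage_screen) are hypothetical, no continuity correction is applied, and the logistic-regression stage is omitted.</p>
<preformat>
```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if not x > 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite correction sum.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total


def chi_square_test(genotypes, levels):
    """Pearson Chi-square test of independence (no continuity correction)."""
    n = len(genotypes)
    row, col = Counter(genotypes), Counter(levels)
    observed = Counter(zip(genotypes, levels))
    stat = 0.0
    for g in row:
        for lev in col:
            expected = row[g] * col[lev] / n
            stat += (observed[(g, lev)] - expected) ** 2 / expected
    df = (len(row) - 1) * (len(col) - 1)
    p_value = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, p_value


def first_stage_screen(snp_matrix, levels, threshold=0.05):
    """Return indices of SNPs whose Chi-square P-value is at most `threshold`."""
    kept = []
    for j, genotypes in enumerate(snp_matrix):
        _, p_value = chi_square_test(genotypes, levels)
        if threshold >= p_value:
            kept.append(j)
    return kept
```
</preformat>
<p>Here snp_matrix is assumed to hold one genotype list per SNP, and levels holds the ordinal score of each line in the same order.</p>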
</sec>
<sec id="s2_2_2">
<label>2.2.2</label>
<title>Multi-locus genetic model</title>
<p>Based on the potentially associated markers identified in the above-described initial screening, a multi-locus model was established to transform ordinal phenotypes into continuous phenotypes. The linear model is expressed as:</p>
<disp-formula>
<label>(1)</label>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>W</mml:mi>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>+</mml:mo>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mi>&#x3f5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>y</italic> represents <inline-formula>
<mml:math display="inline" id="im1">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> ordinal phenotype vector, with <italic>n</italic> representing sample size; <inline-formula>
<mml:math display="inline" id="im2">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>represents <inline-formula>
<mml:math display="inline" id="im3">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> matrix of covariates (fixed effects), including a column vector of <bold>1</bold> and the population structure; <italic>&#x3b1;</italic> represents the <inline-formula>
<mml:math display="inline" id="im4">
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>vector of fixed effects, including intercept; <inline-formula>
<mml:math display="inline" id="im5">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <italic>&#x3b2;</italic><sub><italic>i</italic></sub> represent, respectively, the <inline-formula>
<mml:math display="inline" id="im6">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>genotype vector and effect of the <italic>i-</italic>th potential associated SNP; <italic>q</italic> represents the number of SNPs selected in the initial screening step; <inline-formula>
<mml:math display="inline" id="im7">
<mml:mrow>
<mml:mi>&#x3f5;</mml:mi>
<mml:mo>~</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mtext>MVN</mml:mtext>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>e</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:msub>
<mml:mtext>I</mml:mtext>
<mml:mi>n</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>represents <inline-formula>
<mml:math display="inline" id="im8">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>error vector.</p>
<p>The population structure <bold>Q</bold> matrix used in the linear model was calculated using Structure software (<xref ref-type="bibr" rid="B12">Pritchard et&#xa0;al., 2000</xref>). Based on the <bold>Q</bold> matrix, the population is divided into corresponding subgroups, and the optimal subgroup number <bold>K</bold> value is determined according to the corresponding standard, yielding the final <bold>Q</bold> matrix. The optimal value of the <italic>Arabidopsis</italic> population structure was calculated as <bold>K</bold>=2, and the optimal value of the salt-alkali tolerant soybean population structure in the actual study was <bold>K</bold>=3.</p>
</sec>
<sec id="s2_2_3">
<label>2.2.3</label>
<title>Parameter estimation</title>
<p>In the second step of the new method, a multi-locus linear mixed model for transforming ordinal phenotypes into continuous phenotypes was established based on the empirical Bayesian algorithm (<xref ref-type="bibr" rid="B26">Xu, 2010</xref>). Significant loci were then selected at a threshold of LOD = 3.0.</p>
<p>In model (1), each effect <inline-formula>
<mml:math display="inline" id="im9">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is assigned the following normal prior, and its variance is assigned a hyperprior with hyperparameters <italic>&#x3c4;</italic> and <italic>&#x3c9;</italic>:</p>
<disp-formula>
<mml:math display="block" id="M10">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>|</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>|</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M2">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>|</mml:mo>
<mml:mi>&#x3c4;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x221d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>&#x3c4;</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mi>x</mml:mi>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mi>&#x3c9;</mml:mi>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>The parameters were estimated by the empirical Bayes method together with the Newton&#x2013;Raphson method, as follows:</p>
<disp-formula>
<mml:math display="block" id="M11">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mtext>&#x3c9;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x3c4;</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M12">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msup>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
</mml:msup>
<mml:msup>
<mml:mi>W</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msup>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mtext>y</mml:mtext>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M13">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>e</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>W</mml:mi>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>W</mml:mi>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M3">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>W</mml:mi>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where</p>
<disp-formula>
<mml:math display="block" id="M14">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>t</mml:mi>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M15">
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>I</mml:mi>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M16">
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>&#x3c4;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<mml:math display="block" id="M4">
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:mtext>I</mml:mtext>
<mml:msubsup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>e</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:math>
</disp-formula>
<p>The empirical Bayesian estimates of the SNP effects were thus obtained from the multi-locus model (1), based on the selected significant SNP markers and the ordinal phenotype. These effect estimates were then used to predict the phenotype, which yielded the continuous phenotypic data (CPData) of the ordinal trait.</p>
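<p>For intuition, consider the special case of a single candidate SNP (<italic>q</italic> = 1) with genotype vector <italic>x</italic> and residual <italic>r</italic> = <italic>y</italic> &#x2212; <italic>W&#x3b1;</italic>. With <italic>V</italic> = <italic>xx</italic><sup>T</sup><italic>&#x3c3;</italic><sub>i</sub><sup>2</sup> + I<italic>&#x3c3;</italic><sub>e</sub><sup>2</sup>, the Sherman&#x2013;Morrison identity reduces the posterior mean and variance given above to the scalars E(<italic>&#x3b2;</italic>) = <italic>&#x3c3;</italic><sub>i</sub><sup>2</sup><italic>x</italic><sup>T</sup><italic>r</italic>/(<italic>&#x3c3;</italic><sub>e</sub><sup>2</sup> + <italic>&#x3c3;</italic><sub>i</sub><sup>2</sup><italic>x</italic><sup>T</sup><italic>x</italic>) and Var(<italic>&#x3b2;</italic>) = <italic>&#x3c3;</italic><sub>i</sub><sup>2</sup><italic>&#x3c3;</italic><sub>e</sub><sup>2</sup>/(<italic>&#x3c3;</italic><sub>e</sub><sup>2</sup> + <italic>&#x3c3;</italic><sub>i</sub><sup>2</sup><italic>x</italic><sup>T</sup><italic>x</italic>). The Python sketch below evaluates these expressions for fixed variance components and forms a predicted continuous value as the fixed part plus the genotype multiplied by the estimated effect. The function names, the toy data, the fixed variance values, and the prediction rule are assumptions made for illustration; the iterative updating of the variance components is not shown.</p>
<preformat>
```python
def posterior_effect(x, residual, sigma_i2, sigma_e2):
    """Posterior mean and variance of one SNP effect for fixed variance components."""
    xtx = sum(v * v for v in x)
    xtr = sum(v * r for v, r in zip(x, residual))
    denom = sigma_e2 + sigma_i2 * xtx
    mean = sigma_i2 * xtr / denom
    var = sigma_i2 * sigma_e2 / denom
    return mean, var


def predict_continuous(fixed_part, x, beta_mean):
    """Predicted phenotype: fixed part plus genotype times the estimated effect."""
    return [f + v * beta_mean for f, v in zip(fixed_part, x)]


# Toy data (hypothetical): four lines, genotypes coded 1 / -1, ordinal scores 1..4.
x = [1, -1, 1, -1]
y = [4, 1, 3, 2]
intercept = sum(y) / len(y)  # stands in for the fixed part W * alpha
residual = [yi - intercept for yi in y]
beta_mean, beta_var = posterior_effect(x, residual, sigma_i2=1.0, sigma_e2=1.0)
cpdata = predict_continuous([intercept] * len(y), x, beta_mean)
```
</preformat>
<p>With a single SNP the predicted values take only two distinct values; in this sketch, spread in the predicted phenotype arises only when the contributions of several retained SNPs are summed.</p>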
</sec>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>GWAS with MTOTC method for ordinal trait</title>
<p>Once continuous phenotypic data had been obtained by the MTOTC method, a C-GWAS method could be used to detect significant loci. In this work, FASTmrMLM was used: loci significantly associated with ordinal traits were detected by FASTmrMLM using the continuous phenotypic data and the potentially associated markers identified in the initial screening. This GWAS procedure is henceforth referred to as MTOTC+FASTmrMLM. To further verify the feasibility of MTOTC, five other C-GWAS methods (FASTmrEMMA, mrMLM, ISIS EM-BLASSO, pLARmEB, and pKWmEB) were also combined with MTOTC for ordinal traits.</p>
</sec>
<sec id="s2_4">
<label>2.4</label>
<title>Monte Carlo simulation datasets for ordinal trait</title>
<p>We conducted six simulation studies to evaluate the feasibility of the new method. For each study, the loci 278, 2143, 2054, 3698, 1716, 6178, and 8501, located on chromosomes 1, 2, 2, 2, 1, 4, and 5, respectively, were selected as the causal loci related to the simulated trait. Three types of phenotypic data were used in the simulation experiment: original data (OData), which were continuous and generated by Monte Carlo simulation; HData, which were generated from the OData according to specific distribution proportions (i.e., the classification proportions of the phenotype distribution); and CPData, which were generated from the HData by MTOTC. Then, FASTmrMLM, one multi-locus C-GWAS algorithm, was used to conduct GWAS for CPData.</p>
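<p>The conversion of OData into HData according to given distribution proportions can be sketched as follows. This is a hedged illustration only: the rank-based cutting rule, the function name to_hierarchical, the random seed, and the equal proportions are assumptions, since the exact discretization rule is not spelled out in the text.</p>
<preformat>
```python
import random


def to_hierarchical(values, proportions):
    """Assign ordinal levels 1..K so that level k holds about proportions[k-1] of the sample."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from smallest to largest value
    levels = [0] * n
    start, cumulative = 0, 0.0
    for level, share in enumerate(proportions, start=1):
        cumulative += share
        # The last level takes every remaining individual.
        end = n if level == len(proportions) else round(cumulative * n)
        for i in order[start:end]:
            levels[i] = level
        start = end
    return levels


# Hypothetical example: 199 continuous values cut into four equally sized levels.
rng = random.Random(2023)
odata = [rng.gauss(0.0, 1.0) for _ in range(199)]
hdata = to_hierarchical(odata, [0.25, 0.25, 0.25, 0.25])
```
</preformat>
<p>Changing the list of proportions gives unequal class sizes, which is one way to mimic different classification proportions of the phenotype distribution.</p>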
</sec>
</sec>
<sec id="s3" sec-type="results">
<label>3</label>
<title>Results</title>
<sec id="s3_1">
<label>3.1</label>
<title>Monte Carlo simulation studies</title>
<sec id="s3_1_1">
<label>3.1.1</label>
<title>Threshold value in the initial screening</title>
<p>To determine the most suitable threshold value for the Chi-square test and logistic regression in the initial screening, four probability thresholds (0.0001 [i.e., 1/SNP number], 0.01, 0.05, and 0.10) were set for the Chi-square test in the first simulation study, while three probability thresholds (0.0001 [i.e., 1/SNP number], 0.01, and 0.05) were set for logistic regression. The Chi-square test can eliminate a large number of SNPs that are not significantly related to a given phenotype. However, the simulation study showed that some SNPs screened in the above Chi-square test (those with a <italic>P</italic>-value &gt;0.98 and an unusually large absolute value of effect estimate in logistic regression) were not truly related to the phenotype and interfered greatly with subsequent association analysis. Therefore, to further improve the quality of the screened significantly related SNPs and detection accuracy, logistic regression was used as a secondary screening method for SNPs in MTOTC.</p>
<p>In the Chi-square test, the single-locus retention rate decreased as the threshold <italic>P</italic>-value decreased (<xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2A</bold>
</xref>). For instance, at thresholds of 0.05 and 0.10, the single-locus retention rates at loci 278 and 2143 were 96.62%~99.68% and differed little between the two thresholds. When the threshold was 0.01, the single-locus retention rate began to decrease, and when the threshold was 0.0001, the retention rate dropped to between 59.06% and 68.45%. Moreover, the total retention rate (i.e., the proportion of retained loci among the total loci after chi-square test screening) was the lowest when the threshold was 0.0001, followed by 0.01, 0.05, and 0.10 (<xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3A</bold>
</xref>).</p>
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>The effect of threshold value on the single-locus retention rate after the initial screening. <bold>(A)</bold> is the single-locus retention rate after chi-square test screening; <bold>(B)</bold> is the single-locus retention rate after logistic regression screening.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1247181-g002.tif"/>
</fig>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>The effect of threshold value on the total retention rate after the initial screening. <bold>(A)</bold> is the total retention rate in chi-square test; <bold>(B)</bold> is the total retention rate in logistic regression.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1247181-g003.tif"/>
</fig>
<p>In logistic regression after the Chi-square test, the single-locus retention rate was the highest when the <italic>P</italic>-value was 0.05 (<xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2B</bold>
</xref>). For instance, the retention rates of loci 278 and 2143 were as high as 97.56%~99.68% when the <italic>P</italic>-value was 0.01 or 0.05; when the <italic>P</italic>-value was 0.0001, the retention rate dropped to between 60.51% and 69.88%. Additionally, the total retention rate was the lowest (only 0.22%) when the <italic>P</italic>-value was 0.0001, followed by 0.01 and 0.05 (<xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3B</bold>
</xref>).</p>
<p>Owing to the excessively low single-locus retention rates at the <italic>P</italic>-values of 0.01 and 0.0001, these two <italic>P</italic>-values were unsuitable as thresholds for initial screening. When the <italic>P</italic>-value was 0.10, the total retention rate was high, that is, more loci not associated with the trait were retained, which did not help to simplify the model. Therefore, <italic>P</italic>=0.05, which is commonly used in statistics, was selected as the probability threshold for the Chi-square test and logistic regression of the initial screening in this study. In addition, we investigated the effect of the threshold value on the single-locus and total retention rates under different distribution proportions in binary data, and similar results were observed.</p>
</sec>
<sec id="s3_1_2">
<label>3.1.2</label>
<title>MTOTC+FASTmrMLM displayed greater power than other classical mapping methods</title>
<p>In Monte Carlo simulation studies, the GWAS results of hierarchical data using MTOTC+FASTmrMLM were compared with those using two classical mapping methods (Chi-square test and logistic regression) (<xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>). The results showed that these methods had greater power at the three loci 278, 2143, and 3698, but had less power (&lt;10%) at the other four loci. Compared with the two classical mapping methods, MTOTC+FASTmrMLM had higher power at the three loci 278, 2143, and 3698, and a lower false-positive rate, when the number of hierarchical levels of HData was &#x2264;4. In the few instances where the power of the classical methods was higher, it was less than 1.5-fold that of MTOTC+FASTmrMLM, whereas their false-positive rates were 6.8&#x2013;9.5-fold higher than that of MTOTC+FASTmrMLM. In addition, the results showed that when the number of hierarchical levels was &lt;5, MTOTC+FASTmrMLM was more suitable for HData analysis than FASTmrMLM alone. Moreover, in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>, MTOTC+FASTmrMLM had a relatively higher F1 score, especially for binary data (HData with two hierarchical levels). Here, the F1 score combines precision and recall; it is used to measure the accuracy of the statistical methods while balancing power and FPR. Therefore, MTOTC is recommended for the analysis of HData with four or fewer hierarchical levels.</p>
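<p>For reference, the F1 score is the harmonic mean of precision and recall. The Python fragment below is a minimal sketch of that standard definition; the function name <monospace>f1_score</monospace> and the count-based inputs are illustrative assumptions, and how detections were tallied across replicates for Table&#xa0;1 is not restated here.</p>
<preformat>
```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall from detection counts.

    Illustrative helper (an assumption, not the authors' code):
      precision = TP / (TP + FP)   share of detected loci that are causal
      recall    = TP / (TP + FN)   share of causal loci that are detected
    """
    detected = true_positives + false_positives
    causal = true_positives + false_negatives
    if true_positives == 0 or detected == 0 or causal == 0:
        # No correct detections: define F1 as zero to avoid dividing by zero.
        return 0.0
    precision = true_positives / detected
    recall = true_positives / causal
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 3 causal loci detected, 1 false detection, 4 causal loci missed.
example_f1 = f1_score(3, 1, 4)
```
</preformat>
<p>Because a method with many false detections has low precision, its F1 score can be small even when its power (recall) is high, which is the trade-off this measure is meant to capture.</p>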
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Comparison of different genome-wide association study methods.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Number of hierarchical levels</th>
<th valign="middle" colspan="2" align="center">Locus</th>
<th valign="middle" align="center">Chi-square test</th>
<th valign="middle" align="center">Logistic regression</th>
<th valign="middle" align="center">FASTmrMLM</th>
<th valign="middle" align="center">MTOTC+<break/>FASTmrMLM</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" rowspan="6" align="center">2</td>
<td valign="middle" rowspan="3" align="center">Power (%)</td>
<td valign="middle" align="center">278</td>
<td valign="middle" align="center">66.20</td>
<td valign="middle" align="center">22.50</td>
<td valign="middle" align="center">41.85</td>
<td valign="middle" align="center">57.38</td>
</tr>
<tr>
<td valign="middle" align="center">2143</td>
<td valign="middle" align="center">57.70</td>
<td valign="middle" align="center">19.40</td>
<td valign="middle" align="center">28.62</td>
<td valign="middle" align="center">55.98</td>
</tr>
<tr>
<td valign="middle" align="center">3698</td>
<td valign="middle" align="center">18.40</td>
<td valign="middle" align="center">10.10</td>
<td valign="middle" align="center">9.89</td>
<td valign="middle" align="center">22.76</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="left">Mean of Power (%)</td>
<td valign="middle" align="center">20.87</td>
<td valign="middle" align="center">7.60</td>
<td valign="middle" align="center">13.84</td>
<td valign="middle" align="center">19.54</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="center">FPR (&#x2030;)</td>
<td valign="middle" align="center">7.27</td>
<td valign="middle" align="center">0.07</td>
<td valign="middle" align="center">0.44</td>
<td valign="middle" align="center">0.77</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="center">F1 score</td>
<td valign="middle" align="center">0.04</td>
<td valign="middle" align="center">0.13</td>
<td valign="middle" align="center">0.16</td>
<td valign="middle" align="center">0.17</td>
</tr>
<tr>
<td valign="top" rowspan="6" align="center">3</td>
<td valign="middle" rowspan="3" align="center">Power (%)</td>
<td valign="middle" align="center">278</td>
<td valign="middle" align="center">62.00</td>
<td valign="middle" align="center">71.30</td>
<td valign="middle" align="center">53.41</td>
<td valign="middle" align="center">70.87</td>
</tr>
<tr>
<td valign="middle" align="center">2143</td>
<td valign="middle" align="center">56.40</td>
<td valign="middle" align="center">57.20</td>
<td valign="middle" align="center">45.76</td>
<td valign="middle" align="center">66.64</td>
</tr>
<tr>
<td valign="middle" align="center">3698</td>
<td valign="middle" align="center">20.00</td>
<td valign="middle" align="center">26.90</td>
<td valign="middle" align="center">19.66</td>
<td valign="middle" align="center">36.82</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="left">Mean of Power (%)</td>
<td valign="middle" align="center">20.47</td>
<td valign="middle" align="center">23.30</td>
<td valign="middle" align="center">22.06</td>
<td valign="middle" align="center">26.53</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="center">FPR (&#x2030;)</td>
<td valign="middle" align="center">6.15</td>
<td valign="middle" align="center">4.77</td>
<td valign="middle" align="center">0.45</td>
<td valign="middle" align="center">0.70</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="center">F1 score</td>
<td valign="middle" align="center">0.04</td>
<td valign="middle" align="center">0.06</td>
<td valign="middle" align="center">0.24</td>
<td valign="middle" align="center">0.24</td>
</tr>
<tr>
<td valign="top" rowspan="6" align="center">4</td>
<td valign="middle" rowspan="3" align="center">Power (%)</td>
<td valign="middle" align="center">278</td>
<td valign="middle" align="center">68.40</td>
<td valign="middle" align="center">80.30</td>
<td valign="middle" align="center">65.70</td>
<td valign="middle" align="center">75.98</td>
</tr>
<tr>
<td valign="middle" align="center">2143</td>
<td valign="middle" align="center">58.50</td>
<td valign="middle" align="center">65.40</td>
<td valign="middle" align="center">58.71</td>
<td valign="middle" align="center">71.83</td>
</tr>
<tr>
<td valign="middle" align="center">3698</td>
<td valign="middle" align="center">20.80</td>
<td valign="middle" align="center">37.30</td>
<td valign="middle" align="center">27.76</td>
<td valign="middle" align="center">45.69</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="left">Mean of Power (%)</td>
<td valign="middle" align="center">21.77</td>
<td valign="middle" align="center">27.37</td>
<td valign="middle" align="center">28.29</td>
<td valign="middle" align="center">28.53</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="center">FPR (&#x2030;)</td>
<td valign="middle" align="center">8.11</td>
<td valign="middle" align="center">5.90</td>
<td valign="middle" align="center">0.45</td>
<td valign="middle" align="center">0.63</td>
</tr>
<tr>
<td valign="middle" colspan="2" align="center">F1 score</td>
<td valign="middle" align="center">0.03</td>
<td valign="middle" align="center">0.06</td>
<td valign="middle" align="center">0.29</td>
<td valign="middle" align="center">0.26</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3_1_3">
<label>3.1.3</label>
<title>The effect of the number of hierarchical levels on the new method</title>
<p>The third simulation study investigated the effect of the number of hierarchical levels on MTOTC. Based on symmetrical distribution, the number of hierarchical levels was set to 2, 3, 4, and 5, respectively, and the number of replicates was 10,000. Meanwhile, we compared the results of OData, HData and CPData using FASTmrMLM.</p>
<p>Compared with CPData from the other hierarchical levels, the distribution of CPData2 (i.e., the CPData converted from the HData of 2 hierarchical levels by MTOTC) was closer to the original data (OData). First, the frequency distribution of the CPData was closer to that of the OData when the hierarchical level was low, especially when it was equal to 2 (<xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref>). As the number of hierarchical levels increased, the peak of CPData began to shift to the right and was far from the peak of the OData, which was expected to affect the GWAS results. The frequency distribution of the OData and the corresponding CPData with different hierarchical levels in the 10th and 613th replicates, randomly selected out of the 10,000 replicates using the uniformly distributed random number generator in R, is shown in <xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref>. Second, the coefficient of variation (<italic>CV</italic>) of the OData ranged from 29.5% to 55.5%. Among the 10,000 replicates, the proportion of replicates falling outside the <italic>CV</italic> range of the OData (4.09%, 18.94%, 21.47%, and 25.37% for CPData2, CPData3, CPData4, and CPData5, respectively) increased with the number of hierarchical levels. Thus, the <italic>CV</italic> range of CPData2 was the closest to that of the OData. Third, among the 10,000 replicates, the skewness range of the CPData was closest to that of the OData at 2 hierarchical levels: the skewness of the OData ranged from &#x2212;1.00 to 0.46, and that of CPData2 ranged from &#x2212;1.28 to 0.35. As the number of hierarchical levels increased, the skewness of the CPData gradually deviated from that of the OData; the kurtosis showed the same tendency as the skewness.</p>
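<p>The three distribution summaries compared above (<italic>CV</italic>, skewness, and kurtosis) can be computed per replicate as in the following Python sketch. The formulas are assumptions chosen for illustration: the <italic>CV</italic> uses the sample standard deviation, and skewness and kurtosis are the moment-based, non-excess versions (a normal distribution has kurtosis 3); the estimators actually used are not stated in this section.</p>
<preformat>
```python
import math


def distribution_summary(values):
    """Return CV (%), skewness, and non-excess kurtosis of a sample.

    Illustrative definitions (assumed, not taken from the paper):
      CV       = 100 * sample SD / mean   (requires a non-zero mean)
      skewness = m3 / m2 ** 1.5
      kurtosis = m4 / m2 ** 2
    where m2, m3, m4 are central moments with divisor n.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    m2 = sum(d ** 2 for d in deviations) / n
    m3 = sum(d ** 3 for d in deviations) / n
    m4 = sum(d ** 4 for d in deviations) / n
    sample_sd = math.sqrt(sum(d ** 2 for d in deviations) / (n - 1))
    return {
        "cv_percent": 100 * sample_sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Toy phenotype values (illustrative only).
summary = distribution_summary([1.0, 2.0, 3.0, 4.0, 5.0])
```
</preformat>
<p>Applying such a function to the OData and to each CPData set in every replicate would give the ranges that are compared in the text.</p>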
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>The frequency distribution of the OData and the corresponding CPData for different hierarchical levels in the 10th and 613th replicates. <bold>(A&#x2013;D)</bold> show the 10th replicate; <bold>(E&#x2013;H)</bold> show the 613th replicate. CPData2, CPData3, and CPData5 were transformed by MTOTC from HData with two, three, and five hierarchical levels, respectively.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1247181-g004.tif"/>
</fig>
<p>MTOTC performed well for the estimates of QTN position under different numbers of hierarchical levels. The position estimates via MTOTC+FASTmrMLM (i.e., the position estimates of the CPData via FASTmrMLM) were unbiased at loci 278, 2143, and 3698 (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Table&#xa0;1</bold>
</xref>). Although the position estimates at loci 2054 and 8501 in CPData2, and at loci 1716 and 6178 in all the CPData were biased, the relative mean absolute deviations of their position estimates were all less than 8.96E-05. The accuracy of the estimates of QTN positions for ordinal traits was significantly improved by MTOTC when the number of hierarchical levels was less than 5, i.e., the estimates of QTN positions for the CPData were better than those for the HData when FASTmrMLM was used (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Table&#xa0;1</bold>
</xref>).</p>
<p>The effect of MTOTC on the relative power at loci 278, 2143, and 3698 was the greatest when the number of hierarchical levels was equal to 2 (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figure&#xa0;1</bold>
</xref>). Here, &#x201c;the effect of MTOTC on the relative power&#x201d; refers to the increment of the relative power of the CPData over that of the HData. The relative power of the CPData (50%~100%) was significantly higher than that of the HData (22%~88%) and was closer to the power of the OData. When the number of hierarchical levels of the CPData was less than or equal to 5, the relative power increased with the number of hierarchical levels and was significantly superior to that of the HData.</p>
<p>The false-positive rates of CPData2, CPData3, CPData4, and CPData5 via MTOTC+FASTmrMLM were 0.77&#x2030;, 0.70&#x2030;, 0.63&#x2030;, and 0.55&#x2030;, respectively.</p>
</sec>
<sec id="s3_1_4">
<label>3.1.4</label>
<title>The effect of the number of replicates on the new method</title>
<p>The fourth simulation study assessed the impact of the number of replicates on the estimates of QTN effects and positions, relative power, and false-positive rate using MTOTC+FASTmrMLM. Based on the results of CPData2 (1:1), CPData3 (1:3:1), and CPData5 (1:2:4:2:1), 10 replicate numbers were set at equal intervals from 1,000 to 10,000. The differences in results across the various numbers of replicates were not significant at any locus or for any number of hierarchical levels (CPData2, CPData3, and CPData5) (<xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5</bold>
</xref>). This indicated that the number of replicates did not affect the power, the false-positive rate, or the estimates of QTN effects and positions. Therefore, 1,000 replicates were used in subsequent simulation studies.</p>
<fig id="f5" position="float">
<label>Figure&#xa0;5</label>
<caption>
<p>The impact of the number of replicates in the simulation experiment on the association analysis results of CPData (locus 2143). <bold>(A, B)</bold> MSE and MAD of the QTN effect at locus 2143, respectively; <bold>(C)</bold> false-positive rates; <bold>(D)</bold> relative power.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1247181-g005.tif"/>
</fig>
</sec>
<sec id="s3_1_5">
<label>3.1.5</label>
<title>The effect of distribution proportion skewness on the new method</title>
<p>In the fifth simulation study, we investigated the effect of distribution proportion skewness on the new method under three hierarchical levels. Here, the distribution proportions were set as a symmetrical distribution (distribution proportion, 1:2:1), a uniform distribution (1:1:1), and a skewed distribution (4:2:1). The indicators were the relative power, the false-positive rate, and the estimates of QTN effects and positions. The skewed distribution had the lowest relative power at loci 278, 2143, and 3698, followed by the uniform distribution and then the symmetrical distribution (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figure&#xa0;2</bold>
</xref>). The MAD and mean squared error (MSE) of QTN position estimates indicated unbiasedness under all three distribution proportions. The skewed distribution had a slightly higher false-positive rate (7.09&#x2030;) than the symmetrical distribution (6.71&#x2030;) and the uniform distribution (6.82&#x2030;). When the kurtosis values of the CPData under the three distributions were compared with those of the OData, the steepness of the CPData under 1:2:1 was closer to that of the OData (the kurtosis values for the OData, 1:2:1 CPData, 1:1:1 CPData, and 4:2:1 CPData were in the range of 2.163&#x2013;5.415, 1.963&#x2013;5.412, 1.958&#x2013;5.196, and 1.980&#x2013;3.830, respectively). The CPData under 1:2:1 and 1:1:1 were relatively close to the OData in terms of skewness (the skewness of the OData, 1:2:1 CPData, 1:1:1 CPData, and 4:2:1 CPData was in the range of &#x2212;1.001~0.462, &#x2212;1.466~0.319, &#x2212;1.256~0.282, and &#x2212;0.812~0.777, respectively). The skewness of the CPData under 4:2:1 differed markedly from that of the OData. Therefore, the accuracy of MTOTC+FASTmrMLM was higher under the symmetrical distribution than under the uniform and skewed distributions.</p>
</sec>
<sec id="s3_1_6">
<label>3.1.6</label>
<title>The effect of distribution proportion kurtosis on the new method</title>
<p>Here we studied the effect of distribution proportion kurtosis on the new method. The proportions were set as 1:2:1, 1:4:1, and 1:5:1. The association detection results were the best under the 1:2:1 proportion; e.g., the relative powers at loci 2143, 278, 3698, and 1716 via MTOTC+FASTmrMLM were higher under the 1:2:1 proportion than under the other distribution proportions (<xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6A</bold>
</xref>). The MSE and MAD of effect estimates at loci 278, 2143, and 3698 were lower at 1:2:1 than at 1:4:1 and 1:5:1; however, the differences were not significant (<xref ref-type="fig" rid="f6">
<bold>Figures&#xa0;6B, C</bold>
</xref>), while the trends at the other loci were unclear. Under the three distribution proportions, the MSE and MAD of QTN position estimates all indicated unbiasedness at loci 278, 2143, 2054, and 3698. However, a lower false-positive rate was observed with the 1:2:1 distribution proportion (<xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6D</bold>
</xref>). Moreover, the steepness of the CPData under the 1:2:1 distribution proportion was closer to that of the OData (the kurtosis values of the OData, 1:2:1 CPData, 1:4:1 CPData, and 1:5:1 CPData were 2.163~5.415, 1.963~5.412, 1.967~7.343, and 1.974~7.920, respectively). The skewness showed the same tendency as the kurtosis (the skewness ranges of the OData, 1:2:1 CPData, 1:4:1 CPData, and 1:5:1 CPData were &#x2212;1.001~0.462, &#x2212;1.466~0.319, &#x2212;1.788~0.142, and &#x2212;1.796~0.150, respectively). In summary, the distribution of the CPData at the 1:2:1 proportion was closer to that of the OData, and MTOTC worked better at this proportion than at the other distribution proportions.</p>
<fig id="f6" position="float">
<label>Figure&#xa0;6</label>
<caption>
<p>The effect of phenotype distribution kurtosis on the association detection results of MTOTC+FASTmrMLM. <bold>(A, B)</bold> MSE and MAD of QTN effect at 2143, respectively; <bold>(C)</bold> false-positive rates; <bold>(D)</bold> relative power.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1247181-g006.tif"/>
</fig>
</sec>
<sec id="s3_1_7">
<label>3.1.7</label>
<title>The performance of MTOTC with different GWAS methods</title>
<p>The HData of ordinal traits were transformed by MTOTC, and the obtained CPData were found to be suitable for association analysis via FASTmrMLM when there were five or fewer hierarchical levels, owing to high power. Meanwhile, similar results were obtained when MTOTC was combined with other methods in the mrMLM software (<xref ref-type="bibr" rid="B30">Zhang et&#xa0;al., 2020</xref>) (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figure&#xa0;1</bold>
</xref>; <xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Table&#xa0;1</bold>
</xref>). These methods were also suitable for GWAS for the CPData of ordinal traits, showing high relative power, low false-positive rates, and high accuracy of position and effect estimates. Moreover, trends similar to those from FASTmrMLM were observed in the simulation experiments with respect to the number of hierarchical levels and their distribution proportions (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figure&#xa0;2</bold>
</xref>). MTOTC+FASTmrMLM had the best performance, followed by mrMLM (<xref ref-type="bibr" rid="B21">Wang et&#xa0;al., 2016</xref>), ISIS EM-BLASSO (<xref ref-type="bibr" rid="B18">Tamba et&#xa0;al., 2017</xref>), and FASTmrEMMA (<xref ref-type="bibr" rid="B24">Wen et&#xa0;al., 2018</xref>), and finally by pLARmEB (<xref ref-type="bibr" rid="B28">Zhang et&#xa0;al., 2017</xref>) and pKWmEB (<xref ref-type="bibr" rid="B13">Ren et&#xa0;al., 2018</xref>). Therefore, MTOTC can be integrated with different methods to conduct GWAS for ordinal traits. Considering the diversity and complexity of phenotypic data for ordinal traits in practice, multiple methods might be used simultaneously in a complementary manner. Accordingly, MTOTC improves the performance in identifying significant loci for ordinal traits.</p>
</sec>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Real data analysis</title>
<p>To validate the new method, the salt-alkali tolerance data for 286 soybean accessions obtained in 2009 and 2010 from <xref ref-type="bibr" rid="B29">Zhang et&#xa0;al. (2014)</xref> were re-analyzed in this study. The experiments were conducted in a completely randomized design, and the number of high-quality SNP markers in this population was 54,296 (<xref ref-type="bibr" rid="B31">Zhou et&#xa0;al., 2015</xref>). First, MTOTC was applied to obtain the CPData. Then, the index data, HData5 [hierarchical data generated from the index data by 1:1:1:1:1 (<xref ref-type="bibr" rid="B14">Shao, 1986</xref>)], CPData2 (continuous phenotypic data generated from HData2 by MTOTC), and CPData5 (continuous phenotypic data generated from HData5 by MTOTC) for salt-alkali tolerance in soybean were analyzed using the mrMLM, ISIS EM-BLASSO, pLARmEB, FASTmrEMMA, pKWmEB, and FASTmrMLM methods.</p>
<sec id="s3_2_1">
<label>3.2.1</label>
<title>QTNs significantly associated with soybean salt-alkali tolerance</title>
<p>For the four types of phenotypic data of salt-alkali tolerance, a greater number of significant QTNs were detected in CPData than in the index data or HData. Six GWAS methods mapped 65 and 99 QTNs in CPData2 and CPData5 of salt tolerance traits, respectively, and 134 and 60 QTNs in CPData2 and CPData5 of alkali tolerance traits, respectively. pLARmEB detected a greater number of QTNs in CPData (116 for salt tolerance traits and 166 for alkali tolerance traits) compared with the other five GWAS methods, which may be related to its relatively higher false-positive rate. Additionally, the numbers of significant QTNs detected by pKWmEB, mrMLM, and FASTmrMLM in CPData (44, 25, and 14 for the salt tolerance trait and 25, 21, and 19 for the alkali-tolerance trait, respectively) were second only to the number of QTNs detected with pLARmEB.</p>
<p>Four QTNs (locus 9682 on chromosome 2 [Chr2-9682], Chr11-54042, Chr13-64738, and Chr13-65248) for salt tolerance were simultaneously detected in the index data and in at least one CPData; however, none of them was detected in HData5. For instance, Chr13-64738 was simultaneously detected in CPData2 by five methods and in the salt tolerance index data by two methods. Chr13-65248 was detected in CPData5 by four methods and in both CPData5 and the index data by FASTmrMLM. Three QTNs (Chr7-34669, Chr13-67342, and Chr20-105040) for alkali tolerance were simultaneously detected in the index data and in at least one CPData; two of them were also detected in HData5.</p>
<p>The results of the six GWAS methods for the CPData of salt-alkali tolerance showed that only a few significant QTNs were shared between 2009 and 2010, which can be explained by the differences in environmental influences between the two years. For salt tolerance, no QTNs were found to overlap between 2009 and 2010 by any of the six methods. For alkali tolerance, only Chr1-5051 and Chr16-82333 were detected in both years. There was indeed an environmental (year) effect according to variance analysis of the phenotypic results for the two years (<xref ref-type="bibr" rid="B29">Zhang et&#xa0;al., 2014</xref>).</p>
</sec>
<sec id="s3_2_2">
<label>3.2.2</label>
<title>Candidate genes for salt-alkali tolerance</title>
<p>Potential candidate genes were mined from 100 kb upstream to 100 kb downstream (<xref ref-type="bibr" rid="B9">Liu et&#xa0;al., 2020</xref>) of significant QTNs that were detected in at least two types of data or by two methods (<xref ref-type="table" rid="T2">
<bold>Tables&#xa0;2</bold>
</xref> and <xref ref-type="table" rid="T3">
<bold>3</bold>
</xref>). Functional annotation information in the SoyBase database (<ext-link ext-link-type="uri" xlink:href="http://www.Soybase.org/">http://www.Soybase.org/</ext-link>) was also used to screen candidate genes. A total of 34 potential candidate genes for salt tolerance and 25 potential candidate genes for alkali tolerance were mined.</p>
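<p>The window-based mining step can be sketched in Python as follows. The function name <monospace>genes_near_qtn</monospace>, the gene labels, and the gene coordinates are hypothetical and serve only to illustrate the 100 kb flanking rule; counting a gene that only partially overlaps the window is also an assumption. The QTN position is taken from Table&#xa0;2.</p>
<preformat>
```python
def genes_near_qtn(qtn_position, genes, flank=100_000):
    """Return names of genes overlapping the window around a QTN.

    Hypothetical helper: `genes` is a list of (name, start, end) tuples on
    the same chromosome as the QTN, and the window spans `flank` base pairs
    upstream and downstream of the QTN position.
    """
    window_start = qtn_position - flank
    window_end = qtn_position + flank
    return [
        name
        for name, start, end in genes
        # A gene is kept if any part of it lies inside the window.
        if end >= window_start and window_end >= start
    ]


# Made-up gene models for illustration; only the QTN position is from Table 2.
toy_genes = [
    ("geneA", 43_700_000, 43_710_000),  # overlaps the upstream edge
    ("geneB", 43_900_000, 43_905_000),  # overlaps the downstream edge
    ("geneC", 43_950_000, 43_960_000),  # beyond the window
]
nearby = genes_near_qtn(43_804_331, toy_genes)
```
</preformat>
<p>In this toy example, the first two genes fall within the window and the third does not; in practice the retained genes would then be filtered by functional annotation, as described above.</p>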
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>Salt stress-related candidate genes from six genome-wide association study methods.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Candidate genes</th>
<th valign="middle" align="center">QTN positions</th>
<th valign="middle" align="center">Methods</th>
<th valign="middle" align="center">Functional annotation</th>
<th valign="middle" align="center"><italic>Arabidopsis</italic> homolog</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">
<italic>Glyma02g38320</italic>
</td>
<td valign="top" align="left">43804331</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT5G22900</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma02g38350</italic>
</td>
<td valign="top" align="left">43804331</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">Pentatricopeptide repeat (PPR -like) superfamily protein</td>
<td valign="top" align="left">AT5G37570</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma02g38370</italic>
</td>
<td valign="top" align="left">43804331</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT2G40770</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma02g38380</italic>
</td>
<td valign="top" align="left">43804331</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">catalytic activity</td>
<td valign="top" align="left">AT5G05200</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma02g38395</italic>
</td>
<td valign="top" align="left">43804331</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">respiratory burst involved in defense response</td>
<td valign="top" align="left">AT5G05190</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma04g13670</italic>
</td>
<td valign="top" align="left">13441084</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, mrMLM<sup>3**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">oxidoreductase activity</td>
<td valign="top" align="left">AT4G25240</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma05g25331 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">31519270</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">WRKY DNA-binding domain</td>
<td valign="top" align="left">AT2G34830</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma05g25420 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">31519270</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT5G37930</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma05g25450 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">31519270</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">catalytic activity</td>
<td valign="top" align="left">AT5G44440</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma05g25460 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">31519270</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">catalytic activity</td>
<td valign="top" align="left">AT2G34790</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma08g13260</italic>
</td>
<td valign="top" align="left">9687628</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>3**</sup>, ISIS EM-BLASSO<sup>3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">Serine/threonine protein kinase</td>
<td valign="top" align="left">AT3G16030</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma10g40400 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">47864560</td>
<td valign="top" align="left">FASTmrMLM<sup>2*</sup>, ISIS EM-BLASSO<sup>2*</sup>, mrMLM<sup>2*</sup>, pKWmEB<sup>2*</sup>, pLARmEB<sup>2*</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT5G67450</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma10g40510 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">47864560</td>
<td valign="top" align="left">FASTmrMLM<sup>2*</sup>, ISIS EM-BLASSO<sup>2*</sup>, mrMLM<sup>2*</sup>, pKWmEB<sup>2*</sup>, pLARmEB<sup>2*</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT4G15090</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma10g40520 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">47864560</td>
<td valign="top" align="left">FASTmrMLM<sup>2*</sup>, ISIS EM-BLASSO<sup>2*</sup>, mrMLM<sup>2*</sup>, pKWmEB<sup>2*</sup>, pLARmEB<sup>2*</sup>
</td>
<td valign="top" align="left">oxidoreductase activity</td>
<td valign="top" align="left">AT4G33910</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14030 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">protein serine/threonine kinase activity</td>
<td valign="top" align="left">AT3G20830</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14040 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT1G51190</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14050 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT1G51200</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14081 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">catalytic activity</td>
<td valign="top" align="left">AT3G18080</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14090 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT3G20870</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14100 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT1G51220</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma11g14110 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">10094063</td>
<td valign="top" align="left">mrMLM<sup>1**</sup>, pKWmEB<sup>1**</sup>, pLARmEB<sup>3**</sup>
</td>
<td valign="top" align="left">Zinc finger, C3HC4 type (RING finger)</td>
<td valign="top" align="left">AT3G63530</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma12g03490</italic>
</td>
<td valign="top" align="left">2356018</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, FASTmrMLM<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">transmembrane transporter</td>
<td valign="top" align="left">AT2G21050</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma12g03570</italic>
</td>
<td valign="top" align="left">2356018</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, FASTmrMLM<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">catalytic activity</td>
<td valign="top" align="left">AT4G34980</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma12g03580</italic>
</td>
<td valign="top" align="left">2356018</td>
<td valign="top" align="left">FASTmrEMMA<sup>3*</sup>, FASTmrMLM<sup>3*</sup>, ISIS EM-BLASSO<sup>3*</sup>, mrMLM<sup>3*</sup>, pKWmEB<sup>3*</sup>, pLARmEB<sup>3*</sup>
</td>
<td valign="top" align="left">transmembrane transporter</td>
<td valign="top" align="left">AT5G09220</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g25266 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">28469311</td>
<td valign="top" align="left">FASTmrEMMA<sup>2**</sup>, FASTmrMLM<sup>1,2**</sup>, ISIS EM-BLASSO<sup>2**</sup>, pKWmEB<sup>2**</sup>, pLARmEB<sup>1,2**</sup>
</td>
<td valign="top" align="left">hyperosmotic salinity response</td>
<td valign="top" align="left">AT1G61120</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27630 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">protein serine/threonine kinase activity</td>
<td valign="top" align="left">AT3G20530</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27680 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT1G61800</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27691 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT4G14220</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27701 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">response to oxidative stress</td>
<td valign="top" align="left">AT3G06050</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27710 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">response to oxidative stress</td>
<td valign="top" align="left">AT3G06050</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27740 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">oxidoreductase activity</td>
<td valign="top" align="left">AT3G06060</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g27770 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">30845044</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT1G54830</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma15g42440</italic>
</td>
<td valign="top" align="left">49869431</td>
<td valign="top" align="left">FASTmrEMMA<sup>2*</sup>, mrMLM<sup>2*</sup>, ISIS EM-BLASSO<sup>2*</sup>, pKWmEB<sup>2*</sup>, pLARmEB<sup>2*</sup>
</td>
<td valign="top" align="left">Myb-like DNA-binding domain</td>
<td valign="top" align="left">AT2G44430</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma15g42460</italic>
</td>
<td valign="top" align="left">49869431</td>
<td valign="top" align="left">FASTmrEMMA<sup>2*</sup>, mrMLM<sup>2*</sup>, ISIS EM-BLASSO<sup>2*</sup>, pKWmEB<sup>2*</sup>, pLARmEB<sup>2*</sup>
</td>
<td valign="top" align="left">Serine/threonine protein kinase</td>
<td valign="top" align="left">AT2G32850</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>1: index data; 2: continuous phenotypic data (CPData2) generated from HData2 by MTOTC; 3: continuous phenotypic data (CPData5) generated from HData5 by MTOTC; *: 2009; **: 2010; <sup>#</sup>: candidate genes were further screened by haplotype block analysis.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Alkali stress-related candidate genes from six genome-wide association study methods.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Candidate genes</th>
<th valign="middle" align="center">QTN positions</th>
<th valign="middle" align="center">Methods</th>
<th valign="middle" align="center">Functional annotation</th>
<th valign="middle" align="center"><italic>Arabidopsis</italic> homolog</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">
<italic>Glyma01g41510 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">53035914</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">Protein serine/threonine kinase activity</td>
<td valign="top" align="left">AT5G60900</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma01g41520 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">53035914</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT4G17500</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma01g41527 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">53035914</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT5G47230</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma01g41560 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">53035914</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT5G53110</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma01g41581 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">53035914</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT5G47370</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma01g41610 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">53035914</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT3G13540</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma03g28210 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">36121029</td>
<td valign="top" align="left">FASTmrEMMA<sup>2**</sup>, pLARmEB<sup>2**</sup>
</td>
<td valign="top" align="left">F-box family protein</td>
<td valign="top" align="left">AT2G32560</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma03g28222 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">36121029</td>
<td valign="top" align="left">FASTmrEMMA<sup>2**</sup>, pLARmEB<sup>2**</sup>
</td>
<td valign="top" align="left">F-box family protein</td>
<td valign="top" align="left">AT2G26850</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma03g28234 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">36121029</td>
<td valign="top" align="left">FASTmrEMMA<sup>2**</sup>, pLARmEB<sup>2**</sup>
</td>
<td valign="top" align="left">F-box family protein</td>
<td valign="top" align="left">AT2G32560</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma03g28247 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">36121029</td>
<td valign="top" align="left">FASTmrEMMA<sup>2**</sup>, pLARmEB<sup>2**</sup>
</td>
<td valign="top" align="left">F-box family protein</td>
<td valign="top" align="left">AT2G26850</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma07g20380 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">20580766</td>
<td valign="top" align="left">FASTmrEMMA<sup>3**</sup>, FASTmrMLM<sup>1,3**</sup>, ISIS EM-BLASSO<sup>3**</sup>, mrMLM<sup>3**</sup>, pKWmEB<sup>3**</sup>, pLARmEB<sup>1,3**</sup>
</td>
<td valign="top" align="left">Pentatricopeptide repeat (PPR) superfamily protein</td>
<td valign="top" align="left">AT3G48810</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g44560 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">43999096</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pLARmEB<sup>1,3*</sup>, pKWmEB<sup>3*</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT3G19640</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g44570 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">43999096</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pLARmEB<sup>1,3*</sup>, pKWmEB<sup>3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT4G37850</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g44582 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">43999096</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pLARmEB<sup>1,3*</sup>, pKWmEB<sup>3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT2G22760</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g44594 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">43999096</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pLARmEB<sup>1,3*</sup>, pKWmEB<sup>3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT4G37850</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g44640 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">43999096</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pLARmEB<sup>1,3*</sup>, pKWmEB<sup>3*</sup>
</td>
<td valign="top" align="left">Serine/threonine-protein kinase PBS1</td>
<td valign="top" align="left">AT1G80640</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma13g44660 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">43999096</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pLARmEB<sup>1,3*</sup>, pKWmEB<sup>3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT5G25190</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma16g25280 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">29252235</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">sequence-specific DNA binding transcription factor activity</td>
<td valign="top" align="left">AT2G18350</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma16g25310 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">29252235</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT1G75220</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma16g25320 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">29252235</td>
<td valign="top" align="left">pLARmEB<sup>2,3*</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT1G75220</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma19g39270</italic>
</td>
<td valign="top" align="left">46014852</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pKWmEB<sup>1*</sup>, pLARmEB<sup>1*</sup>
</td>
<td valign="top" align="left">response to oxidative stress</td>
<td valign="top" align="left">AT4G11290</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma19g39320</italic>
</td>
<td valign="top" align="left">46014852</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pKWmEB<sup>1*</sup>, pLARmEB<sup>1*</sup>
</td>
<td valign="top" align="left">oxidoreductase activity</td>
<td valign="top" align="left">AT4G03140</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma19g39340</italic>
</td>
<td valign="top" align="left">46014852</td>
<td valign="top" align="left">FASTmrMLM<sup>1*</sup>, pKWmEB<sup>1*</sup>, pLARmEB<sup>1*</sup>
</td>
<td valign="top" align="left">Regulation of transcription</td>
<td valign="top" align="left">AT5G62000</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma20g31790 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">40400845</td>
<td valign="top" align="left">pLARmEB<sup>1,2*</sup>
</td>
<td valign="top" align="left">zinc ion binding</td>
<td valign="top" align="left">AT3G52300</td>
</tr>
<tr>
<td valign="top" align="left">
<italic>Glyma20g31800 <sup>#</sup>
</italic>
</td>
<td valign="top" align="left">40400845</td>
<td valign="top" align="left">pLARmEB<sup>1,2*</sup>
</td>
<td valign="top" align="left">transmembrane transport</td>
<td valign="top" align="left">AT2G35800</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>1: index data; 2: continuous phenotypic data (CPData2) generated from HData2 by MTOTC; 3: continuous phenotypic data (CPData5) generated from HData5 by MTOTC; *: 2009; **: 2010; <sup>#</sup>: candidate genes were further screened by haplotype block analysis.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>For salt tolerance, 19 candidate genes were detected in both the index data and CPData5. Among them, <italic>Glyma05g25331</italic>, <italic>Glyma05g25420</italic>, <italic>Glyma05g25450</italic>, and <italic>Glyma05g25460</italic> were each detected by five GWAS methods in CPData5 in 2009. Only one gene, <italic>Glyma13g25266</italic>, was detected in both the index data and CPData2; in 2010 it was detected by five GWAS methods in CPData2 and by two methods in the index data. In addition, five candidate genes were detected only in CPData2 by five methods, and nine candidate genes were detected only in CPData5 by three or more methods. No overlapping genes were found between HData5 and either the index data or the CPData (<xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>).</p>
<p>For alkali tolerance, seven candidate genes were detected concurrently in the index data and CPData5. For instance, <italic>Glyma07g20380</italic> was detected by two, one, and six GWAS methods in the index data, HData5, and CPData5, respectively, in 2010 (<xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref>); that is, all six association analysis methods detected it in CPData5. Two candidate genes were detected in both the index data and CPData2, and ten candidate genes were detected in both CPData2 and CPData5. <italic>Glyma10g02920</italic> was detected by one GWAS method in CPData2 and by five GWAS methods in CPData5 in 2009.</p>
</sec>
<sec id="s3_2_3">
<label>3.2.3</label>
<title>QTN-based haplotype and phenotypic difference analysis</title>
<p>Haploview software was used to perform haplotype block analysis on the above 34 salt stress-related and 25 alkali stress-related candidate genes, and the phenotypic differences across haplotypes were examined using the <italic>t</italic>-test in SAS 9.4. Four stable QTNs for salt tolerance and six stable QTNs for alkali tolerance were found to form haplotype blocks based on linkage disequilibrium (<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Figures&#xa0;3</bold>
</xref> and <xref ref-type="supplementary-material" rid="SM1">
<bold>4</bold>
</xref>).</p>
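<p>The pairwise haplotype comparison described above can be sketched as follows. This is an illustrative Python sketch only, not the SAS 9.4 procedure used in this study: it assumes a pooled-variance two-sample <italic>t</italic> statistic, the function names <monospace>pooled_t</monospace> and <monospace>pairwise_haplotype_tests</monospace> are hypothetical, and the phenotype values are toy data rather than soybean measurements. <italic>P</italic>-values are omitted because they require the <italic>t</italic> distribution function, which a statistics package would supply.</p>
<preformat>
```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample Student t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def pairwise_haplotype_tests(phenotype_by_haplotype):
    """Compare every pair of haplotypes; returns {(hap1, hap2): (t, df)}."""
    results = {}
    for h1, h2 in combinations(sorted(phenotype_by_haplotype), 2):
        results[(h1, h2)] = pooled_t(phenotype_by_haplotype[h1],
                                     phenotype_by_haplotype[h2])
    return results


# Hypothetical tolerance values for accessions grouped by haplotype (toy data).
toy = {
    "H1": [0.42, 0.51, 0.47, 0.55],
    "H2": [0.61, 0.66, 0.58, 0.70],
    "H3": [0.49, 0.53, 0.60],
}
for pair, (t, df) in pairwise_haplotype_tests(toy).items():
    print(pair, round(t, 3), df)
```
</preformat>
<p>In such a sketch, each haplotype pair within a block is tested separately in each year, which matches the way the per-pair, per-year <italic>P</italic>-values are reported here.</p>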
<p>In the haplotype block containing the significant salt-tolerance QTN Chr13-64738, the <italic>t</italic>-test showed significant phenotypic differences between haplotypes ACAT and AATT (<italic>P</italic>=0.0341 in 2009 and <italic>P</italic>=0.0083 in 2010), between haplotypes TCAT and AATT (<italic>P</italic>=0.0091 in 2010), and between haplotypes TCAT and TCTT (<italic>P</italic>=0.0471 in 2010). For the haplotype blocks of the other salt-tolerance QTNs, however, significant phenotypic differences between haplotypes were found in only a single year. The haplotype pairs with significant differences were AGTGC and TACCC (<italic>P</italic>=0.0348) and AGTGC and TGTCA (<italic>P</italic>=0.0345) for Chr5-24153; GCG and ATA (<italic>P</italic>=0.0408) for Chr10-52140; and GTAGA and GTAGT (<italic>P</italic>=0.0397) and GTAGT and AAGTT (<italic>P</italic>=0.0540) for Chr11-54042.</p>
<p>Two significant QTNs for alkali tolerance, Chr16-82333 and Chr3-14262, showed significant phenotypic differences across haplotypes in both years. For Chr16-82333, significant differences were recorded between haplotypes CTGACG and CCGGAG (<italic>P</italic>=0.0158 in 2009, <italic>P</italic>=0.0614 in 2010), between haplotypes CTGACG and CCGGAG (<italic>P</italic>=0.0005 in 2009), between haplotypes CTGACG and CCGAAG (<italic>P</italic>=0.0231 in 2009), between haplotypes TCGAAG and CCGAAG (<italic>P</italic>=0.0619 in 2009, <italic>P</italic>=0.0261 in 2010), and between haplotypes CCAAAG and CCGGAG (<italic>P</italic>=0.0296 in 2010). For Chr3-14262, the haplotype pairs with significant differences were TTT and TCT (<italic>P</italic>=0.0217 in 2009, <italic>P</italic>=0.0085 in 2010), TTT and GCT (<italic>P</italic>=0.0102 in 2010), and GCT and TCT (<italic>P</italic>=0.0171). The other haplotype blocks for alkali tolerance showed significant phenotypic differences between haplotypes in only a single year: GTGT and TTAT (<italic>P</italic>&lt;0.0001), TTGT and TTAC (<italic>P</italic>=0.0038), and TTAT and TAGT (<italic>P</italic>=0.0132) for Chr13-67342; CAG and TGT (<italic>P</italic>=0.0183) for Chr1-5051; ATCG and GATC (<italic>P</italic>=0.0009) for Chr7-34669; and TAGGCG and AATGCA (<italic>P</italic>=0.0157) and TAGGCG and TATGCG (<italic>P</italic>=0.0128) for Chr20-105040.</p>
<p>Genes with significant phenotypic differences across haplotypes were considered candidate genes (<xref ref-type="table" rid="T2">
<bold>Tables&#xa0;2</bold>
</xref> and <xref ref-type="table" rid="T3">
<bold>3</bold>
</xref>), comprising 22 salt stress-related candidate genes and 22 alkali stress-related candidate genes. Among them, six salt stress-related candidate genes (<italic>Glyma05g25420</italic>, <italic>Glyma11g14030</italic>, <italic>Glyma11g14040</italic>, <italic>Glyma11g14050</italic>, <italic>Glyma13g27691</italic>, <italic>Glyma13g27701</italic>) and six alkali stress-related candidate genes (<italic>Glyma03g28222</italic>, <italic>Glyma03g28234</italic>, <italic>Glyma03g28247</italic>, <italic>Glyma16g25320</italic>, <italic>Glyma20g31790</italic>, <italic>Glyma20g31800</italic>) were located within the haplotype blocks.</p>
</sec>
</sec>
</sec>
<sec id="s4" sec-type="discussion">
<label>4</label>
<title>Discussion</title>
<p>In this study, we established a method for transforming ordinal phenotypes into continuous phenotypes (MTOTC) based on hierarchical data for ordinal trait phenotypes and molecular marker data in resource populations. Accordingly, association analysis for ordinal traits proceeds in two steps: first, MTOTC is used to transform HData into continuous phenotypic data (CPData); then, a C-GWAS method (i.e., a GWAS method for continuous quantitative traits) is selected to analyze the CPData and identify the QTNs that are significantly associated with the ordinal trait.</p>
<p>In this study, simulation experiments and the analysis of soybean saline-alkali tolerance indicated that the new method, MTOTC, is suitable for ordinal traits with fewer than five hierarchical levels. Moreover, the combination of MTOTC with any one of the proposed C-GWAS methods exhibited high power, a low false-positive rate, and low bias in estimating the positions and effects of QTNs. The purpose of MTOTC is to provide a different approach to GWAS for ordinal traits. The feasibility of MTOTC was verified in a real data analysis of soybean salt-alkali tolerance using 286 soybean accessions. Compared with HData5 (i.e., the data classified into five hierarchical levels), a greater number of significant QTNs was detected concurrently by at least two GWAS methods or in two years in the CPData, and more candidate genes for salt and alkali stress were screened in the CPData for the salt and alkali tolerance traits. A greater number of QTNs was detected simultaneously by multiple GWAS methods in the CPData than in the index data and HData for salt-alkali tolerance: for the three types of data, respectively, the number of QTNs detected simultaneously was 4, 1, and 1 for salt tolerance and 5, 2, and 3 for alkali tolerance. When the phenotype distribution of the CPData generated by the new method was closer to that of the index data for salt-alkali tolerance, the GWAS results were better, and a greater number of candidate genes could be mined. This may be beneficial for selecting an appropriate distribution proportion to obtain hierarchical data for ordinal traits, screening stable QTNs, and promoting the development of molecular breeding.</p>
<p>We also applied a symmetric distribution (1:2:4:2:1) to generate HData5 from the salt tolerance index data and used MTOTC to generate the corresponding CPData5. The phenotype distribution of CPData5 under the symmetric proportion 1:2:4:2:1 deviated substantially from that of the index data, whereas the phenotype distribution of CPData5 under the uniform proportion 1:1:1:1:1 was closer to that of the index data. Across the six methods, no overlapping QTNs were found between CPData5 and the index data for salt tolerance under the symmetric proportion, which was far inferior to the result under the uniform proportion 1:1:1:1:1, for which three coincident QTNs were detected in both CPData5 and the index data. This result is consistent with the results presented in simulation study 5.</p>
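<p>To make the role of the distribution proportion concrete, the following Python sketch shows one way in which continuous index values could be partitioned into five ordinal levels under a uniform (1:1:1:1:1) or symmetric (1:2:4:2:1) proportion. It is a minimal illustration under stated assumptions, not the procedure used in this study: levels are assigned by rank, larger index values are assumed to map to higher levels, the function name <monospace>to_levels</monospace> is hypothetical, and the index values are toy data.</p>
<preformat>
```python
from collections import Counter


def to_levels(values, proportions):
    """Assign ordinal levels 1..k by rank so that level sizes follow `proportions`."""
    n = len(values)
    total = sum(proportions)
    # Cumulative rank cut points: the last one always equals n.
    cuts, running = [], 0
    for p in proportions:
        running += p
        cuts.append(round(n * running / total))
    # Indices of the accessions ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    levels = [0] * n
    level = 0
    for rank, idx in enumerate(order):
        # Move to the next level once the rank passes the current cut point.
        while rank >= cuts[level]:
            level += 1
        levels[idx] = level + 1
    return levels


# Hypothetical index values for ten accessions (toy data).
index_values = [0.50, 0.30, 0.90, 0.10, 0.70, 0.05, 0.80, 0.20, 0.60, 0.40]
uniform = to_levels(index_values, (1, 1, 1, 1, 1))
symmetric = to_levels(index_values, (1, 2, 4, 2, 1))
print(sorted(Counter(uniform).items()))
print(sorted(Counter(symmetric).items()))
```
</preformat>
<p>The two calls differ only in the proportion vector, so the same index data yield level frequencies of either equal size or a peaked, symmetric shape; this is the contrast between the two versions of HData5 compared above.</p>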
<p>MTOTC performed well in the initial SNP screening. After preliminary screening at a threshold of <italic>P</italic> &#x2264; 0.05, a large number of SNPs that were not significantly associated with the trait could be eliminated, while the simulation experiments showed that the retention rate of trait-associated loci remained high. MTOTC thus simplifies the model and saves a substantial amount of computing time in subsequent association studies.</p>
<p>MTOTC helps to improve association analyses of ordinal traits. In terms of the coefficient of variation, skewness, kurtosis, and frequency distribution, the results obtained for the CPData were closer to those of the OData than the results for the HData were. Meanwhile, the results of the six GWAS methods showed that the statistical power, false-positive rate, and position estimates were better in the CPData than in the HData. Moreover, MTOTC performed better when the frequency distribution of the CPData was close to that of the OData.</p>
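<p>The distributional summaries named above can be computed as in the following Python sketch, which is illustrative only. It assumes moment-based (population) estimators of skewness and excess kurtosis and a sample standard deviation for the coefficient of variation; statistical packages such as SAS typically report bias-corrected versions, so the values may differ slightly. The function name <monospace>describe</monospace> is hypothetical, and the data must have a non-zero mean and non-zero variance.</p>
<preformat>
```python
from math import sqrt
from statistics import mean


def describe(x):
    """Return the coefficient of variation, skewness, and excess kurtosis of x."""
    n = len(x)
    m = mean(x)
    # Central moments (population form, divided by n).
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    sample_sd = sqrt(m2 * n / (n - 1))
    return {
        "cv": sample_sd / m,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,
    }


# Toy phenotype vector; in practice this would be OData, HData, or CPData.
print(describe([1, 2, 3, 4, 5]))
```
</preformat>
<p>Applying the same function to the OData, HData, and CPData for a trait gives three sets of summaries that can be compared directly.</p>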
<p>MTOTC is better suited to traits with fewer hierarchical levels. Regarding the relative power in the CPData under different numbers of hierarchical levels, relative power tended to increase with the number of hierarchical levels for all six methods when there were four or fewer levels. When there were five hierarchical levels, the power of MTOTC+FASTmrMLM was close to that of FASTmrMLM in the HData but slightly lower than that of logistic regression, and only three GWAS methods had higher relative power in the CPData than in the HData. In addition, MTOTC tended to increase variation, especially as the number of hierarchical levels increased. This indicates that MTOTC is more suitable for ordinal traits with fewer hierarchical levels, especially those with two or three levels. Among the six GWAS methods, FASTmrEMMA, FASTmrMLM, and mrMLM performed significantly better when combined with MTOTC. This is partly because the distribution and parameter estimation principles used in MTOTC are relatively consistent with those in these three GWAS models.</p>
<p>This study will contribute to further research on association analysis of ordinal traits, particularly in improving the retention rate of small-effect loci in preliminary screening, reducing the impact on variability when transforming ordinal phenotypes into continuous phenotypes, and developing novel methods for association analyses of ordinal traits.</p>
</sec>
<sec id="s5" sec-type="data-availability">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SM1"><bold>Supplementary Material</bold></xref>. Further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6" sec-type="author-contributions">
<title>Author contributions</title>
<p>MY, JF, and YW designed the methodologies. MY, JF, JZ, and JCZ drafted the manuscript, conducted simulation studies, and analyzed the data. TZ and JF revised the paper. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec id="s7" sec-type="funding-information">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by the Major National Agricultural Science and Technology Projects of China (2022ZD0400704), the National Key R&amp;D Program of China (2021YFD1201603), and the National Natural Science Foundation of China (32070688).</p>
</sec>
<sec id="s8" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s9" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s10" sec-type="supplementary-material">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpls.2023.1247181/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpls.2023.1247181/full#supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Atwell</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>Y. S.</given-names>
</name>
<name>
<surname>Vilhj&#xe1;lmsson</surname> <given-names>B. J.</given-names>
</name>
<name>
<surname>Willems</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Horton</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Y.</given-names>
</name>
<etal/>
</person-group>. (<year>2010</year>). <article-title>Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines</article-title>. <source>Nature.</source> <volume>465</volume> (<issue>7298</issue>), <fpage>627</fpage>&#x2013;<lpage>631</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nature08800</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bi</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Kang</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>Y. L.</given-names>
</name>
<name>
<surname>Cui</surname> <given-names>Y. H.</given-names>
</name>
<name>
<surname>Yan</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>SVSI: fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits</article-title>. <source>Ann. Hum. Genet.</source> <volume>79</volume> (<issue>4</issue>), <fpage>294</fpage>&#x2013;<lpage>309</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ahg.12117</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname> <given-names>F. G.</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>C. Y.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>F. L.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J. S.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Z. L.</given-names>
</name>
<name>
<surname>Kong</surname> <given-names>J. J.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Genome-wide association studies for dynamic plant height and number of nodes on the main stem in summer sowing soybeans</article-title>. <source>Front. Plant Sci.</source> <volume>9</volume>, <elocation-id>1184</elocation-id>. doi: <pub-id pub-id-type="doi">10.3389/fpls.2018.01184</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cuevas</surname> <given-names>H. E.</given-names>
</name>
<name>
<surname>Prom</surname> <given-names>L. K.</given-names>
</name>
<name>
<surname>Cooper</surname> <given-names>E. A.</given-names>
</name>
<name>
<surname>Knoll</surname> <given-names>J. E.</given-names>
</name>
<name>
<surname>Ni</surname> <given-names>X. Z.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Genome-wide association mapping of anthracnose (Colletotrichum sublineolum) resistance in the U.S. Sorghum association panel</article-title>. <source>Plant Genome</source> <volume>11</volume> (<issue>2</issue>), <fpage>1</fpage>&#x2013;<lpage>13</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3835/plantgenome2017.11.0099</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feng</surname> <given-names>J. Y.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>W. J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>S. B.</given-names>
</name>
<name>
<surname>Han</surname> <given-names>S. F.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. M.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>An efficient hierarchical generalized linear mixed model for mapping QTL of ordinal traits in crop cultivars</article-title>. <source>PloS One</source> <volume>8</volume> (<issue>4</issue>), <elocation-id>e59541</elocation-id>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0059541</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Kulminski</surname> <given-names>A. M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Fast algorithms for conducting large-scale GWAS of age-at-onset traits using Cox mixed-effects models</article-title>. <source>Genetics</source> <volume>215</volume> (<issue>1</issue>), <fpage>41</fpage>&#x2013;<lpage>58</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1534/genetics.119.302940</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoggart</surname> <given-names>C. J.</given-names>
</name>
<name>
<surname>Whittaker</surname> <given-names>J. C.</given-names>
</name>
<name>
<surname>De Iorio</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Balding</surname> <given-names>D. J.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Simultaneous analysis of all SNPs in genome-wide and re-sequencing association studies</article-title>. <source>PloS Genet.</source> <volume>4</volume> (<issue>7</issue>), <elocation-id>e1000130</elocation-id>. doi: <pub-id pub-id-type="doi">10.1371/journal.pgen.1000130</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname> <given-names>L. D.</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>Z. L.</given-names>
</name>
<name>
<surname>Fang</surname> <given-names>H. L.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A generalized linear mixed model association tool for biobank-scale data</article-title>. <source>Nat. Genet.</source> <volume>53</volume>, <fpage>1616</fpage>&#x2013;<lpage>1621</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-021-00954-4</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname> <given-names>J. Y.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. W.</given-names>
</name>
<name>
<surname>Zuo</surname> <given-names>J. F.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Han</surname> <given-names>X.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>Three-dimension genetic networks among seed oil-related traits, metabolites and genes reveal the genetic foundations of oil synthesis in soybean</article-title>. <source>Plant J.</source> <volume>103</volume> (<issue>3</issue>), <fpage>1103</fpage>&#x2013;<lpage>1124</lpage>. doi: <pub-id pub-id-type="doi">10.1111/tpj.14788</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Megerssa</surname> <given-names>S. H.</given-names>
</name>
<name>
<surname>Ammar</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Acevedo</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Brown-Guedira</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Ward</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Degete</surname> <given-names>A. G.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>Multiple-race stem rust resistance loci identified in durum wheat using genome-wide association mapping</article-title>. <source>Front. Plant Sci.</source> <volume>11</volume>, <elocation-id>1934</elocation-id>. doi: <pub-id pub-id-type="doi">10.3389/fpls.2020.598509</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Osval</surname> <given-names>A. M.</given-names>
</name>
<name>
<surname>Abelardo</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Paulino</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Gustavo</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Eskridge</surname> <given-names>K. M.</given-names>
</name>
<name>
<surname>Crossa</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Threshold models for genome-enabled prediction of ordinal categorical traits in plant breeding</article-title>. <source>G3: Genes|Genomes|Genetics.</source> <volume>5</volume> (<issue>2</issue>), <fpage>291</fpage>&#x2013;<lpage>300</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1534/g3.114.016188</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pritchard</surname> <given-names>J. K.</given-names>
</name>
<name>
<surname>Stephens</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Donnelly</surname> <given-names>P.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Inference of population structure using multilocus genotype data</article-title>. <source>Genetics.</source> <volume>155</volume> (<issue>2</issue>), <fpage>945</fpage>&#x2013;<lpage>959</lpage>. doi: <pub-id pub-id-type="doi">10.1093/genetics/155.2.945</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ren</surname> <given-names>W. L.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>Y. J.</given-names>
</name>
<name>
<surname>Dunwell</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>pKWmEB: integration of Kruskal-Wallis test with empirical Bayes under polygenic background control for multi-locus genome-wide association study</article-title>. <source>Heredity.</source> <volume>120</volume> (<issue>3</issue>), <fpage>208</fpage>&#x2013;<lpage>218</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41437-017-0007-4</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shao</surname> <given-names>G. H.</given-names>
</name>
</person-group> (<year>1986</year>). <article-title>Field identification method of salt tolerance of soybean germplasm resources</article-title>. <source>Crops.</source> <volume>3</volume>, <fpage>1001</fpage>&#x2013;<lpage>1986</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.16035/j.issn.1001-7286.1986.03.031</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shim</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Ha</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Kim</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Choi</surname> <given-names>M. S.</given-names>
</name>
<name>
<surname>Kang</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Jeong</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>GmBRC1 is a candidate gene for branching in soybean [<italic>Glycine max</italic> (L.) Merrill]</article-title>. <source>Int. J. Mol. Sci.</source> <volume>20</volume> (<issue>1</issue>), <fpage>135</fpage>. doi: <pub-id pub-id-type="doi">10.3390/ijms20010135</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname> <given-names>X. Y.</given-names>
</name>
<name>
<surname>Iuliana</surname> <given-names>I. L.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>M. L.</given-names>
</name>
<name>
<surname>Reibman</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wei</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A general and robust framework for secondary traits analysis</article-title>. <source>Genetics.</source> <volume>202</volume>, <fpage>1329</fpage>&#x2013;<lpage>1343</lpage>. doi: <pub-id pub-id-type="doi">10.1534/genetics.115.181073</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname> <given-names>L. M.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>Y. Q.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Utilizing mutual information for detecting rare and common variants associated with a categorical trait</article-title>. <source>PeerJ.</source> <volume>4</volume>, <elocation-id>e2139</elocation-id>. doi: <pub-id pub-id-type="doi">10.7717/peerj.2139</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tamba</surname> <given-names>C. L.</given-names>
</name>
<name>
<surname>Ni</surname> <given-names>Y. L.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. M.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies</article-title>. <source>PloS Comput. Biol.</source> <volume>13</volume> (<issue>1</issue>), <elocation-id>e1005357</elocation-id>. doi: <pub-id pub-id-type="doi">10.1371/journal.pcbi.1005357</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tamba</surname> <given-names>C. L.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>A fast mrMLM algorithm for multi-locus genome-wide association studies</article-title>. <source>bioRxiv</source>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/341784</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tan</surname> <given-names>Q. H.</given-names>
</name>
<name>
<surname>Christiansen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Charlotte</surname> <given-names>B. A.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>J. H.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>S. X.</given-names>
</name>
<name>
<surname>Kruse</surname> <given-names>T. A.</given-names>
</name>
<etal/>
</person-group>. (<year>2007</year>). <article-title>Retrospective analysis of main and interaction effects in genetic association studies of human complex traits</article-title>. <source>BMC Genet.</source> <volume>8</volume> (<issue>1</issue>), <fpage>70</fpage>&#x2013;<lpage>75</lpage>. doi: <pub-id pub-id-type="doi">10.1186/1471-2156-8-70</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>S. B.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>J. Y.</given-names>
</name>
<name>
<surname>Ren</surname> <given-names>W. L.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>Y. J.</given-names>
</name>
<etal/>
</person-group>. (<year>2016</year>). <article-title>Improving power and accuracy of genome-wide association studies <italic>via</italic> a multi-locus mixed linear model methodology</article-title>. <source>Sci. Rep.</source> <volume>6</volume> (<issue>1</issue>), <fpage>19444</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/srep19444</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Philip</surname> <given-names>V. M.</given-names>
</name>
<name>
<surname>Ananda</surname> <given-names>G.</given-names>
</name>
<name>
<surname>White</surname> <given-names>C. C.</given-names>
</name>
<name>
<surname>Malhotra</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Michalski</surname> <given-names>P. J.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>A Bayesian framework for generalized linear mixed modeling identifies new candidate loci for late-onset Alzheimer&#x2019;s disease</article-title>. <source>Genetics.</source> <volume>209</volume>, <fpage>51</fpage>&#x2013;<lpage>64</lpage>. doi: <pub-id pub-id-type="doi">10.1534/genetics.117.300673</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Ruggeri</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Hsiao</surname> <given-names>C. K.</given-names>
</name>
<name>
<surname>Argiento</surname> <given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Bayesian nonparametric clustering and association studies for candidate SNP observations</article-title>. <source>Int. J. Approximate Reasoning.</source> <volume>80</volume>, <fpage>19</fpage>&#x2013;<lpage>35</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ijar.2016.07.014</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wen</surname> <given-names>Y. J.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Ni</surname> <given-names>Y. L.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>J. Y.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Methodological implementation of mixed linear models in multi-locus genome-wide association studies</article-title>. <source>Brief Bioinform.</source> <volume>19</volume> (<issue>4</issue>), <fpage>700</fpage>&#x2013;<lpage>712</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bib/bbw145</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>T. T.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Y. F.</given-names>
</name>
<name>
<surname>Hastie</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Sobel</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Lange</surname> <given-names>K.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Genome-wide association analysis by lasso penalized logistic regression</article-title>. <source>Bioinformatics.</source> <volume>25</volume> (<issue>6</issue>), <fpage>714</fpage>&#x2013;<lpage>721</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btp041</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>An expectation-maximization algorithm for the Lasso estimation of quantitative trait locus effects</article-title>. <source>Heredity.</source> <volume>105</volume> (<issue>5</issue>), <fpage>483</fpage>&#x2013;<lpage>494</lpage>. doi: <pub-id pub-id-type="doi">10.1038/hdy.2009.180</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. M.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>An EM algorithm for mapping quantitative resistance loci</article-title>. <source>Heredity.</source> <volume>94</volume>, <fpage>119</fpage>&#x2013;<lpage>128</lpage>. doi: <pub-id pub-id-type="doi">10.1038/sj.hdy.6800583</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>J. Y.</given-names>
</name>
<name>
<surname>Ni</surname> <given-names>Y. L.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>Y. J.</given-names>
</name>
<name>
<surname>Niu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Tamba</surname> <given-names>C. L.</given-names>
</name>
<etal/>
</person-group>. (<year>2017</year>). <article-title>pLARmEB: integration of least angle regression with empirical Bayes for multilocus genome-wide association studies</article-title>. <source>Heredity.</source> <volume>118</volume> (<issue>6</issue>), <fpage>517</fpage>&#x2013;<lpage>524</lpage>. doi: <pub-id pub-id-type="doi">10.1038/hdy.2017.8</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>W. J.</given-names>
</name>
<name>
<surname>Niu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Bu</surname> <given-names>S. H.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>J. Y.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<etal/>
</person-group>. (<year>2014</year>). <article-title>Epistatic association mapping for alkaline and salinity tolerance traits in the soybean germination stage</article-title>. <source>PloS One</source> <volume>9</volume> (<issue>1</issue>), <elocation-id>e84750</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pone.0084750</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>Y. W.</given-names>
</name>
<name>
<surname>Tamba</surname> <given-names>C. L.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>Y. J.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y. M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>mrMLM v4.0: an R platform for multi-locus genome-wide association studies</article-title>. <source>Genomics Proteomics Bioinf.</source> <volume>18</volume> (<issue>4</issue>), <fpage>481</fpage>&#x2013;<lpage>487</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.gpb.2020.06.006</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>S. B.</given-names>
</name>
<name>
<surname>Jian</surname> <given-names>J. B.</given-names>
</name>
<name>
<surname>Geng</surname> <given-names>Q. C.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>Q. J.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>Identification of domestication-related loci associated with flowering time and seed size in soybean with the RAD-seq genotyping method</article-title>. <source>Sci. Rep.</source> <volume>5</volume>, <elocation-id>9350</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/srep09350</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>