<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2022.773107</article-id>
<article-categories>
<subj-group subj-group-type="heading"><subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>GRAND: An Integrated Genome, Transcriptome Resources, and Gene Network Database for <italic>Gossypium</italic></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>Zhibin</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chai</surname>
<given-names>Mao</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1459788/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yang</surname>
<given-names>Zhaoen</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/470489/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yang</surname>
<given-names>Zuoren</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/401258/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Fan</surname>
<given-names>Liqiang</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1472557/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences</institution>, <addr-line>Anyang</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University</institution>, <addr-line>Zhengzhou</addr-line>, <country>China</country></aff>
<author-notes>
<fn id="fn0001" fn-type="edited-by">
<p>Edited by: Ingo Ebersberger, Goethe University Frankfurt, Germany</p>
</fn>
<fn id="fn0002" fn-type="edited-by">
<p>Reviewed by: Jinpeng Wang, Institute of Botany (CAS), China; Joshua A. Udall, United States Department of Agriculture (USDA), United States</p>
</fn>
<corresp id="c001">&#x002A;Correspondence: Liqiang Fan, <email>fanliqiang@caas.cn</email></corresp>
<fn id="fn0003" fn-type="other">
<p>This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>21</day>
<month>01</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>773107</elocation-id>
<history>
<date date-type="received">
<day>09</day>
<month>09</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>01</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Zhang, Chai, Yang, Yang and Fan.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Zhang, Chai, Yang, Yang and Fan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>With the increasing amount of cotton omics data, breeding scientists are confronted with the question of how to mine this massive body of data for information that is useful in breeding. Here, we construct the <italic>Gossypium</italic> Resource And Network Database (GRAND), which integrates 18 cotton genome sequences, their genome annotations, two sets of cotton genome variation data, and four transcriptomes for <italic>Gossypium</italic> species. GRAND allows users to explore and mine these data with a toolbox that comprises a flexible search system, a BLAST and BLAT suite, orthologous gene ID lookup, networks of co-expressed genes, primer design, Gbrowse and Jbrowse, and drawing instruments. GRAND provides important information regarding <italic>Gossypium</italic> resources and may help accelerate the breeding of cotton varieties.</p>
</abstract>
<kwd-group>
<kwd><italic>Gossypium</italic></kwd>
<kwd>genome</kwd>
<kwd>comparative genomics</kwd>
<kwd>variation</kwd>
<kwd>GRAND</kwd>
</kwd-group>
<contract-num rid="cn1">31621005</contract-num>
<contract-sponsor id="cn1">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="1"/>
<equation-count count="0"/>
<ref-count count="29"/>
<page-count count="7"/>
<word-count count="4298"/>
</counts>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<title>Introduction</title>
<p>Cotton (<italic>Gossypium</italic> spp.) produces natural fiber for the textile industry worldwide and is also an important source of edible oil. The <italic>Gossypium</italic> genus includes more than 50 species and is an excellent model for studying genome evolution and polyploidization. Moreover, multiple high-quality <italic>de novo</italic> assembled genomes of <italic>Gossypium</italic> have been reported in recent years. These genomes show considerably improved contiguity compared with previously assembled draft genomes. For example, the high-quality genomes of <italic>Gossypium arboreum</italic> (<xref ref-type="bibr" rid="ref5">Du et al., 2018</xref>), <italic>Gossypium australe</italic> (<xref ref-type="bibr" rid="ref3">Cai et al., 2019</xref>), <italic>Gossypium raimondii</italic> and <italic>Gossypium turneri</italic> (<xref ref-type="bibr" rid="ref18">Udall et al., 2019</xref>), and <italic>Gossypium davidsonii</italic> and <italic>Gossypium thurberi</italic> (<xref ref-type="bibr" rid="ref23">Yang et al., 2021</xref>) were sequenced and released in 2018, 2019, 2019, and 2021, respectively. Genomes of tetraploid <italic>Gossypium barbadense</italic> and <italic>Gossypium hirsutum</italic> were <italic>de novo</italic> sequenced and released by <xref ref-type="bibr" rid="ref7">Hu et al. (2019)</xref> and <xref ref-type="bibr" rid="ref21">Wang et al. (2019)</xref>, respectively. <xref ref-type="bibr" rid="ref24">Yang et al. (2019)</xref> also sequenced and assembled the genomes of two upland cotton cultivars, TM-1 and zhongmiansuo24 (ZM24). The assembly of these diploid and tetraploid cotton genomes has moved <italic>Gossypium</italic> research into the genomics and pan-genomics era. However, how to effectively integrate and use this large number of cotton datasets to mine valuable information for cotton researchers has become an important research question.</p>
<p>Several online cotton databases have been developed worldwide. CottonGen (<xref ref-type="bibr" rid="ref26">Yu et al., 2014</xref>) is a relatively comprehensive cotton database with a collection of cotton genomes, genetic markers, and breeding germplasm accessions, but it is not always user-friendly and its functional modules need to be further expanded. ccNet (<xref ref-type="bibr" rid="ref25">You et al., 2017</xref>) is a co-expression network database of diploid <italic>G. arboreum</italic> and polyploid <italic>G. hirsutum</italic>. CottonFGD (<xref ref-type="bibr" rid="ref27">Zhu et al., 2017</xref>) is a cotton functional genome database that integrates cotton genomes and transcriptomes as well as sequence retrieval, analysis, and visualization modules, but it does not contain genetic data, such as molecular markers. MaGenDB (<xref ref-type="bibr" rid="ref20">Wang et al., 2020</xref>) is an integrative database of 13 Malvaceae species, including cotton, that enables users to jointly compare and analyze relevant data. COTTONOMICS<xref rid="fn0004" ref-type="fn"><sup>1</sup></xref> is a comparative genomics platform and variation database for <italic>G. hirsutum</italic> and <italic>G. barbadense</italic>. CottonGVD (<xref ref-type="bibr" rid="ref13">Peng et al., 2021</xref>) is a cotton database specifically focused on the visualization of trait-associated loci. Therefore, it is necessary to build a cotton database that systematically brings together the latest cotton genomes, transcriptomes, and molecular marker data.</p>
<p>To meet this goal, we developed GRAND, a comprehensive cotton database that integrates high-quality genomic and transcriptomic resources of cotton and provides tools for multi-level integrative analysis. GRAND offers a systematic view of genomic and transcriptomic information and integrates gene searching, gene list analysis, and visualization tools (such as Expression Visualization, Heatmap Draw, KEGG Dot Plot, and the Annotation function). All data in this omics database for cotton (<italic>Gossypium</italic> spp.) can be freely accessed and downloaded.</p>
</sec>
<sec id="sec2" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="sec3">
<title>Data Sources</title>
<p>The sequences of 18 cotton genome assemblies representing 14 <italic>Gossypium</italic> species and their respective gene annotation data, together with four transcriptomes, were downloaded directly from relevant databases or sequenced by our laboratory and then used in GRAND (<xref rid="tab1" ref-type="table">Table 1</xref>). In <xref rid="tab1" ref-type="table">Table 1</xref>, the suffix of each species name indicates the institution that sequenced and published the genome [such as ICR: Institute of Cotton Research of CAAS, ZJU: Zhejiang University, HAU: Huazhong Agricultural University, USDA-ARS: US Department of Agriculture (USDA) Agricultural Research Service (ARS)]. The transcriptome data of <italic>G. hirsutum_TM-1</italic> and <italic>G. barbadense_H7124_ZJU</italic> were downloaded from the NCBI Sequence Read Archive under accession number PRJNA490626 and were published by <xref ref-type="bibr" rid="ref7">Hu et al. (2019)</xref> from Zhejiang University. The other two transcriptomes, <italic>G. arboreum_ICR</italic> and <italic>G. hirsutum_ZM24_ICR</italic>, were sequenced and assembled by <xref ref-type="bibr" rid="ref5">Du et al. (2018)</xref> and <xref ref-type="bibr" rid="ref24">Yang et al. (2019)</xref> of the Institute of Cotton Research (ICR), respectively. Illumina reads were aligned to the references <italic>G. hirsutum_TM-1_ICR</italic> (<xref ref-type="bibr" rid="ref24">Yang et al., 2019</xref>), <italic>G. hirsutum_ZM24_ICR</italic> (<xref ref-type="bibr" rid="ref24">Yang et al., 2019</xref>), and <italic>G. arboreum_ICR</italic> (<xref ref-type="bibr" rid="ref5">Du et al., 2018</xref>), respectively, using TopHat 2.1.1 (<xref ref-type="bibr" rid="ref9">Kim et al., 2013</xref>).
Quantification of gene expression was then performed with Cufflinks version 2.2.1.<xref rid="fn0005" ref-type="fn"><sup>2</sup></xref> Although these data are freely shared on their source websites, those sites offer no tools for analyzing them online, and the inconsistent formats of the different datasets make it difficult to use them jointly. GRAND addresses both problems, and all the data are available for free download from GRAND.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption>
<p>Summary of 18 cotton genome assemblies representing 14 <italic>Gossypium</italic> species.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Species</th>
<th align="center" valign="top">Genome size (Mb)</th>
<th align="center" valign="top">Number of protein-coding genes</th>
<th align="center" valign="top">Contig N50 (Mb)</th>
<th align="center" valign="top">Scaffold N50 (Mb)</th>
<th align="left" valign="top">References</th>
<th align="left" valign="top">Data sources</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"><italic>G. barbadense</italic>_ZJU</td>
<td align="center" valign="top">2,225</td>
<td align="left" valign="top">75,071</td>
<td align="char" valign="top" char=".">0.08</td>
<td align="center" valign="top">23.44</td>
<td align="left" valign="top" rowspan="2"><xref ref-type="bibr" rid="ref7">Hu et al., 2019</xref></td>
<td align="left" valign="top" rowspan="2"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA450479/" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA450479/</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. hirsutum</italic>_TM-1_ZJU</td>
<td align="center" valign="top">2,295</td>
<td align="left" valign="top">72,761</td>
<td align="char" valign="top" char=".">0.11</td>
<td align="center" valign="top">15.51</td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. raimondii</italic>_USDA-ARS</td>
<td align="center" valign="top">735</td>
<td align="left" valign="top">40,743</td>
<td align="char" valign="top" char=".">6.3</td>
<td align="center" valign="top">6.3</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref18">Udall et al., 2019</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA493304" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA493304</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. barbadense</italic>_HAU</td>
<td align="center" valign="top">2,266</td>
<td align="left" valign="top">71,297</td>
<td align="char" valign="top" char=".">2.15</td>
<td align="center" valign="top">92.88</td>
<td align="left" valign="top" rowspan="2"><xref ref-type="bibr" rid="ref21">Wang et al., 2019</xref></td>
<td align="left" valign="top" rowspan="2"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA433615" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA433615</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. hirsutum</italic>_TM-1_HAU</td>
<td align="center" valign="top">2,347</td>
<td align="left" valign="top">70,199</td>
<td align="char" valign="top" char=".">1.89</td>
<td align="center" valign="top">97.74</td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. arboreum</italic>_ICR</td>
<td align="center" valign="top">1,710</td>
<td align="left" valign="top">40,960</td>
<td align="char" valign="top" char=".">1.10</td>
<td align="center" valign="top">NA</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref5">Du et al., 2018</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA382310" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA382310</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. hirsutum</italic>_TM-1_ICR</td>
<td align="center" valign="top">2,286</td>
<td align="left" valign="top">73,624</td>
<td align="char" valign="top" char=".">4.76</td>
<td align="center" valign="top">NA</td>
<td align="left" valign="top" rowspan="2"><xref ref-type="bibr" rid="ref24">Yang et al., 2019</xref></td>
<td align="left" valign="top" rowspan="2"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA503326/" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA503326/</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. hirsutum</italic>_ZM24_ICR</td>
<td align="center" valign="top">2,309</td>
<td align="left" valign="top">73,707</td>
<td align="char" valign="top" char=".">1.98</td>
<td align="center" valign="top">NA</td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. australe</italic>_ICR</td>
<td align="center" valign="top">1,752</td>
<td align="left" valign="top">40,694</td>
<td align="char" valign="top" char=".">1.83</td>
<td align="center" valign="top">143.6</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref3">Cai et al., 2019</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA513946" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA513946</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. davidsonii</italic>_ICR</td>
<td align="center" valign="top">801</td>
<td align="left" valign="top">41,471</td>
<td align="char" valign="top" char=".">26.8</td>
<td align="center" valign="top">NA</td>
<td align="left" valign="top" rowspan="2"><xref ref-type="bibr" rid="ref23">Yang et al., 2021</xref></td>
<td align="left" valign="top" rowspan="2"><ext-link xlink:href="http://grand.cricaas.com.cn/page/download/download" ext-link-type="uri">http://grand.cricaas.com.cn/page/download/download</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. thurberi</italic>_ICR</td>
<td align="center" valign="top">780</td>
<td align="left" valign="top">41,316</td>
<td align="char" valign="top" char=".">24.7</td>
<td align="center" valign="top">NA</td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. tomentosum</italic></td>
<td align="center" valign="top">2,229</td>
<td align="left" valign="top">72,648</td>
<td align="char" valign="top" char=".">11.98</td>
<td align="center" valign="top">103.05</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref15">Shen et al., 2021</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA629964" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA629964</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. darwinii_v1.1</italic></td>
<td align="center" valign="top">2,210</td>
<td align="left" valign="top">78,303</td>
<td align="char" valign="top" char=".">9.07</td>
<td align="center" valign="top">101.9</td>
<td align="left" valign="top" rowspan="2"><xref ref-type="bibr" rid="ref4">Chen et al., 2020</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA516409/" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA516409/</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. mustelinum</italic></td>
<td align="center" valign="top">2,344</td>
<td align="left" valign="top">74,699</td>
<td align="char" valign="top" char=".">2.31</td>
<td align="center" valign="top">106.76</td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA525892" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA525892</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. anomalum_NSF_B1</italic></td>
<td align="center" valign="top">1,208</td>
<td align="left" valign="top">37,016</td>
<td align="char" valign="top" char=".">10.81</td>
<td align="center" valign="top">97.68</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref700">Grover et al., 2021</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA421337" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA421337</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. herbaceum_WHU</italic></td>
<td align="center" valign="top">1,572</td>
<td align="left" valign="top">43,952</td>
<td align="char" valign="top" char=".">1.91</td>
<td align="center" valign="top">117.88</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref8">Huang et al., 2020</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA506494" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA506494</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. longicalyx</italic></td>
<td align="center" valign="top">1,205</td>
<td align="left" valign="top">38,378</td>
<td align="char" valign="top" char=".">17.53</td>
<td align="center" valign="top">95.88</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref600">Grover et al., 2020</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA420071" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA420071</ext-link></td>
</tr>
<tr>
<td align="left" valign="top"><italic>G. turneri-TURN-v1.0</italic></td>
<td align="center" valign="top">765</td>
<td align="left" valign="top">39,692</td>
<td align="char" valign="top" char=".">7.91</td>
<td align="center" valign="top">60.46</td>
<td align="left" valign="top"><xref ref-type="bibr" rid="ref18">Udall et al., 2019</xref></td>
<td align="left" valign="top"><ext-link xlink:href="https://www.ncbi.nlm.nih.gov/bioproject/PRJNA493521" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/bioproject/PRJNA493521</ext-link></td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="sec4">
<title>Development of Database and Website</title>
<p>The GRAND database runs on the Linux operating system, using J2EE as the framework, MySQL as the back-end database, and Apache Tomcat as the server. Genome sequence, annotation, expression, and variation data are stored in the MySQL database. A web interface based on JavaServer Pages (JSP), HTML5, and CSS3 enables end-users to access GRAND data through any modern browser on any kind of device. The GRAND database is hosted on a server equipped with eight 14-core Intel Xeon Gold 5120 processors.</p>
</sec>
<sec id="sec5">
<title>Orthologous Gene ID Function and Gene Network</title>
<p>The orthologous genes among <italic>G. hirsutum</italic> TM-1 and ZM24 (<xref ref-type="bibr" rid="ref24">Yang et al., 2019</xref>), <italic>G. hirsutum</italic> TM-1 (<xref ref-type="bibr" rid="ref7">Hu et al., 2019</xref>), <italic>G. barbadense</italic> (<xref ref-type="bibr" rid="ref7">Hu et al., 2019</xref>), <italic>G. arboreum</italic> (<xref ref-type="bibr" rid="ref5">Du et al., 2018</xref>), <italic>G. hirsutum</italic> TM-1 (<xref ref-type="bibr" rid="ref21">Wang et al., 2019</xref>), and <italic>G. barbadense</italic> (<xref ref-type="bibr" rid="ref21">Wang et al., 2019</xref>) were identified using Inparanoid v4.1 (<xref ref-type="bibr" rid="ref12">O&#x2019;Brien et al., 2005</xref>) with default parameters. Tetraploid cotton genomes were divided into their A and D subgenomes before the analysis. We then combined these results into a single file that is used to search for orthologous gene IDs.</p>
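<p>As a minimal sketch of how combined pairwise results could support an orthologous gene ID search, the following Python example builds a symmetric lookup index. It assumes that the pairwise results have already been parsed into (gene, gene) tuples; the function names and the input layout are hypothetical and do not describe the actual GRAND implementation or the Inparanoid output format. The gene IDs are those of the example shown in Figure 3C.</p>
<preformat><![CDATA[
```python
from collections import defaultdict


def build_ortholog_index(pairs):
    """Build a symmetric lookup table from pairwise ortholog calls.

    `pairs` is an iterable of (gene_a, gene_b) tuples (hypothetical layout).
    """
    index = defaultdict(set)
    for gene_a, gene_b in pairs:
        index[gene_a].add(gene_b)
        index[gene_b].add(gene_a)
    return index


def orthologs_of(index, gene_id):
    """Return the sorted orthologous gene IDs recorded for `gene_id`."""
    return sorted(index.get(gene_id, ()))


# Pairwise calls for the example gene discussed in the text (illustrative input).
pairs = [
    ("Ga01G0003", "Gh_A01G000300"),
    ("Ga01G0003", "Gh_D03G199000"),
    ("Ga01G0003", "Ghicr24_A01G000500"),
    ("Ga01G0003", "GB_A10G2858"),
]
index = build_ortholog_index(pairs)
hits = orthologs_of(index, "Ga01G0003")
print(hits)
```
]]></preformat>
<p>This index returns only the directly recorded partners of a query gene. Grouping genes transitively across all assemblies would require an additional clustering step, which is not shown here.</p>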
<p>Based on gene expression data, networks of co-expressed genes for three cotton assemblies (<italic>G. arboreum</italic>_ICR, <italic>G. hirsutum_</italic>TM-1_ICR, and <italic>G. hirsutum</italic>_ZM24_ICR) were constructed using Pearson&#x2019;s correlation coefficient (PCC) values between pairs of genes and visualized using the JavaScript library Cytoscape.js. For a given query gene, the network of the top 20 target genes with the highest correlation values with the query gene is shown. In addition, a summary table of all co-expressed genes and their functional annotations is provided below the network.</p>
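<p>The ranking step described above can be sketched in Python as follows. This is an illustrative example only, not the GRAND implementation: the function names, the threshold parameter, the expression values, and the gene IDs (geneA to geneD) are all hypothetical, and expression profiles are assumed to be equal-length lists of values across the same samples.</p>
<preformat><![CDATA[
```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_coexpressed(expr, query, k=20, min_pcc=0.0):
    """Return up to k (gene, PCC) pairs with the highest PCC to `query`."""
    query_profile = expr[query]
    scored = [
        (gene, pearson(query_profile, profile))
        for gene, profile in expr.items()
        if gene != query
    ]
    scored = [(gene, r) for gene, r in scored if r >= min_pcc]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical expression values across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 1.0, 1.0, 1.0],
}
result = top_coexpressed(expr, "geneA", k=2)
print(result)
```
]]></preformat>
<p>In this sketch, genes are ranked by signed PCC, so negatively correlated genes fall below the default threshold of 0. Ranking by absolute PCC would be an equally plausible reading, and the text does not specify which is used.</p>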
</sec>
</sec>
<sec id="sec6">
<title>Results and Discussion</title>
<sec id="sec7">
<title>Overview of Website Structure and Function</title>
<p>To provide users with a wealth of information about cotton, the GRAND database was built to contain the latest and most comprehensive <italic>Gossypium</italic> genomic and transcriptomic datasets (including 18 assembled genomes and four transcriptomes; <xref rid="fig1" ref-type="fig">Figure 1</xref>; <xref rid="tab1" ref-type="table">Table 1</xref>). The main structure of GRAND is shown in <xref rid="fig1" ref-type="fig">Figure 1</xref> with four major modules: Browse, Search, Tools, and Download. GRAND provides search functions for various kinds of genomic information, including gene annotation (KOG, GO, KEGG, and NR), gene sequences, genome variations (SNPs and INDELs), expression profiles, and gene families, which can be queried by entering a chromosomal region or the ID of the longest transcript. GRAND also integrates the genome visualization tools Gbrowse (<xref ref-type="bibr" rid="ref16">Stein et al., 2002</xref>) and Jbrowse (<xref ref-type="bibr" rid="ref2">Buels et al., 2016</xref>), which allow users to browse, visualize, and retrieve sequence data, and it offers gene co-expression networks for different developmental stages and tissues/organs. Moreover, GRAND provides a toolbox for online analysis, including BLAST- and BLAT-based sequence comparisons, orthologous gene ID lookup across different species, and PCR primer design (<xref ref-type="bibr" rid="ref19">Untergasser et al., 2012</xref>). In addition, users can download cotton data selectively or in full. Tutorials for all the tools in the database are provided in the Help module. The information in GRAND will be useful for both dry lab and wet lab biologists.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption>
<p>Schematic of the GRAND database structure and web interface features. GRAND gathers 18 cotton genomes and four cotton transcriptomes, together with the associated genome variation and annotation data. All data are stored in a MySQL database.</p>
</caption>
<graphic xlink:href="fpls-13-773107-g001.tif"/>
</fig>
</sec>
<sec id="sec8">
<title>Browse Functions in GRAND</title>
<p>The browse detail page mainly includes the following modules: SNP Variation, INDEL Variation, Gene Annotation, and Gene Family. Users can search for the Nr, TrEMBL, KOG, KEGG, and GO annotations using the Gene Annotation module. Through cross-links, the genome variations related to each gene can be searched, along with the gene&#x2019;s location, genome sequence, CDS sequence, transcript sequence, and peptide sequence (<xref rid="fig2" ref-type="fig">Figure 2A</xref>). Users can also quickly find related information about the gene family by searching for the target gene keywords (<xref rid="fig2" ref-type="fig">Figure 2B</xref>). The SNP Variation and INDEL Variation modules show the genome variation data (SNPs or INDELs) on each chromosome for a group of individuals. Users can filter the data by reference genome and by SNP or INDEL type (<xref rid="fig2" ref-type="fig">Figure 2C</xref>).</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption>
<p>An example of a gene page showing multiple types of information associated with the gene. <bold>(A)</bold> The annotation information of gene &#x201C;<italic>Gh_D03G071700</italic>.&#x201D; <bold>(B)</bold> The gene &#x201C;<italic>Gh_D03G071700</italic>&#x201D; is annotated as the hAT family C-terminal dimerization region and can also be found at the end page by searching the gene family function panel with PF05699. <bold>(C)</bold> The SNP Variation and INDEL Variation modules show the genomic variation data (SNPs or INDELs) on each chromosome for a group of individuals. <bold>(D,E)</bold> The Gbrowse and Jbrowse pages.</p>
</caption>
<graphic xlink:href="fpls-13-773107-g002.tif"/>
</fig>
<p>For example, the gene &#x201C;<italic>Gh_D03G071700</italic>&#x201D; is located on chromosome D03 of the <italic>G. hirsutum_TM-1_ICR</italic> genome and is annotated as the hAT family C-terminal dimerization region, which can also be found at the end page by searching the gene family module with PF05699. The SNP and INDEL variations can be queried directly by clicking on the Variation option and visualized by clicking on Gbrowse and Jbrowse (<xref rid="fig2" ref-type="fig">Figure 2</xref>), which are tools for displaying the variations (SNPs and INDELs) and gene structures of the cotton individuals on chromosomes. The Gbrowse detail page includes the following basic information about the gene: name, position in the scaffold, length, CDS parts, and sequence. The Jbrowse detail page is displayed in a popup window showing information about the gene, SNPs, and INDELs in the 30&#x2009;kb region around the gene.</p>
</sec>
<sec id="sec9">
<title>Search Functions in GRAND</title>
<p>GRAND allows users to perform both BLAST and BLAT searches to rapidly align sequences to the genome. The BLAST search in GRAND is implemented using SequenceServer (<xref ref-type="bibr" rid="ref14">Priyam et al., 2019</xref>), which provides an interface with text-based and interactive visual outputs for searching against nucleotide and/or protein sequences with the BLASTn, BLASTp, BLASTx, tBLASTx, and tBLASTn programs. GRAND currently has a BLAST database of whole-genome sequences, CDSs, and predicted proteins for each reference genome assembly. Users can either paste DNA or protein sequences into the query box or upload a FASTA file. The search result page comprises two parts: &#x201C;Graphical view&#x201D; and &#x201C;List view&#x201D; (<xref rid="fig3" ref-type="fig">Figure 3A</xref>). The Graphical view presents a brief chart of the BLAST results. The List view is a table showing detailed information on the alignments reported by the BLAST program, such as gene ID, total score, e-value, and length. Sequences in FASTA files can be annotated by comparison with several databases (Nr_vs_GO, KEGG, COG, SwissProt, TrEMBL, KOG, and Pfam; <xref ref-type="bibr" rid="ref1">Ashburner et al., 2000</xref>; <xref ref-type="bibr" rid="ref17">Tatusov et al., 2000</xref>; <xref ref-type="bibr" rid="ref10">Koonin et al., 2004</xref>; <xref ref-type="bibr" rid="ref6">El-Gebali et al., 2019</xref>) in the &#x201C;Anno function&#x201D; section. The results of the KEGG annotation can be visualized in the &#x201C;KEGG Dot Plot&#x201D; section. In addition, to help users quickly find data of interest, the current version of GRAND has four submodules under the &#x201C;Search&#x201D; module. (i) Multicriteria Search. Search the genome variations (SNPs and INDELs) of each individual in this database by gene, region, or variation. Data can be filtered by variation type and genotype. (ii) Phenotype Search.
Search the phenotype data of fiber-related traits, floral traits, seed-related traits, and other traits for cotton species. (iii) Comparative Search. Search and compare the genome variations (SNPs and INDELs) of two or more cotton individuals in this database by gene, region, or variation. (iv) Gene Search. Search and retrieve gene annotations (KOG, GO, KEGG, and NR), gene structures, and sequences by entering a chromosomal region or gene ID. The SNPs and INDELs can also be searched through cross-links (<xref rid="fig3" ref-type="fig">Figure 3B</xref>). After a search, a new webpage opens and displays all the matched results. The details of each matched result can be viewed by clicking on it. The Orthologous Gene ID function retrieves orthologous gene IDs among different cotton species and different versions of cotton genomes. For example, the orthologous gene IDs of gene &#x201C;<italic>Ga01G0003</italic>&#x201D; are &#x201C;<italic>Gh_A01G000300</italic>,&#x201D; &#x201C;<italic>Gh_D03G199000</italic>,&#x201D; &#x201C;<italic>Ghicr24_A01G000500</italic>,&#x201D; and &#x201C;<italic>GB_A10G2858</italic>&#x201D; in the other cotton genomes (<xref rid="fig3" ref-type="fig">Figure 3C</xref>). This result is consistent with the BLAST result above.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption>
<p>Search functions in GRAND. <bold>(A)</bold> The results of a BLAST search. <bold>(B)</bold> The Search page contains Comparative Search, Multicriteria Search, and Gene Search. <bold>(C)</bold> The Orthologous Gene ID function retrieves orthologous gene IDs among different cotton species and different versions of the cotton genome.</p>
</caption>
<graphic xlink:href="fpls-13-773107-g003.tif"/>
</fig>
</sec>
<sec id="sec10">
<title>Tools for Online Analysis in GRAND</title>
<p>In addition to the modules described above, GRAND offers several analysis tools. The &#x201C;Expression Visualization&#x201D; section shows gene expression profiles in different tissues; users can run the analysis on a selected subset or on all of them. The results are presented as heatmaps, and the expression values (FPKM) for each data set are listed in a table at the bottom of the page (<xref rid="fig4" ref-type="fig">Figure 4A</xref>). Users can also import these results into the &#x201C;Heatmap Draw&#x201D; section for further adjustment and styling. Gene network analysis can be used to identify related genes in the same biological processes or pathways. Co-expression networks are constructed from gene expression data using Pearson&#x2019;s correlation coefficient (PCC) between genes (<xref ref-type="bibr" rid="ref11">Langfelder and Horvath, 2008</xref>). After the user enters a gene ID and sets a PCC threshold, GRAND visualizes the 20 genes most highly correlated with the query gene; clicking any co-expressed gene in the network displays that gene&#x2019;s own co-expression network. In addition, a summary table of all co-expressed genes and their functional annotations is provided below the network, and each target gene in the table links to its basic gene information (<xref rid="fig4" ref-type="fig">Figure 4B</xref>). The GRAND database also provides a primer design function based on cotton gene sequences (<xref rid="fig4" ref-type="fig">Figure 4C</xref>). Users can design primers for CRISPR/Cas9/Cpf1 genome editing with the targetDesign tool <italic>via</italic> the website link (<xref rid="fig4" ref-type="fig">Figure 4D</xref>; <xref ref-type="bibr" rid="ref22">Xie et al., 2017</xref>). Additionally, we provide an FTP server that stores all the publicly released datasets used in GRAND, with an enhanced user interface, text preview, and directory download.</p>
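<p>The PCC-based ranking of co-expressed genes can be sketched as follows. This is a plain Pearson correlation computed with the Python standard library, not the WGCNA procedure and not GRAND&#x2019;s own code. The function names, the gene names, the FPKM values, and the default threshold are hypothetical, and the sketch assumes that only positively correlated genes at or above the threshold are kept, which the database may handle differently.</p>
<preformat>
```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A constant profile has zero variance, so the PCC is undefined; report 0.0.
    return cov / norm if norm else 0.0


def top_coexpressed(expr, query, threshold=0.8, top_n=20):
    """Return up to top_n (gene, PCC) pairs with PCC >= threshold, strongest first."""
    query_profile = expr[query]
    scored = [
        (gene, pearson(query_profile, profile))
        for gene, profile in expr.items()
        if gene != query
    ]
    kept = [(gene, r) for gene, r in scored if r >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_n]


# Hypothetical FPKM profiles across four tissues (illustrative values only).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises in step with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 2.0, 4.0],  # partially follows geneA
}

neighbours = top_coexpressed(expr, "geneA", threshold=0.5)
for gene, r in neighbours:
    print(gene, round(r, 3))
```
</preformat>
<p>Raising <monospace>threshold</monospace> shrinks the neighbour list, while <monospace>top_n</monospace> caps how many nodes would be drawn around the query gene.</p>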
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption>
<p>Tools in GRAND. <bold>(A)</bold> A heatmap showing gene expression search results. Each row represents the expression data of a gene across many samples, and each column shows the expression in a particular sample. <bold>(B)</bold> Co-expression network search results. The network shows the top 20 nodes near the target gene, and the corresponding information can be viewed by clicking. <bold>(C)</bold> Primer design page. <bold>(D)</bold> TargetDesign tool for designing primers for CRISPR/Cas9/Cpf1 genome editing.</p>
</caption>
<graphic xlink:href="fpls-13-773107-g004.tif"/>
</fig>
</sec>
<sec id="sec11">
<title>Download Functions in GRAND</title>
<p>The download page lets users selectively download, via FTP, genome sequences and their annotations, transcriptomic data, CDSs, protein sequences, and other data.</p>
</sec>
<sec id="sec12">
<title>Limitations</title>
<p>GRAND currently has some limitations. For example, only cotton genomic, transcriptomic, and phenotypic data have been collected; additional data, such as cotton molecular markers and metabolic data, need to be added in the future. Moreover, the current database has no sequence feature extraction tool.</p>
</sec>
</sec>
<sec id="sec13">
<title>Conclusion and Perspectives</title>
<p>GRAND provides access to genomic, transcriptomic, and phenotypic data for cotton, which can be browsed, mined, analyzed, and downloaded. Moreover, GRAND provides an interface to visualize genomes, annotated genes, gene expression, and networks of co-expressed genes. The abundant data available support high-resolution comparative genomics studies that shed light on the evolution and diversification of the various cotton species. In subsequent upgrades, the GRAND database will add a sequence feature extraction tool and newly generated cotton data.</p>
</sec>
<sec id="sec14" sec-type="data-availability">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="sec15">
<title>Author Contributions</title>
<p>ZZ and LF wrote the initial draft. ZZ, ZhY, ZuY, MC, and LF collected, curated, and formatted the data, and tested GRAND and the usage examples. All authors were involved in reviewing and editing the manuscript.</p>
</sec>
<sec id="sec41" sec-type="funding-information">
<title>Funding</title>
<p>This work was supported by funding from the National Natural Science Foundation of China (grant 31621005) and Xinjiang Changji Hui Autonomous Prefecture Science and Technology Projects (grant 2021Z01).</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec17" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack>
<p>We wish to thank all researchers who have generated invaluable cotton genomic resources that are gathered in the GRAND database. We thank Biomarker Technology Co., Ltd. for assisting in GRAND construction.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ashburner</surname> <given-names>M.</given-names></name> <name><surname>Ball</surname> <given-names>C. A.</given-names></name> <name><surname>Blake</surname> <given-names>J. A.</given-names></name> <name><surname>Botstein</surname> <given-names>D.</given-names></name> <name><surname>Butler</surname> <given-names>H.</given-names></name> <name><surname>Cherry</surname> <given-names>J. M.</given-names></name> <etal/></person-group>. (<year>2000</year>). <article-title>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium</article-title>. <source>Nat. Genet.</source> <volume>25</volume>, <fpage>25</fpage>&#x2013;<lpage>29</lpage>. doi: <pub-id pub-id-type="doi">10.1038/75556</pub-id>, PMID: <pub-id pub-id-type="pmid">10802651</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Buels</surname> <given-names>R.</given-names></name> <name><surname>Yao</surname> <given-names>E.</given-names></name> <name><surname>Diesh</surname> <given-names>C. M.</given-names></name> <name><surname>Hayes</surname> <given-names>R. D.</given-names></name> <name><surname>Munoz-Torres</surname> <given-names>M.</given-names></name> <name><surname>Helt</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>JBrowse: a dynamic web platform for genome visualization and analysis</article-title>. <source>Genome Biol.</source> <volume>17</volume>:<fpage>66</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s13059-016-0924-1</pub-id>, PMID: <pub-id pub-id-type="pmid">27072794</pub-id></citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cai</surname> <given-names>Y.</given-names></name> <name><surname>Cai</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Wang</surname> <given-names>P.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Cai</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Genome sequencing of the Australian wild diploid species <italic>Gossypium australe</italic> highlights disease resistance and delayed gland morphogenesis</article-title>. <source>Plant Biotechnol. J.</source> <volume>18</volume>, <fpage>814</fpage>&#x2013;<lpage>828</lpage>. doi: <pub-id pub-id-type="doi">10.1111/pbi.13249</pub-id>, PMID: <pub-id pub-id-type="pmid">31479566</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Z. J.</given-names></name> <name><surname>Sreedasyam</surname> <given-names>A.</given-names></name> <name><surname>Ando</surname> <given-names>A.</given-names></name> <name><surname>Song</surname> <given-names>Q.</given-names></name> <name><surname>de Santiago</surname> <given-names>L. M.</given-names></name> <name><surname>Hulse-Kemp</surname> <given-names>A. M.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Genomic diversifications of five <italic>Gossypium</italic> allopolyploid species and their impact on cotton improvement</article-title>. <source>Nat. Genet.</source> <volume>52</volume>, <fpage>525</fpage>&#x2013;<lpage>533</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-020-0614-5</pub-id>, PMID: <pub-id pub-id-type="pmid">32313247</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Du</surname> <given-names>X.</given-names></name> <name><surname>Huang</surname> <given-names>G.</given-names></name> <name><surname>He</surname> <given-names>S.</given-names></name> <name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Sun</surname> <given-names>G.</given-names></name> <name><surname>Ma</surname> <given-names>X.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits</article-title>. <source>Nat. Genet.</source> <volume>50</volume>, <fpage>796</fpage>&#x2013;<lpage>802</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-018-0116-x</pub-id>, PMID: <pub-id pub-id-type="pmid">29736014</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>El-Gebali</surname> <given-names>S.</given-names></name> <name><surname>Mistry</surname> <given-names>J.</given-names></name> <name><surname>Bateman</surname> <given-names>A.</given-names></name> <name><surname>Eddy</surname> <given-names>S. R.</given-names></name> <name><surname>Luciani</surname> <given-names>A.</given-names></name> <name><surname>Potter</surname> <given-names>S. C.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>The Pfam protein families database in 2019</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>D427</fpage>&#x2013;<lpage>D432</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky995</pub-id>, PMID: <pub-id pub-id-type="pmid">30357350</pub-id></citation></ref>
<ref id="ref600"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grover</surname> <given-names>C. E.</given-names></name> <name><surname>Pan</surname> <given-names>M.</given-names></name> <name><surname>Yuan</surname> <given-names>D.</given-names></name> <name><surname>Arick</surname> <given-names>M. A.</given-names></name> <name><surname>Hu</surname> <given-names>G.</given-names></name> <name><surname>Brase</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>The <italic>Gossypium longicalyx</italic> genome as a resource for cotton breeding and evolution</article-title>. <source>G3-Genes, Genom. Genet.</source> <volume>10</volume>, <fpage>1457</fpage>&#x2013;<lpage>1467</lpage>. doi: <pub-id pub-id-type="doi">10.1534/g3.120.401050</pub-id>, PMID: <pub-id pub-id-type="pmid">32284579</pub-id></citation></ref>
<ref id="ref700"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grover</surname> <given-names>C. E.</given-names></name> <name><surname>Yuan</surname> <given-names>D.</given-names></name> <name><surname>Arick</surname> <given-names>M. A.</given-names></name> <name><surname>Miller</surname> <given-names>E. R.</given-names></name> <name><surname>Hu</surname> <given-names>G.</given-names></name> <name><surname>Peterson</surname> <given-names>D. G.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>The <italic>Gossypium anomalum</italic> genome as a resource for cotton improvement and evolutionary analysis of hybrid incompatibility</article-title>. <source>G3-Genes, Genom. Genet.</source> <volume>11</volume>. doi: <pub-id pub-id-type="doi">10.1093/g3journal/jkab319</pub-id>, PMID: <pub-id pub-id-type="pmid">32284579</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>Y.</given-names></name> <name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Fang</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Ma</surname> <given-names>W.</given-names></name> <name><surname>Niu</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title><italic>Gossypium barbadense</italic> and <italic>Gossypium hirsutum</italic> genomes provide insights into the origin and evolution of allotetraploid cotton</article-title>. <source>Nat. Genet.</source> <volume>51</volume>, <fpage>739</fpage>&#x2013;<lpage>748</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-019-0371-5</pub-id>, PMID: <pub-id pub-id-type="pmid">30886425</pub-id></citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>G.</given-names></name> <name><surname>Wu</surname> <given-names>Z.</given-names></name> <name><surname>Percy</surname> <given-names>R. G.</given-names></name> <name><surname>Bai</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Frelichowski</surname> <given-names>J. E.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Genome sequence of <italic>Gossypium herbaceum</italic> and genome updates of <italic>Gossypium arboreum</italic> and <italic>Gossypium hirsutum</italic> provide insights into cotton A-genome evolution</article-title>. <source>Nat. Genet.</source> <volume>52</volume>, <fpage>516</fpage>&#x2013;<lpage>524</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-020-0607-4</pub-id>, PMID: <pub-id pub-id-type="pmid">32284579</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>D.</given-names></name> <name><surname>Pertea</surname> <given-names>G.</given-names></name> <name><surname>Trapnell</surname> <given-names>C.</given-names></name> <name><surname>Pimentel</surname> <given-names>H.</given-names></name> <name><surname>Kelley</surname> <given-names>R.</given-names></name> <name><surname>Salzberg</surname> <given-names>S. L.</given-names></name></person-group> (<year>2013</year>). <article-title>TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions</article-title>. <source>Genome Biol.</source> <volume>14</volume>:<fpage>R36</fpage>. doi: <pub-id pub-id-type="doi">10.1186/gb-2013-14-4-r36</pub-id>, PMID: <pub-id pub-id-type="pmid">23618408</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koonin</surname> <given-names>E. V.</given-names></name> <name><surname>Fedorova</surname> <given-names>N. D.</given-names></name> <name><surname>Jackson</surname> <given-names>J. D.</given-names></name> <name><surname>Jacobs</surname> <given-names>A. R.</given-names></name> <name><surname>Krylov</surname> <given-names>D. M.</given-names></name> <name><surname>Makarova</surname> <given-names>K. S.</given-names></name> <etal/></person-group>. (<year>2004</year>). <article-title>A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes</article-title>. <source>Genome Biol.</source> <volume>5</volume>:<fpage>R7</fpage>. doi: <pub-id pub-id-type="doi">10.1186/gb-2004-5-2-r7</pub-id>, PMID: <pub-id pub-id-type="pmid">14759257</pub-id></citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Langfelder</surname> <given-names>P.</given-names></name> <name><surname>Horvath</surname> <given-names>S.</given-names></name></person-group> (<year>2008</year>). <article-title>WGCNA: an R package for weighted correlation network analysis</article-title>. <source>BMC Bioinf.</source> <volume>9</volume>:<fpage>559</fpage>. doi: <pub-id pub-id-type="doi">10.1186/1471-2105-9-559</pub-id>, PMID: <pub-id pub-id-type="pmid">19114008</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>O&#x2019;Brien</surname> <given-names>K. P.</given-names></name> <name><surname>Remm</surname> <given-names>M.</given-names></name> <name><surname>Sonnhammer</surname> <given-names>E. L.</given-names></name></person-group> (<year>2005</year>). <article-title>Inparanoid: a comprehensive database of eukaryotic orthologs</article-title>. <source>Nucleic Acids Res.</source> <volume>33</volume>, <fpage>D476</fpage>&#x2013;<lpage>D480</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gki107</pub-id>, PMID: <pub-id pub-id-type="pmid">15608241</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peng</surname> <given-names>Z.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Sun</surname> <given-names>G.</given-names></name> <name><surname>Dai</surname> <given-names>P.</given-names></name> <name><surname>Geng</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>CottonGVD: a comprehensive genomic variation database for cultivated cottons</article-title>. <source>Front. Plant Sci.</source> doi: <pub-id pub-id-type="doi">10.3389/fpls.2021.803736</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Priyam</surname> <given-names>A.</given-names></name> <name><surname>Woodcroft</surname> <given-names>B. J.</given-names></name> <name><surname>Rai</surname> <given-names>V.</given-names></name> <name><surname>Moghul</surname> <given-names>I.</given-names></name> <name><surname>Munagala</surname> <given-names>A.</given-names></name> <name><surname>Ter</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Sequenceserver: a modern graphical user interface for custom BLAST databases</article-title>. <source>Mol. Biol. Evol.</source> <volume>36</volume>, <fpage>2922</fpage>&#x2013;<lpage>2924</lpage>. doi: <pub-id pub-id-type="doi">10.1093/molbev/msz185</pub-id>, PMID: <pub-id pub-id-type="pmid">31411700</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>N.</given-names></name> <name><surname>Zhu</surname> <given-names>D.</given-names></name> <name><surname>Wang</surname> <given-names>P.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Wen</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title><italic>Gossypium tomentosum</italic> genome and interspecific ultra-dense genetic maps reveal genomic structures, recombination landscape and flowering depression in cotton</article-title>. <source>Genomics</source> <volume>113</volume>, <fpage>1999</fpage>&#x2013;<lpage>2009</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ygeno.2021.04.036</pub-id>, PMID: <pub-id pub-id-type="pmid">33915244</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stein</surname> <given-names>L. D.</given-names></name> <name><surname>Mungall</surname> <given-names>C.</given-names></name> <name><surname>Shu</surname> <given-names>S. Q.</given-names></name> <name><surname>Caudy</surname> <given-names>M.</given-names></name> <name><surname>Mangone</surname> <given-names>M.</given-names></name> <name><surname>Day</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2002</year>). <article-title>The generic genome browser: a building block for a model organism system database</article-title>. <source>Genome Res.</source> <volume>12</volume>, <fpage>1599</fpage>&#x2013;<lpage>1610</lpage>. doi: <pub-id pub-id-type="doi">10.1101/gr.403602</pub-id>, PMID: <pub-id pub-id-type="pmid">12368253</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tatusov</surname> <given-names>R. L.</given-names></name> <name><surname>Galperin</surname> <given-names>M. Y.</given-names></name> <name><surname>Natale</surname> <given-names>D. A.</given-names></name> <name><surname>Koonin</surname> <given-names>E. V.</given-names></name></person-group> (<year>2000</year>). <article-title>The COG database: a tool for genome-scale analysis of protein functions and evolution</article-title>. <source>Nucleic Acids Res.</source> <volume>28</volume>, <fpage>33</fpage>&#x2013;<lpage>36</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/28.1.33</pub-id>, PMID: <pub-id pub-id-type="pmid">10592175</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Udall</surname> <given-names>J. A.</given-names></name> <name><surname>Long</surname> <given-names>E.</given-names></name> <name><surname>Hanson</surname> <given-names>C.</given-names></name> <name><surname>Yuan</surname> <given-names>D.</given-names></name> <name><surname>Ramaraj</surname> <given-names>T.</given-names></name> <name><surname>Conover</surname> <given-names>J. L.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title><italic>De novo</italic> genome sequence assemblies of <italic>Gossypium raimondii</italic> and <italic>Gossypium turneri</italic></article-title>. <source>G3</source> <volume>9</volume>, <fpage>3079</fpage>&#x2013;<lpage>3085</lpage>. doi: <pub-id pub-id-type="doi">10.1534/g3.119.400392</pub-id>, PMID: <pub-id pub-id-type="pmid">31462444</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Untergasser</surname> <given-names>A.</given-names></name> <name><surname>Cutcutache</surname> <given-names>I.</given-names></name> <name><surname>Koressaar</surname> <given-names>T.</given-names></name> <name><surname>Ye</surname> <given-names>J.</given-names></name> <name><surname>Faircloth</surname> <given-names>B. C.</given-names></name> <name><surname>Remm</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Primer3-new capabilities and interfaces</article-title>. <source>Nucleic Acids Res.</source> <volume>40</volume>:<fpage>e115</fpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gks596</pub-id>, PMID: <pub-id pub-id-type="pmid">22730293</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>D.</given-names></name> <name><surname>Fan</surname> <given-names>W.</given-names></name> <name><surname>Guo</surname> <given-names>X.</given-names></name> <name><surname>Wu</surname> <given-names>K.</given-names></name> <name><surname>Zhou</surname> <given-names>S.</given-names></name> <name><surname>Chen</surname> <given-names>Z.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>MaGenDB: a functional genomics hub for Malvaceae plants</article-title>. <source>Nucleic Acids Res.</source> <volume>48</volume>, <fpage>D1076</fpage>&#x2013;<lpage>D1084</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkz953</pub-id>, PMID: <pub-id pub-id-type="pmid">31665439</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Tu</surname> <given-names>L.</given-names></name> <name><surname>Yuan</surname> <given-names>D.</given-names></name> <name><surname>Zhu</surname> <given-names>D.</given-names></name> <name><surname>Shen</surname> <given-names>C.</given-names></name> <name><surname>Li</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Reference genome sequences of two cultivated allotetraploid cottons, <italic>Gossypium hirsutum</italic> and <italic>Gossypium barbadense</italic></article-title>. <source>Nat. Genet.</source> <volume>51</volume>, <fpage>224</fpage>&#x2013;<lpage>229</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41588-018-0282-x</pub-id>, PMID: <pub-id pub-id-type="pmid">30510239</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xie</surname> <given-names>X.</given-names></name> <name><surname>Ma</surname> <given-names>X.</given-names></name> <name><surname>Zhu</surname> <given-names>Q.</given-names></name> <name><surname>Zeng</surname> <given-names>D.</given-names></name> <name><surname>Li</surname> <given-names>G.</given-names></name> <name><surname>Liu</surname> <given-names>Y. G.</given-names></name></person-group> (<year>2017</year>). <article-title>CRISPR-GE: a convenient software toolkit for CRISPR-based genome editing</article-title>. <source>Mol. Plant</source> <volume>10</volume>, <fpage>1246</fpage>&#x2013;<lpage>1249</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.molp.2017.06.004</pub-id>, PMID: <pub-id pub-id-type="pmid">28624544</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Ge</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name> <name><surname>Jin</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>L.</given-names></name> <name><surname>Hu</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Cotton D genome assemblies built with long-read data unveil mechanisms of centromere evolution and stress tolerance divergence</article-title>. <source>BMC Biol.</source> <volume>19</volume>:<fpage>115</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12915-021-01041-0</pub-id>, PMID: <pub-id pub-id-type="pmid">34082735</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Ge</surname> <given-names>X.</given-names></name> <name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Qin</surname> <given-names>W.</given-names></name> <name><surname>Sun</surname> <given-names>G.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Extensive intraspecific gene order and gene structural variations in upland cotton cultivars</article-title>. <source>Nat. Commun.</source> <volume>10</volume>:<fpage>2989</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41467-019-10820-x</pub-id>, PMID: <pub-id pub-id-type="pmid">31278252</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>You</surname> <given-names>Q.</given-names></name> <name><surname>Xu</surname> <given-names>W.</given-names></name> <name><surname>Zhang</surname> <given-names>K.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Yi</surname> <given-names>X.</given-names></name> <name><surname>Yao</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>ccNET: database of co-expression networks with functional modules for diploid and polyploid <italic>Gossypium</italic></article-title>. <source>Nucleic Acids Res.</source> <volume>45</volume>, <fpage>D1090</fpage>&#x2013;<lpage>D1099</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkw910</pub-id>, PMID: <pub-id pub-id-type="pmid">28053168</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>J.</given-names></name> <name><surname>Jung</surname> <given-names>S.</given-names></name> <name><surname>Cheng</surname> <given-names>C. H.</given-names></name> <name><surname>Ficklin</surname> <given-names>S. P.</given-names></name> <name><surname>Lee</surname> <given-names>T.</given-names></name> <name><surname>Zheng</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>CottonGen: a genomics, genetics and breeding database for cotton research</article-title>. <source>Nucleic Acids Res.</source> <volume>42</volume>, <fpage>D1229</fpage>&#x2013;<lpage>D1236</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkt1064</pub-id>, PMID: <pub-id pub-id-type="pmid">24203703</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>T.</given-names></name> <name><surname>Liang</surname> <given-names>C.</given-names></name> <name><surname>Meng</surname> <given-names>Z.</given-names></name> <name><surname>Sun</surname> <given-names>G.</given-names></name> <name><surname>Meng</surname> <given-names>Z.</given-names></name> <name><surname>Guo</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>CottonFGD: an integrated functional genomics database for cotton</article-title>. <source>BMC Plant Biol.</source> <volume>17</volume>:<fpage>101</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12870-017-1039-x</pub-id>, PMID: <pub-id pub-id-type="pmid">28595571</pub-id></citation></ref></ref-list>
<fn-group>
<fn id="fn0004"><p><sup>1</sup><ext-link xlink:href="http://cotton.zju.edu.cn/" ext-link-type="uri">http://cotton.zju.edu.cn/</ext-link></p></fn>
<fn id="fn0005"><p><sup>2</sup><ext-link xlink:href="http://cole-trapnelllab.github.io/cufflinks/" ext-link-type="uri">http://cole-trapnelllab.github.io/cufflinks/</ext-link></p></fn>
</fn-group>
</back>
</article>