<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2023.1237426</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>MegaLTR: a web server and standalone pipeline for detecting and annotating LTR-retrotransposons in plant genomes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Mokhtar</surname>
<given-names>Morad M.</given-names>
</name>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/444798"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>El Allali</surname>
<given-names>Achraf</given-names>
</name>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1902024"/>
</contrib>
</contrib-group>
<aff id="aff1">
<institution>African Genome Center, Mohammed VI Polytechnic University</institution>, <addr-line>Benguerir</addr-line>, <country>Morocco</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Manohar Chakrabarti, The University of Texas Rio Grande Valley, United States</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Niharika Sharma, NSW Government, Australia; Xiujun Zhang, Chinese Academy of Sciences (CAS), China</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Achraf El Allali, <email xlink:href="mailto:achraf.elallali@um6p.ma">achraf.elallali@um6p.ma</email>; Morad M. Mokhtar, <email xlink:href="mailto:morad.mokhtar@ageri.sci.eg">morad.mokhtar@ageri.sci.eg</email>
</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>09</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1237426</elocation-id>
<history>
<date date-type="received">
<day>09</day>
<month>06</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>21</day>
<month>08</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Mokhtar and El Allali</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Mokhtar and El Allali</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>LTR-retrotransposons (LTR-RTs) are a class of RNA-replicating transposable elements (TEs) that can alter genome structure and function by moving positions, repositioning genes, shifting exons, and causing chromosomal rearrangements. LTR-RTs are widespread in plant genomes and often constitute a significant portion of the genome. Their movement and activity in eukaryotic genomes can provide insight into genome evolution and gene function, especially when LTR-RTs are located near or within genes. Building redundant and non-redundant LTR-RT libraries and their annotations for species lacking this resource requires extensive bioinformatics pipelines and expensive computing power to analyze large amounts of genomic data. This increases the need for online services that provide computational resources with minimal overhead and maximum efficiency. Here, we present MegaLTR as a web server and standalone pipeline that detects intact LTR-RTs at the whole-genome level and integrates multiple tools for structure-based, homology-based, and <italic>de novo</italic> identification, classification, annotation, insertion time determination, and LTR-RT gene chimera analysis. MegaLTR also provides statistical analysis and visualization with multiple tools and can be used to accelerate plant species discovery and assist breeding programs in their efforts to improve genomic resources. We hope that online services such as MegaLTR, which can analyze large amounts of genomic data, will become increasingly important for the automated detection and annotation of LTR-RT elements.</p>
</abstract>
<kwd-group>
<kwd>LTR-retrotransposons</kwd>
<kwd>plant genomes</kwd>
<kwd>webserver</kwd>
<kwd>insertion age</kwd>
<kwd>LTR-RT gene chimeras</kwd>
<kwd>non-redundant LTR-RTs library</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="3"/>
<equation-count count="1"/>
<ref-count count="60"/>
<page-count count="11"/>
<word-count count="5859"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-in-acceptance</meta-name>
<meta-value>Functional and Applied Plant Genomics</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<label>1</label>
<title>Introduction</title>
<p>Long Terminal Repeat (LTR) Retrotransposons (LTR-RTs) are a class of transposable elements (TEs) belonging to the repetitive DNA sequences that have played a crucial role in shaping the structure and function of eukaryotic genomes (<xref ref-type="bibr" rid="B50">Vitte and Panaud, 2005</xref>). LTR-RTs are characterized by their ability to move within genomes <italic>via</italic> a &#x201c;copy-and-paste&#x201d; mechanism that involves transcription into RNA, reverse transcription into DNA, and subsequent insertion into new genomic locations (<xref ref-type="bibr" rid="B25">Lopes et&#xa0;al., 2013</xref>). These elements have been found in various organisms, including plants, where they contribute significantly to genome size and complexity. LTR-RTs are of great interest in the field of genomics because of their importance in genome evolution, gene regulation, and understanding plant biology (<xref ref-type="bibr" rid="B2">Bennetzen and Wang, 2014</xref>). Plant genomes are often characterized by a high proportion of TEs, with LTR-RTs being one of the major contributors to these elements. TEs can make up a substantial portion of the plant genome, as in maize, where TEs account for 85% of the genome, of which LTR-RTs account for 75% (<xref ref-type="bibr" rid="B40">Schnable et&#xa0;al., 2009</xref>). This wide distribution highlights their importance in shaping genome architecture and dynamics (<xref ref-type="bibr" rid="B40">Schnable et&#xa0;al., 2009</xref>). LTR-RTs are known to play a role in creating genetic diversity, promoting chromosomal rearrangements, and influencing gene expression through their insertion sites and regulatory sequences (<xref ref-type="bibr" rid="B2">Bennetzen and Wang, 2014</xref>). Therefore, the study of LTR-RTs is crucial to unravel the complexity of plant genomes and understand their functional implications (<xref ref-type="bibr" rid="B56">Xia et&#xa0;al., 2020</xref>).
The study of LTR-RTs provides insights into various aspects of plant genome biology. For example, studying their structural diversity, insertion patterns, and distribution in plant taxa can provide insight into evolutionary history and interspecies relationships (<xref ref-type="bibr" rid="B14">Grandbastien et&#xa0;al., 2005</xref>). In addition, understanding the regulation of LTR-RTs activity and its interplay with host factors can provide insight into the mechanisms of genome stability (<xref ref-type="bibr" rid="B49">Vitte et&#xa0;al., 2014</xref>). Because LTR-RTs can influence nearby gene expression through epigenetic modifications and transcriptional interference, studying these elements contributes to our understanding of gene regulatory networks in plants (<xref ref-type="bibr" rid="B60">Zhao et&#xa0;al., 2016</xref>; <xref ref-type="bibr" rid="B29">Mokhtar et&#xa0;al., 2021</xref>).</p>
<p>The movement of LTR-RTs within genomes contributes to genome evolution by generating genetic variation and driving genome expansion (<xref ref-type="bibr" rid="B49">Vitte et&#xa0;al., 2014</xref>). These elements can facilitate chromosomal rearrangements through unequal homologous recombination between LTRs or ectopic recombination between non-homologous LTRs. Such events can lead to gene duplications, deletions, and chromosomal rearrangements that contribute to plant genome diversification (<xref ref-type="bibr" rid="B26">Ma et&#xa0;al., 2004</xref>). LTR-RTs may also serve as targets for silencing by small RNAs, which could affect their transposition rates and influence the evolutionary development of plant species (<xref ref-type="bibr" rid="B12">Franco-Zorrilla et&#xa0;al., 2007</xref>). While some LTR-RTs are likely to be transcriptionally inactive, accumulating evidence suggests that many elements have been co-opted for useful functions in plant genomes. For example, some LTR-RTs have been domesticated to provide regulatory sequences such as promoters and enhancers for nearby genes (<xref ref-type="bibr" rid="B17">Jung et&#xa0;al., 2019</xref>). In addition, they have been associated with stress responses, chromatin remodeling, and even symbiotic interactions (<xref ref-type="bibr" rid="B16">Ito et&#xa0;al., 2016</xref>; <xref ref-type="bibr" rid="B38">Pereira, 2016</xref>). Understanding the functional significance of LTR-RTs in plant genomes provides insights into the intricate interplay between repetitive DNA elements and the evolution of novel traits.</p>
<p>LTR-RTs consist of several structural elements that play different roles in the movement and regulation of the element within the genome. Common elements include a target site duplication (TSD), two semi-identical LTRs, a polypurine tract (PPT), a primer binding site (PBS), and the <italic>GAG</italic> and <italic>Pol</italic> genes (<xref ref-type="bibr" rid="B21">Kumar, 1998</xref>). LTRs are long stretches of DNA located at both ends of the element and are typically several hundred base pairs long. LTRs contain regulatory elements (promoters, enhancers) and are thought to be important for the integration and stability of the element in the genome (<xref ref-type="bibr" rid="B21">Kumar, 1998</xref>). The <italic>GAG</italic> and <italic>Pol</italic> genes encode proteins involved in the movement and replication of the element (<xref ref-type="bibr" rid="B9">Eickbush and Jamburuthugoda, 2008</xref>). The <italic>GAG</italic> gene encodes a structural protein involved in the assembly of the element, while the <italic>Pol</italic> gene consists of several functional domains, including protease (PROT), reverse transcriptase (RT), RNase H (RH), and integrase (INT) (<xref ref-type="bibr" rid="B47">Ustyantsev et&#xa0;al., 2015</xref>). The RT domain is responsible for synthesizing a DNA copy of the RNA template of the element, while the INT domain is responsible for integrating the element into the genome (<xref ref-type="bibr" rid="B60">Zhao et&#xa0;al., 2016</xref>). The PROT domain is responsible for cleavage of the <italic>Pol</italic> protein into its functional domains, and the RH domain is involved in degradation of the RNA template during reverse transcription; other domains are involved in various aspects of movement and regulation of the element (<xref ref-type="bibr" rid="B13">Gao et&#xa0;al., 2003</xref>; <xref ref-type="bibr" rid="B47">Ustyantsev et&#xa0;al., 2015</xref>).
LTR-RTs are divided into two main categories based on their mode of movement: autonomous and non-autonomous. Autonomous LTR-RTs are capable of moving by themselves, whereas non-autonomous LTR-RTs require the assistance of an autonomous element to move (<xref ref-type="bibr" rid="B54">Wicker et&#xa0;al., 2007</xref>). In addition, LTR-RTs are classified into the superfamilies <italic>Copia</italic> and <italic>Gypsy</italic> based on internal domain arrangements (<xref ref-type="bibr" rid="B54">Wicker et&#xa0;al., 2007</xref>). Other LTR-RT groups include <italic>LARD</italic> (LArge Retrotransposon Derivatives), <italic>BARE-2</italic> (Barley RetroElement-2), <italic>TR-GAG</italic> (Terminal Repeat Retrotransposons with <italic>GAG</italic> domain), and <italic>TRIM</italic> (Terminal Repeats In Miniature) (<xref ref-type="bibr" rid="B55">Witte et&#xa0;al., 2001</xref>; <xref ref-type="bibr" rid="B19">Kalendar et&#xa0;al., 2004</xref>; <xref ref-type="bibr" rid="B45">Tanskanen et&#xa0;al., 2007</xref>; <xref ref-type="bibr" rid="B6">Chaparro et&#xa0;al., 2015</xref>).</p>
<p>Despite their abundance and importance, LTR-RTs remain difficult to identify and annotate in most non-model organisms (<xref ref-type="bibr" rid="B37">Ou et&#xa0;al., 2019</xref>). Their complex and variable structures, and the complex ways in which they interact with other DNA sequences, make them hard to detect and track in the genome (<xref ref-type="bibr" rid="B37">Ou et&#xa0;al., 2019</xref>). However, research on LTR-RTs has increased in recent years, thanks to advances in sequencing technology and bioinformatics that have improved our understanding of the role of LTR-RTs in genomes. Several tools, pipelines, and databases exist to identify LTR-RTs and support current and future functional genomics research. These tools include Tandem Repeats Finder [TRF, (<xref ref-type="bibr" rid="B3">Benson, 1999</xref>)], LTR_STRUC (<xref ref-type="bibr" rid="B27">McCarthy and McDonald, 2003</xref>), LTR_FINDER (<xref ref-type="bibr" rid="B58">Xu and Wang, 2007</xref>), LTRdigest (<xref ref-type="bibr" rid="B43">Steinbiss et&#xa0;al., 2009</xref>), LTRharvest (<xref ref-type="bibr" rid="B10">Ellinghaus et&#xa0;al., 2008</xref>), RepeatMasker (<xref ref-type="bibr" rid="B42">Smit et&#xa0;al., 2015</xref>), MGEScan3 (<xref ref-type="bibr" rid="B22">Lee et&#xa0;al., 2016</xref>), LTR_retriever (<xref ref-type="bibr" rid="B35">Ou and Jiang, 2017</xref>), LtrDetector (<xref ref-type="bibr" rid="B48">Valencia and Girgis, 2019</xref>), DARTS (<xref ref-type="bibr" rid="B4">Biryukov and Ustyantsev, 2021</xref>), and TEsorter (<xref ref-type="bibr" rid="B59">Zhang et&#xa0;al., 2022</xref>). Once LTR-RTs are identified, they can be annotated using various databases and resources.
Some examples of databases and resources developed for this purpose are TREP (<xref ref-type="bibr" rid="B53">Wicker et&#xa0;al., 2002</xref>), RepBase (<xref ref-type="bibr" rid="B18">Jurka et&#xa0;al., 2005</xref>), REXdb (<xref ref-type="bibr" rid="B32">Neumann et&#xa0;al., 2019</xref>), PlantRep (<xref ref-type="bibr" rid="B1">Amselem et&#xa0;al., 2019</xref>), and PlantLTRdb (<xref ref-type="bibr" rid="B30">Mokhtar et&#xa0;al., 2023b</xref>). These tools and databases have been used to create automated pipelines for LTR-RT analysis, including REPCLASS (<xref ref-type="bibr" rid="B11">Feschotte et&#xa0;al., 2009</xref>), EDTA (<xref ref-type="bibr" rid="B37">Ou et&#xa0;al., 2019</xref>), and Inpactor2 (<xref ref-type="bibr" rid="B33">Orozco-Arias et&#xa0;al., 2022</xref>).</p>
<p>EDTA is a pipeline that integrates structure-based, homology-based, and <italic>de novo</italic> identification methods to create TE libraries. EDTA combines LTRharvest, LTR_FINDER, and LTR_retriever to analyze LTR-RTs. In addition, Generic Repeat Finder (<xref ref-type="bibr" rid="B41">Shi and Liang, 2019</xref>), TIR-Learner (<xref ref-type="bibr" rid="B44">Su et&#xa0;al., 2019</xref>), HelitronScanner (<xref ref-type="bibr" rid="B57">Xiong et&#xa0;al., 2014</xref>), and RepeatModeler (<xref ref-type="bibr" rid="B42">Smit et&#xa0;al., 2015</xref>) are used for other TEs. For LTR-RTs, EDTA performs identification, superfamily-level classification (<italic>Copia</italic> and <italic>Gypsy</italic>), and insertion age estimation with highly efficient tools. Another available pipeline is Inpactor2, which integrates the identification and classification of LTR-RTs at the lineage level and runs in a reasonable time. While EDTA and Inpactor2 are comprehensive pipelines for creating LTR-RT libraries, they lack some features, such as putative autonomous and non-autonomous classification, identification of LTR-RT gene chimeras, detection of LTR-RTs near genes, statistical analysis and visualization of LTR-RTs, and adjustable parameters for each analysis step. Neither is available as a web server, and both require some level of technical computer skills. Like any machine learning-based algorithm, Inpactor2 depends on the quality of its training dataset (<xref ref-type="bibr" rid="B33">Orozco-Arias et&#xa0;al., 2022</xref>), a fact that users should consider when using it.</p>
<p>Here we introduce MegaLTR as a web server and standalone pipeline that detects intact LTR-RTs at the whole-genome level. MegaLTR integrates multiple tools for structure-based, homology-based, and <italic>de novo</italic> identification, classification, and annotation. MegaLTR classifies elements as putative autonomous or non-autonomous and at the superfamily and lineage levels. It also identifies LTR-RT gene chimeras, detects LTR-RTs near genes, and provides statistical analysis and visualization of LTR-RTs. MegaLTR is easy to use and allows customization of parameters for each analysis step in both its web server and standalone versions.</p>
</sec>
<sec id="s2" sec-type="materials|methods">
<label>2</label>
<title>Materials and methods</title>
<sec id="s2_1">
<label>2.1</label>
<title>Genomic data</title>
<p>The complete genome sequences and annotations of 26 plant species were downloaded from the NCBI database (<xref ref-type="bibr" rid="B51">Wheeler et&#xa0;al., 2007</xref>). These genomes were selected based on several criteria: availability of annotation, LTR assembly index (LAI) score (<xref ref-type="bibr" rid="B34">Ou et&#xa0;al., 2018</xref>), genome size, number of pseudomolecules/scaffolds, and coverage of both model and non-model plants. The LAI score has been widely used in recent years to assess the quality of genome assemblies, as a higher LAI score is associated with a higher quality assembly (<xref ref-type="bibr" rid="B34">Ou et&#xa0;al., 2018</xref>). The LAI score of each species was taken from the PlantLAI database (<xref ref-type="bibr" rid="B28">Mokhtar et&#xa0;al., 2023a</xref>). The plant name, NCBI taxonomy ID, GenBank accession number, assembly level, LAI score, genome size, evolutionary rate, and number of pseudomolecules/scaffolds of the studied species are listed in <xref ref-type="supplementary-material" rid="SM1">
<bold>Table S1</bold>
</xref>.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>MegaLTR design and workflow</title>
<p>MegaLTR&#x2019;s workflow includes multiple programs interconnected by data adapters to ensure that data is routed from the server to a high-performance computer (HPC) and back to the server and processed as an end-to-end pipeline. The implementation of MegaLTR is summarized in <xref ref-type="supplementary-material" rid="SM1">
<bold>Data Sheet 1</bold>
</xref>. The MegaLTR workflow is shown schematically in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>. MegaLTR is designed to accept FASTA sequences and their GFF annotation as input. It is capable of processing whole genome sequences in any form, including chromosomes, pseudomolecules, scaffolds, contigs, and fragments, which is useful in draft genome analysis. Analysis with MegaLTR consists of eight main steps: 1) LTR-RT identification with LTR_FINDER (<xref ref-type="bibr" rid="B58">Xu and Wang, 2007</xref>; <xref ref-type="bibr" rid="B36">Ou and Jiang, 2019</xref>) and LTRharvest (<xref ref-type="bibr" rid="B10">Ellinghaus et&#xa0;al., 2008</xref>); 2) filtering of LTR-RTs with LTR_retriever (<xref ref-type="bibr" rid="B35">Ou and Jiang, 2017</xref>); 3) annotation of internal domains and clades with TEsorter (<xref ref-type="bibr" rid="B59">Zhang et&#xa0;al., 2022</xref>); 4) PBS and PPT annotation with LTRdigest (<xref ref-type="bibr" rid="B43">Steinbiss et&#xa0;al., 2009</xref>) and PltRNAdb (<xref ref-type="bibr" rid="B31">Mokhtar and El Allali, 2022</xref>); 5) insertion age estimation with REANNOTATE (<xref ref-type="bibr" rid="B39">Pereira, 2008</xref>) and ClustalW (<xref ref-type="bibr" rid="B46">Thompson et&#xa0;al., 2003</xref>); 6) LTR-RT classification with Python scripts and creation of a non-redundant LTR-RT library using USEARCH v11.0 (<xref ref-type="bibr" rid="B8">Edgar, 2010</xref>); 7) detection of LTR-RTs within and near genes with Perl scripts; 8) statistical analysis and visualization with Python and R scripts and RIdeogram (<xref ref-type="bibr" rid="B15">Hao et&#xa0;al., 2020</xref>). The user can set the parameters for each analysis step.</p>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>Overview of the MegaLTR workflow and procedure.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1237426-g001.tif"/>
</fig>
<p>For identification of LTR-RT candidates, LTR_FINDER and LTRharvest are used because they are very effective in identifying LTR-RTs and outperform other programs in sensitivity (<xref ref-type="bibr" rid="B35">Ou and Jiang, 2017</xref>). However, these programs tend to produce a number of false-positive predictions (<xref ref-type="bibr" rid="B23">Lerat, 2010</xref>). To remove these false positives, the results of both programs are combined into one file and used as input to LTR_retriever. The LTR_retriever tool uses a combination of several programs, including HMMER (<xref ref-type="bibr" rid="B52">Wheeler and Eddy, 2013</xref>), CD-HIT (<xref ref-type="bibr" rid="B24">Li and Godzik, 2006</xref>), BLAST+ (<xref ref-type="bibr" rid="B5">Camacho et&#xa0;al., 2009</xref>), RepeatMasker (<xref ref-type="bibr" rid="B42">Smit et&#xa0;al., 2015</xref>), and TRF (<xref ref-type="bibr" rid="B3">Benson, 1999</xref>) to identify and filter out false LTR-RT candidates. MegaLTR considers only intact LTR-RT candidates that pass these filtering steps in downstream analysis. An intact LTR-RT candidate is defined as an element with two identical or semi-identical LTRs and a target site duplication at both ends. The LTRs contain conserved sequences such as the terminal TG-CA motif, which may play a role in regulating retrotransposon expression and/or retrotransposition. To accurately identify features within a potential LTR-RT, MegaLTR uses LTRdigest to detect the PPT and PBS, and TEsorter to analyze internal protein domains. The PBS is generally located near the 5&#x2019;LTR, while the PPT is relatively close to the 3&#x2019;LTR. To identify the PBS, a tRNA sequence library is used to search for regions in the LTR-RT candidate that are complementary to the tRNA. The tRNA sequences for this analysis are from the plant tRNA database [PltRNAdb, (<xref ref-type="bibr" rid="B31">Mokhtar and El Allali, 2022</xref>)].
This procedure allows reliable identification of the PBS and PPT within an LTR-RT candidate. To annotate protein domains, TEsorter searches one of the databases REXdb (<xref ref-type="bibr" rid="B32">Neumann et&#xa0;al., 2019</xref>) or GyDB (<ext-link ext-link-type="uri" xlink:href="http://gydb.org">http://gydb.org</ext-link>) using HMMScan (<xref ref-type="bibr" rid="B7">Eddy, 1998</xref>) to identify putative domains such as capsid protein, protease, reverse transcriptase, RNase H, and integrase.</p>
<p>The next step is to classify LTR-RTs into clades. Previous studies have proposed different clade-level classifications for LTR-RTs. <xref ref-type="bibr" rid="B32">Neumann et&#xa0;al. (2019)</xref> divided <italic>Copia</italic> into the clades <italic>Ale, Alesia, SIRE, Bianca, Lyco, Ikeros, Gymco I-IV, Bryco, Osser, TAR, Angela, Ivana,</italic> and <italic>Tork</italic>. They also divided <italic>Gypsy</italic> into the clades <italic>Chlamyvir, CRM, Tcn1, Reina, Galadriel, Tekay, Tat-I-III, Athila, Ogre, Phygy, Selgy</italic>, and <italic>Retand</italic>. This clade-level classification is based on protein domain databases. TEsorter uses such databases, namely REXdb and GyDB, to classify LTR-RTs into superfamilies and then into clades. To estimate the insertion age of LTR-RT elements, MegaLTR uses REANNOTATE and ClustalW in combination, based on a comparative analysis of the 5&#x2019; and 3&#x2019; LTRs of each element. The Kimura two-parameter model (<xref ref-type="bibr" rid="B20">Kimura, 1980</xref>) is used to calculate the number of substitutions per site (K) between the two LTRs. The age is then estimated as T = K/(2r) (<xref ref-type="bibr" rid="B20">Kimura, 1980</xref>), where r is the evolutionary rate. In MegaLTR, the evolutionary rate can be set by the user. It is important to note that evolutionary rates can vary significantly between species.</p>
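<p>The age calculation described above can be illustrated with a minimal Python sketch. It is not the REANNOTATE implementation: the function names k2p_distance and insertion_age are hypothetical, the two LTRs are assumed to be already aligned to equal length (for example by ClustalW), and gap or ambiguous positions are simply skipped.</p>
<preformat><![CDATA[
```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(ltr5, ltr3):
    """Kimura two-parameter distance K between two aligned LTR sequences.

    Assumption: both strings come from a pairwise alignment, so they have
    the same length; positions with gaps or ambiguous bases are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); math.log raises
    # ValueError when the LTRs are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_age(k, rate):
    """Insertion age T = K / (2r), with r in substitutions/site/year."""
    return k / (2 * rate)
```
]]></preformat>
<p>As a worked example under an assumed rate of r = 1.3e-8 substitutions per site per year (an illustrative value, not a MegaLTR default), a distance of K = 0.026 gives T = 0.026 / (2 x 1.3e-8) = 1,000,000 years.</p>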
<p>LTR-RTs can be divided into two main categories based on their structure: autonomous and non-autonomous. According to <xref ref-type="bibr" rid="B54">Wicker et&#xa0;al. (2007)</xref>, autonomous <italic>Gypsy</italic> and <italic>Copia</italic> elements are defined by the order of the domains within the element. The structure of <italic>Gypsy</italic> is TSD-LTR-PBS-GAG-PROT-RT-RH-INT-PPT-LTR-TSD, while that of <italic>Copia</italic> is TSD-LTR-PBS-GAG-PROT-INT-RT-RH-PPT-LTR-TSD. <italic>Copia</italic> and <italic>Gypsy</italic> elements that do not match the complete structure above are classified as non-autonomous <italic>Copia</italic> and non-autonomous <italic>Gypsy</italic>. Non-autonomous LTR-RTs can be further subdivided based on their specific structure and the presence or absence of certain domains. Examples of non-autonomous elements include <italic>LARD, TRIM, TR-GAG</italic>, and <italic>BARE-2</italic> (<xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2</bold>
</xref>). The specific criteria for classifying LTR-RT elements into these categories have been described in several research studies, including <xref ref-type="bibr" rid="B19">Kalendar et&#xa0;al. (2004)</xref>; <xref ref-type="bibr" rid="B55">Witte et&#xa0;al. (2001)</xref>; <xref ref-type="bibr" rid="B6">Chaparro et&#xa0;al. (2015)</xref>; <xref ref-type="bibr" rid="B45">Tanskanen et&#xa0;al. (2007)</xref>. LTR-RT elements that do not fit into any of these categories and are not classified as autonomous or non-autonomous <italic>Copia</italic> or <italic>Gypsy</italic> elements are classified as &#x201c;unknown&#x201d;.</p>
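<p>The autonomous versus non-autonomous call for <italic>Copia</italic> and <italic>Gypsy</italic> elements can be sketched in Python as follows. This is a simplified illustration, not MegaLTR&#x2019;s classification script: the function name classify_element and its arguments are hypothetical, the domain list is assumed to be given in 5&#x2019; to 3&#x2019; order, and the additional criteria for <italic>LARD</italic>, <italic>TRIM</italic>, <italic>TR-GAG</italic>, and <italic>BARE-2</italic> from the studies cited above are deliberately left out.</p>
<preformat><![CDATA[
```python
# Internal domain order of autonomous elements, as given in the text
# (Wicker et al., 2007).
AUTONOMOUS_ORDER = {
    "Gypsy": ["GAG", "PROT", "RT", "RH", "INT"],
    "Copia": ["GAG", "PROT", "INT", "RT", "RH"],
}


def classify_element(superfamily, domains, has_pbs=True, has_ppt=True):
    """Label an intact LTR-RT as autonomous, non-autonomous, or unknown.

    Hypothetical helper: `domains` lists the detected protein domains in
    5'->3' order; `has_pbs` / `has_ppt` report whether a PBS and a PPT
    were found. Elements outside Copia/Gypsy are returned as "unknown"
    because the LARD/TRIM/TR-GAG/BARE-2 rules are not modelled here.
    """
    expected = AUTONOMOUS_ORDER.get(superfamily)
    if expected is None:
        return "unknown"
    complete = list(domains) == expected and has_pbs and has_ppt
    if complete:
        return f"autonomous {superfamily}"
    return f"non-autonomous {superfamily}"
```
]]></preformat>
<p>For example, a <italic>Copia</italic> candidate with the domains GAG, PROT, RT, and RH but no INT would be labelled non-autonomous <italic>Copia</italic> by this sketch.</p>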
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>The structures of autonomous (<italic>Gypsy</italic> and <italic>Copia</italic>) and non-autonomous LTR-RTs (<italic>LARD, TRIM, TR-GAG</italic>, and <italic>BARE-2</italic>).</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1237426-g002.tif"/>
</fig>
<p>Because LTR-RTs sometimes insert themselves into or near genes and can affect gene function, MegaLTR identifies LTR-RTs that are inside or near genes using Perl scripts. To classify LTR-RT elements based on their genomic location, the start and end coordinates of each gene are compared with the start and end coordinates of each LTR-RT element. If the LTR-RT element is located within the coordinates of the gene, it is considered a gene chimera. An LTR-RT element is considered near a gene if the gene lies within a given distance, in base pairs, upstream or downstream of the element. This distance is set by the user and may vary depending on the specific research question and desired sensitivity for detecting LTR-RT elements near genes. MegaLTR then summarizes two distributions as boxplots: one for LTR-RT length (bp) and the other for LTR-RT insertion age. Boxplots are useful for quickly conveying information about the variability and skewness of a data set. Finally, MegaLTR visualizes the distribution of the identified LTR-RTs and the gene density on each pseudomolecule/scaffold using RIdeogram (<xref ref-type="bibr" rid="B15">Hao et&#xa0;al., 2020</xref>).</p>
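<p>The coordinate comparison described above can be sketched in Python as follows. MegaLTR itself uses Perl scripts for this step; the Feature type, the function relate, and the 10,000 bp default windows are hypothetical and only illustrate the logic. Upstream and downstream are taken here in coordinate order on the sequence, ignoring gene strand, and partial overlaps are not assigned to any category.</p>
<preformat><![CDATA[
```python
from typing import NamedTuple, Optional


class Feature(NamedTuple):
    """A genomic interval with 1-based inclusive coordinates."""
    seqid: str
    start: int
    end: int
    name: str


def relate(ltr: Feature, gene: Feature,
           upstream: int = 10_000, downstream: int = 10_000) -> Optional[str]:
    """Relate one LTR-RT to one gene on the same sequence.

    Returns "chimera" if the element lies entirely within the gene,
    "near_upstream" / "near_downstream" if the gene lies within the
    user-chosen window before / after the element, otherwise None.
    The window defaults are illustrative assumptions.
    """
    if ltr.seqid != gene.seqid:
        return None
    if gene.start <= ltr.start and ltr.end <= gene.end:
        return "chimera"
    if gene.end < ltr.start and ltr.start - gene.end <= upstream:
        return "near_upstream"
    if gene.start > ltr.end and gene.start - ltr.end <= downstream:
        return "near_downstream"
    return None
```
]]></preformat>
<p>In practice the gene coordinates would be read from the GFF annotation and the element coordinates from the intact LTR-RT list, and each element would be compared against the genes on the same pseudomolecule or scaffold.</p>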
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Standalone version</title>
<p>A standalone version of MegaLTR is also available (<ext-link ext-link-type="uri" xlink:href="https://github.com/MoradMMokhtar/MegaLTR">https://github.com/MoradMMokhtar/MegaLTR</ext-link>). It has been tested on Ubuntu 18.04 and 20.04. It is installed <italic>via</italic> a Conda environment with the command: conda env create -f MegaLTR.yml. This command installs MegaLTR together with its dependencies. In the standalone version, the user can define all parameters using the following flags: -A (the analysis type), -F (FASTA file), -G (GFF file), -T (species name for tRNA database), -P (prefix for output files), -l (minimum length of 5&#x2019; &amp; 3&#x2019;LTR), -L (maximum length of 5&#x2019; &amp; 3&#x2019;LTR), -d (minimum distance between 5&#x2019; &amp; 3&#x2019;LTR), -D (maximum distance between 5&#x2019; &amp; 3&#x2019;LTR), -S (similarity threshold), -M (minimum length of exact match pair), -B (name of TE database that TEsorter will use &#x201c;gydb, rexdb, rexdb-plant, rexdb-metazoa&#x201d;), -C (minimum coverage for protein domains in HMMScan), -V (maximum E value for protein domains in HMMScan), -Q (classification rule [identity - coverage - length]), -E (hmm database), -R (mutation rate of neutral species), -U (distance upstream LTR-RTs to determine nearby genes), -X (distance downstream LTR-RTs), -W (gene density window size), -N (number of chromosomes), -t (number of CPUs to run MegaLTR).</p>
</sec>
</sec>
<sec id="s3" sec-type="results">
<label>3</label>
<title>Results and discussion</title>
<sec id="s3_1">
<label>3.1</label>
<title>Validation and comparison</title>
<p>To test the performance and validate the quality of the intact LTR-RTs identified by MegaLTR, the non-redundant library generated by MegaLTR was compared with a manually curated LTR-RT library from <italic>Oryza sativa</italic>. The curated <italic>Oryza sativa</italic> library included 897 LTR-RT elements and was previously established by <xref ref-type="bibr" rid="B35">Ou and Jiang (2017)</xref>. RepeatMasker v4.0.7 with the parameters &#x201c;-e ncbi -pa 56 -no_is -q -norna -div 40 -nolow -lib [LTR -library] -cutoff 225 genome.fa&#x201d; was applied to the MegaLTR library and the curated library to compute the performance metrics. We used six metrics proposed by <xref ref-type="bibr" rid="B35">Ou and Jiang (2017)</xref> to characterize the annotation performance of the non-redundant LTR-RT library generated by MegaLTR. These metrics include sensitivity (the ability to annotate target sequences correctly), specificity (the ability to exclude non-target sequences correctly), accuracy (true discrimination rate between target and non-target sequences), precision (true discovery rate), FDR (false discovery rate), and F1 measure (harmonic mean of precision and sensitivity). The true-positive (TP), false-positive (FP), false-negative (FN), and true-negative (TN) rates were computed using the EDTA toolkit. The performance metrics are defined as:</p>
<disp-formula>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>v</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>f</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>P</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mn>1</mml:mn>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2217;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2217;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mi>D</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
</disp-formula>
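<p>The six definitions above translate directly into code. The following Python sketch computes them from the four counts; the function name and the example counts are illustrative assumptions and are not taken from the study.</p>
<preformat><![CDATA[
```python
def annotation_metrics(tp, fp, fn, tn):
    """Compute the six metrics defined above from TP, FP, FN and TN."""
    precision = tp / (tp + fp)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "fdr": 1 - precision,
    }


# Illustrative counts only (not data from the study).
metrics = annotation_metrics(tp=90, fp=10, fn=10, tn=290)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```
]]></preformat>
<p>Because FDR is defined as one minus precision, the two values always sum to 100%, which gives a quick consistency check on any reported pair.</p>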
<p>MegaLTR results show consistently high specificity (96.59%), accuracy (94.98%), precision (89.38%), sensitivity (89.92%), and F1 measure (89.65%). The relatively low FDR (10.61%) is consistent with the reliability of the LTR-RTs identified by MegaLTR. For comparison, the EDTA pipeline was used to analyze the whole genome of <italic>Oryza sativa</italic> with the same parameters as MegaLTR (-D 15000 -d 1000 -L 7000 -l 100 -p 20 -M 0.85). The EDTA-generated LTR-RT library was compared with the curated <italic>Oryza sativa</italic> LTR library. As in the evaluation of MegaLTR, RepeatMasker and the script &#x201c;lib-test.pl&#x201d; were used to calculate the evaluation metrics. The EDTA metrics were: specificity (96.23%), accuracy (94.61%), precision (88.34%), sensitivity (89.52%), F1 measure (88.93%), and FDR (11.65%). As shown in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>, MegaLTR has slightly higher specificity, accuracy, precision, sensitivity, and F1 measure, and a lower FDR, than EDTA.</p>
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Comparison of six metrics between MegaLTR and EDTA using the genome of <italic>Oryza sativa</italic>.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="left">Pipeline name</th>
<th valign="top" align="left">Sensitivity</th>
<th valign="top" align="left">Specificity</th>
<th valign="top" align="left">Accuracy</th>
<th valign="top" align="left">Precision</th>
<th valign="top" align="left">FDR</th>
<th valign="top" align="left">F1</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">MegaLTR</td>
<td valign="top" align="left">89.92%</td>
<td valign="top" align="left">96.59%</td>
<td valign="top" align="left">94.98%</td>
<td valign="top" align="left">89.38%</td>
<td valign="top" align="left">10.61%</td>
<td valign="top" align="left">89.65%</td>
</tr>
<tr>
<td valign="top" align="left">EDTA</td>
<td valign="top" align="left">89.52%</td>
<td valign="top" align="left">96.23%</td>
<td valign="top" align="left">94.61%</td>
<td valign="top" align="left">88.34%</td>
<td valign="top" align="left">11.65%</td>
<td valign="top" align="left">88.93%</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Overall, the comparison of MegaLTR with both the manually curated <italic>Oryza sativa</italic> library and EDTA indicates that MegaLTR identifies intact LTR-RTs reliably and can support future studies on retrotransposons in plant genomes. <xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref> compares several features of MegaLTR and EDTA: the class of TEs identified, the level of classification (autonomous, non-autonomous, superfamily, lineage level), the identification of LTR-RTs near and within genes, and the form of availability.</p>
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>Comparison of some features between MegaLTR and EDTA.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" rowspan="2" align="left"/>
<th valign="top" colspan="2" align="left">Identified TEs</th>
<th valign="top" colspan="3" align="left">Classification level</th>
<th valign="top" colspan="2" align="left">Identify LTR-RT</th>
<th valign="top" colspan="3" align="left">Availability</th>
</tr>
<tr>
<th valign="top" align="left">DNA TEs</th>
<th valign="top" align="left">LTR-RTs</th>
<th valign="top" align="left">Autonomous and non-autonomous</th>
<th valign="top" align="left">Superfamily</th>
<th valign="top" align="left">Lineage level</th>
<th valign="top" align="left">Gene chimeras</th>
<th valign="top" align="left">Near genes</th>
<th valign="top" align="left">Web server</th>
<th valign="top" colspan="2" align="left">Standalone</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">MegaLTR</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" colspan="2" align="left">&#x2713;</td>
</tr>
<tr>
<td valign="top" align="left">EDTA</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">X</td>
<td valign="top" colspan="2" align="left">&#x2713;</td>
</tr>
<tr>
<th valign="top" colspan="11" align="center">LTR-RTs sub-classification level
</th>
</tr>
<tr>
<td valign="top" align="left"/>
<td valign="top" align="left">
<bold>
<italic>Copia</italic>
</bold>
</td>
<td valign="top" align="left">
<bold>
<italic>Gypsy</italic>
</bold>
</td>
<td valign="top" align="left">
<bold>Unknown</bold>
</td>
<td valign="top" align="left">
<bold>
<italic>LARD</italic>
</bold>
</td>
<td valign="top" align="left">
<bold>
<italic>TRIM</italic>
</bold>
</td>
<td valign="top" align="left">
<bold>
<italic>TR-GAG</italic>
</bold>
</td>
<td valign="top" align="left">
<bold>
<italic>BARE-2</italic>
</bold>
</td>
<td valign="top" align="left"/>
<td valign="top" colspan="2" align="left"/>
</tr>
<tr>
<td valign="top" align="left">MegaLTR</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left"/>
<td valign="top" colspan="2" align="left"/>
</tr>
<tr>
<td valign="top" align="left">EDTA</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">&#x2713;</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left">X</td>
<td valign="top" align="left"/>
<td valign="top" colspan="2" align="left"/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>"&#x2713;" indicates that the feature is present, and "X" indicates that the feature is absent.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>To validate MegaLTR, 26 whole-genome sequences totaling 15.33 Gbp and representing 58,392 scaffolds/pseudomolecules were used. Plant species were selected based on their LAI score, genome size, and number of scaffolds/pseudomolecules. As suggested by <xref ref-type="bibr" rid="B34">Ou et al. (2018)</xref>, draft genomes have an LAI score below 10, reference genomes have an LAI score between 10 and 20, and gold genomes have an LAI score greater than 20. The LAI scores of the selected genomes were retrieved from the PlantLAI database (<xref ref-type="bibr" rid="B28">Mokhtar et&#xa0;al., 2023a</xref>) and ranged from 8.7 (<italic>Citrus unshiu</italic>) to 29.45 (<italic>Zea mays</italic>), covering the different qualities of genome sequences (draft, reference, and gold quality). Genome sizes also varied, ranging from 119.6 Mbp for <italic>Arabidopsis thaliana</italic> to 2182.79 Mbp for <italic>Zea mays</italic>. In addition, the number of scaffolds/pseudomolecules varied from 7 for <italic>Arabidopsis thaliana</italic> to 20,876 for <italic>Citrus unshiu</italic> (<xref ref-type="supplementary-material" rid="SM1">
<bold>Table S1</bold>
</xref>).</p>
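<p>The LAI thresholds described above can be expressed as a small classification function. The sketch below is illustrative; the function name is an assumption, and a score of exactly 20 is treated as reference quality because gold quality is described as greater than 20.</p>
<preformat><![CDATA[
```python
def lai_quality(score):
    """Map an LAI score to the assembly quality class described in the text."""
    if score < 10:
        return "draft"
    if score <= 20:
        return "reference"
    return "gold"


# LAI scores quoted in the text.
for species, lai in [("Citrus unshiu", 8.7),
                     ("Arabidopsis thaliana", 16.91),
                     ("Zea mays", 29.45)]:
    print(f"{species}: LAI {lai} ({lai_quality(lai)})")
```
]]></preformat>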
<p>
<xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> compares MegaLTR and EDTA in terms of runtime and number of identified LTR-RTs in each classified superfamily, using the same parameters mentioned above. Since EDTA analyzes all TEs (LTR, TIR, and Helitron), we used the LTR option [--type ltr] to analyze only the LTR-RT candidates. For MegaLTR, the total number of identified autonomous (<italic>Gypsy</italic> and <italic>Copia</italic>) and nonautonomous LTR-RTs (<italic>Gypsy, Copia, BARE-2, TR-GAG</italic>, unknown) was reported for the genomes examined. The <italic>LARD</italic> and <italic>TRIM</italic> structures were not detected in these genomes. In contrast, EDTA classified LTR-RTs only into <italic>Gypsy, Copia</italic> and unknown elements. As can be seen in <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref>, MegaLTR reported fewer unknown elements than EDTA because it performs further analyses to annotate and classify the identified LTR-RTs.</p>
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Analysis runtime in hours and minutes (h:m) and the total number of identified LTR-RTs in each classified superfamily for the 26 plant species using MegaLTR and EDTA.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" rowspan="3" align="left">Species name</th>
<th valign="bottom" colspan="2" align="center">Run time</th>
<th valign="bottom" colspan="7" align="center">MegaLTR</th>
<th valign="middle" colspan="3" rowspan="2" align="center">EDTA</th>
</tr>
<tr>
<th valign="bottom" rowspan="2" align="left">MegaLTR</th>
<th valign="bottom" rowspan="2" align="left">EDTA</th>
<th valign="bottom" colspan="2" align="left">Autonomous</th>
<th valign="bottom" colspan="5" align="left">Nonautonomous</th>
</tr>
<tr>
<th valign="bottom" align="left">
<italic>Gypsy</italic>
</th>
<th valign="bottom" align="left">
<italic>Copia</italic>
</th>
<th valign="bottom" align="left">
<italic>Gypsy</italic>
</th>
<th valign="bottom" align="left">
<italic>Copia</italic>
</th>
<th valign="bottom" align="left">
<italic>BARE-2</italic>
</th>
<th valign="bottom" align="left">
<italic>TR-GAG</italic>
</th>
<th valign="bottom" align="left">unknown</th>
<th valign="bottom" align="left">
<italic>Gypsy</italic>
</th>
<th valign="bottom" align="left">
<italic>Copia</italic>
</th>
<th valign="bottom" align="left">unknown</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left">
<italic>Arabidopsis thaliana</italic>
</td>
<td valign="middle" align="left">0:09</td>
<td valign="middle" align="left">0:14</td>
<td valign="middle" align="left">2</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="left">118</td>
<td valign="middle" align="left">80</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="center">1</td>
<td valign="middle" align="left">2</td>
<td valign="middle" align="left">105</td>
<td valign="middle" align="left">75</td>
<td valign="middle" align="left">27</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Brassica rapa</italic>
</td>
<td valign="middle" align="left">0:48</td>
<td valign="middle" align="left">0:38</td>
<td valign="middle" align="left">189</td>
<td valign="middle" align="left">65</td>
<td valign="middle" align="left">1138</td>
<td valign="middle" align="left">1238</td>
<td valign="middle" align="left">34</td>
<td valign="middle" align="center">18</td>
<td valign="middle" align="left">228</td>
<td valign="middle" align="left">1196</td>
<td valign="middle" align="left">1074</td>
<td valign="middle" align="left">588</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Citrus clementina</italic>
</td>
<td valign="middle" align="left">0:15</td>
<td valign="middle" align="left">0:24</td>
<td valign="middle" align="left">60</td>
<td valign="middle" align="left">23</td>
<td valign="middle" align="left">771</td>
<td valign="middle" align="left">846</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="center">20</td>
<td valign="middle" align="left">14</td>
<td valign="middle" align="left">820</td>
<td valign="middle" align="left">765</td>
<td valign="middle" align="left">62</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Citrus unshiu</italic>
</td>
<td valign="middle" align="left">0:17</td>
<td valign="middle" align="left">0:20</td>
<td valign="middle" align="left">26</td>
<td valign="middle" align="left">4</td>
<td valign="middle" align="left">340</td>
<td valign="middle" align="left">178</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="center">2</td>
<td valign="middle" align="left">2</td>
<td valign="middle" align="left">343</td>
<td valign="middle" align="left">161</td>
<td valign="middle" align="left">32</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Cucumis sativus</italic>
</td>
<td valign="middle" align="left">0:09</td>
<td valign="middle" align="left">0:06</td>
<td valign="middle" align="left">31</td>
<td valign="middle" align="left">7</td>
<td valign="middle" align="left">135</td>
<td valign="middle" align="left">219</td>
<td valign="middle" align="left">2</td>
<td valign="middle" align="center">&#x2013;</td>
<td valign="middle" align="left">22</td>
<td valign="middle" align="left">159</td>
<td valign="middle" align="left">196</td>
<td valign="middle" align="left">73</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Glycine max</italic>
</td>
<td valign="middle" align="left">0:54</td>
<td valign="middle" align="left">1:13</td>
<td valign="middle" align="left">227</td>
<td valign="middle" align="left">69</td>
<td valign="middle" align="left">2046</td>
<td valign="middle" align="left">2335</td>
<td valign="middle" align="left">22</td>
<td valign="middle" align="center">322</td>
<td valign="middle" align="left">145</td>
<td valign="middle" align="left">2433</td>
<td valign="middle" align="left">1388</td>
<td valign="middle" align="left">694</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Medicago truncatula</italic>
</td>
<td valign="middle" align="left">0:26</td>
<td valign="middle" align="left">0:31</td>
<td valign="middle" align="left">25</td>
<td valign="middle" align="left">21</td>
<td valign="middle" align="left">743</td>
<td valign="middle" align="left">1385</td>
<td valign="middle" align="left">2</td>
<td valign="middle" align="center">&#x2013;</td>
<td valign="middle" align="left">130</td>
<td valign="middle" align="left">710</td>
<td valign="middle" align="left">1329</td>
<td valign="middle" align="left">335</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Mikania micrantha</italic>
</td>
<td valign="middle" align="left">4:51</td>
<td valign="middle" align="left">14:43</td>
<td valign="middle" align="left">470</td>
<td valign="middle" align="left">145</td>
<td valign="middle" align="left">6930</td>
<td valign="middle" align="left">15356</td>
<td valign="middle" align="left">9</td>
<td valign="middle" align="center">58</td>
<td valign="middle" align="left">705</td>
<td valign="middle" align="left">7009</td>
<td valign="middle" align="left">14100</td>
<td valign="middle" align="left">2671</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Oryza sativa Japonica</italic>
</td>
<td valign="middle" align="left">0:28</td>
<td valign="middle" align="left">0:25</td>
<td valign="middle" align="left">35</td>
<td valign="middle" align="left">9</td>
<td valign="middle" align="left">488</td>
<td valign="middle" align="left">1399</td>
<td valign="middle" align="left">4</td>
<td valign="middle" align="center">4</td>
<td valign="middle" align="left">168</td>
<td valign="middle" align="left">502</td>
<td valign="middle" align="left">1504</td>
<td valign="middle" align="left">226</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Panicum hallii</italic>
</td>
<td valign="middle" align="left">1:03</td>
<td valign="middle" align="left">1:23</td>
<td valign="middle" align="left">46</td>
<td valign="middle" align="left">46</td>
<td valign="middle" align="left">841</td>
<td valign="middle" align="left">3539</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="center">3</td>
<td valign="middle" align="left">293</td>
<td valign="middle" align="left">866</td>
<td valign="middle" align="left">3512</td>
<td valign="middle" align="left">641</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Phoenix dactylifera</italic>
</td>
<td valign="middle" align="left">1:21</td>
<td valign="middle" align="left">2:29</td>
<td valign="middle" align="left">498</td>
<td valign="middle" align="left">226</td>
<td valign="middle" align="left">6233</td>
<td valign="middle" align="left">3016</td>
<td valign="middle" align="left">2</td>
<td valign="middle" align="center">69</td>
<td valign="middle" align="left">148</td>
<td valign="middle" align="left">6625</td>
<td valign="middle" align="left">2176</td>
<td valign="middle" align="left">503</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Physcomitrella patens</italic>
</td>
<td valign="middle" align="left">0:21</td>
<td valign="middle" align="left">0:27</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="left">184</td>
<td valign="middle" align="left">3225</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="center">&#x2013;</td>
<td valign="middle" align="left">13</td>
<td valign="middle" align="left">140</td>
<td valign="middle" align="left">3069</td>
<td valign="middle" align="left">122</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Populus trichocarpa</italic>
</td>
<td valign="middle" align="left">0:18</td>
<td valign="middle" align="left">0:26</td>
<td valign="middle" align="left">59</td>
<td valign="middle" align="left">19</td>
<td valign="middle" align="left">501</td>
<td valign="middle" align="left">474</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="center">1</td>
<td valign="middle" align="left">59</td>
<td valign="middle" align="left">523</td>
<td valign="middle" align="left">448</td>
<td valign="middle" align="left">170</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Prunus persica</italic>
</td>
<td valign="middle" align="left">0:16</td>
<td valign="middle" align="left">0:30</td>
<td valign="middle" align="left">43</td>
<td valign="middle" align="left">21</td>
<td valign="middle" align="left">632</td>
<td valign="middle" align="left">326</td>
<td valign="middle" align="left">5</td>
<td valign="middle" align="center">1</td>
<td valign="middle" align="left">141</td>
<td valign="middle" align="left">637</td>
<td valign="middle" align="left">319</td>
<td valign="middle" align="left">548</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Rosa chinensis</italic>
</td>
<td valign="middle" align="left">0:44</td>
<td valign="middle" align="left">1:43</td>
<td valign="middle" align="left">113</td>
<td valign="middle" align="left">20</td>
<td valign="middle" align="left">3806</td>
<td valign="middle" align="left">1614</td>
<td valign="middle" align="left">38</td>
<td valign="middle" align="center">21</td>
<td valign="middle" align="left">745</td>
<td valign="middle" align="left">3498</td>
<td valign="middle" align="left">1426</td>
<td valign="middle" align="left">1884</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Salvia splendens</italic>
</td>
<td valign="middle" align="left">2:16</td>
<td valign="middle" align="left">3:18</td>
<td valign="middle" align="left">198</td>
<td valign="middle" align="left">296</td>
<td valign="middle" align="left">4183</td>
<td valign="middle" align="left">5687</td>
<td valign="middle" align="left">23</td>
<td valign="middle" align="center">70</td>
<td valign="middle" align="left">459</td>
<td valign="middle" align="left">3898</td>
<td valign="middle" align="left">5462</td>
<td valign="middle" align="left">2009</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Selaginella moellendorffii</italic>
</td>
<td valign="middle" align="left">0:11</td>
<td valign="middle" align="left">0:10</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="left">102</td>
<td valign="middle" align="left">26</td>
<td valign="middle" align="left">557</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="center">7</td>
<td valign="middle" align="left">337</td>
<td valign="middle" align="left">34</td>
<td valign="middle" align="left">627</td>
<td valign="middle" align="left">355</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Sesamum indicum</italic>
</td>
<td valign="middle" align="left">0:21</td>
<td valign="middle" align="left">0:28</td>
<td valign="middle" align="left">5</td>
<td valign="middle" align="left">20</td>
<td valign="middle" align="left">258</td>
<td valign="middle" align="left">176</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="center">6</td>
<td valign="middle" align="left">6</td>
<td valign="middle" align="left">240</td>
<td valign="middle" align="left">185</td>
<td valign="middle" align="left">38</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Setaria viridis</italic>
</td>
<td valign="middle" align="left">0:25</td>
<td valign="middle" align="left">0:36</td>
<td valign="middle" align="left">3</td>
<td valign="middle" align="left">39</td>
<td valign="middle" align="left">829</td>
<td valign="middle" align="left">1071</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="center">5</td>
<td valign="middle" align="left">10</td>
<td valign="middle" align="left">802</td>
<td valign="middle" align="left">1091</td>
<td valign="middle" align="left">105</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Solanum lycopersicum</italic>
</td>
<td valign="middle" align="left">0:32</td>
<td valign="middle" align="left">0:35</td>
<td valign="middle" align="left">93</td>
<td valign="middle" align="left">15</td>
<td valign="middle" align="left">945</td>
<td valign="middle" align="left">774</td>
<td valign="middle" align="left">&#x2013;</td>
<td valign="middle" align="center">5</td>
<td valign="middle" align="left">30</td>
<td valign="middle" align="left">899</td>
<td valign="middle" align="left">719</td>
<td valign="middle" align="left">196</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Solanum pennellii</italic>
</td>
<td valign="middle" align="left">0:37</td>
<td valign="middle" align="left">0:42</td>
<td valign="middle" align="left">143</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="left">1714</td>
<td valign="middle" align="left">443</td>
<td valign="middle" align="left">1</td>
<td valign="middle" align="center">6</td>
<td valign="middle" align="left">58</td>
<td valign="middle" align="left">1622</td>
<td valign="middle" align="left">361</td>
<td valign="middle" align="left">310</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Sorghum bicolor</italic>
</td>
<td valign="middle" align="left">1:28</td>
<td valign="middle" align="left">2:35</td>
<td valign="middle" align="left">58</td>
<td valign="middle" align="left">109</td>
<td valign="middle" align="left">1096</td>
<td valign="middle" align="left">7779</td>
<td valign="middle" align="left">3</td>
<td valign="middle" align="center">7</td>
<td valign="middle" align="left">893</td>
<td valign="middle" align="left">1167</td>
<td valign="middle" align="left">6927</td>
<td valign="middle" align="left">2120</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Trifolium pratense</italic>
</td>
<td valign="middle" align="left">0:38</td>
<td valign="middle" align="left">1:00</td>
<td valign="middle" align="left">83</td>
<td valign="middle" align="left">95</td>
<td valign="middle" align="left">2692</td>
<td valign="middle" align="left">1566</td>
<td valign="middle" align="left">5</td>
<td valign="middle" align="center">34</td>
<td valign="middle" align="left">943</td>
<td valign="middle" align="left">2470</td>
<td valign="middle" align="left">1469</td>
<td valign="middle" align="left">2074</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Vitis vinifera</italic>
</td>
<td valign="middle" align="left">0:22</td>
<td valign="middle" align="left">0:35</td>
<td valign="middle" align="left">247</td>
<td valign="middle" align="left">10</td>
<td valign="middle" align="left">1090</td>
<td valign="middle" align="left">613</td>
<td valign="middle" align="left">25</td>
<td valign="middle" align="center">1</td>
<td valign="middle" align="left">64</td>
<td valign="middle" align="left">1275</td>
<td valign="middle" align="left">583</td>
<td valign="middle" align="left">247</td>
</tr>
<tr>
<td valign="middle" align="left">
<italic>Zea mays</italic>
</td>
<td valign="middle" align="left">10:16</td>
<td valign="middle" align="left">27:39</td>
<td valign="middle" align="left">3687</td>
<td valign="middle" align="left">191</td>
<td valign="middle" align="left">16301</td>
<td valign="middle" align="left">28080</td>
<td valign="middle" align="left">70</td>
<td valign="middle" align="center">496</td>
<td valign="middle" align="left">3163</td>
<td valign="middle" align="left">19836</td>
<td valign="middle" align="left">26376</td>
<td valign="middle" align="left">6221</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The EDTA runtime given in <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> covers only LTR-RT identification and classification. The MegaLTR runtime, in contrast, covers all analyses, including identification, annotation, and classification of LTR-RTs, identification of LTR-RT gene chimeras, detection of LTR-RTs near genes, statistical analysis, and visualization of LTR-RT density. The runtimes of each step reported by EDTA and MegaLTR can be found in <xref ref-type="supplementary-material" rid="SM1">
<bold>Data Sheet 3</bold>
</xref> and <xref ref-type="supplementary-material" rid="SM1">
<bold>Data Sheet 4</bold>
</xref>, respectively. For MegaLTR, analysis times range from 9 minutes for <italic>Arabidopsis lyrata</italic> (206.8 Mbp), <italic>Arabidopsis thaliana</italic> (119.6 Mbp), and <italic>Cucumis sativus</italic> (226.6 Mbp) to 10 hours and 16 minutes for <italic>Zea mays</italic> (2182.7 Mbp). For EDTA, analysis times range from 6 minutes for <italic>Cucumis sativus</italic> to 27 hours and 39 minutes for <italic>Zea mays</italic>. For large genomes such as <italic>Zea mays</italic> (2182.7 Mbp) and <italic>Mikania micrantha</italic> (1790.6 Mbp), MegaLTR is more than twice as fast as EDTA. <xref ref-type="supplementary-material" rid="SM1">
<bold>Figure S1</bold>
</xref> shows the same trend for large genomes. The total numbers of LTR-RTs identified by MegaLTR and EDTA differ only slightly (<xref ref-type="supplementary-material" rid="SM1">
<bold>Figure S1</bold>
</xref>).</p>
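<p>The runtime ratios quoted above can be recomputed from the h:m values in Table 3. The following Python sketch converts the runtimes to minutes and reports the EDTA/MegaLTR ratio for three species; the function names are illustrative assumptions.</p>
<preformat><![CDATA[
```python
def hm_to_minutes(text):
    """Convert an 'h:m' runtime string such as '10:16' to minutes."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


def speedup(edta_hm, megaltr_hm):
    """Ratio of EDTA runtime to MegaLTR runtime (above 1: MegaLTR is faster)."""
    return hm_to_minutes(edta_hm) / hm_to_minutes(megaltr_hm)


# species: (MegaLTR runtime, EDTA runtime), values copied from Table 3.
runtimes = {
    "Zea mays": ("10:16", "27:39"),
    "Mikania micrantha": ("4:51", "14:43"),
    "Cucumis sativus": ("0:09", "0:06"),
}
for species, (megaltr, edta) in runtimes.items():
    print(f"{species}: {speedup(edta, megaltr):.2f}x")
```
]]></preformat>
<p>With these inputs the ratios are about 2.69 for <italic>Zea mays</italic> and 3.03 for <italic>Mikania micrantha</italic>, whereas for the small <italic>Cucumis sativus</italic> genome the ratio falls below 1, that is, EDTA finished sooner.</p>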
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Case study: <italic>Arabidopsis thaliana</italic> genome</title>
<p>
<italic>Arabidopsis thaliana</italic> was selected as a case study for MegaLTR results and serves as a comparison between the output of MegaLTR and EDTA in terms of classification. <italic>Arabidopsis thaliana</italic> is a model organism with a well-structured genome arranged in chromosomes and a high LAI score of 16.91. EDTA identified a total of 207 intact LTR-RTs elements, including 105 <italic>Gypsy</italic>, 75 <italic>Copia</italic>, and 27 unknown. For MegaLTR, a total of 203 intact LTR-RT elements were identified, classified, and annotated. Of the 203 intact LTR-RTs elements, 2 elements were classified as autonomous <italic>Gypsy</italic> and 201 as non-autonomous LTR-RTs. Non-autonomous elements included 118 <italic>Gypsy</italic>, 80 <italic>Copia</italic>, 1 <italic>TR-GAG</italic>, and 2 unknown (<xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref>). Based on the position of the identified elements in the genome sequence, the LTR-RT results of EDTA and MegaLTR were compared. Of the 207 LTR-RTs identified by EDTA, 193 elements matched MegaLTR and 14 did not match. These 14 LTR-RTs included one element that did not pass the LTR_retriever filter and 11 elements that did not pass the TEsorter filter in MegaLTR. EDTA assigned these 11 elements to the NA class, consistent with their exclusion by MegaLTR. The remaining 2 elements were not found in the MegaLTR data. On the other hand, MegaLTR identified 10 LTR-RTs not found by EDTA (<xref ref-type="supplementary-material" rid="SM1">
<bold>Figure S2A</bold>
</xref>), including 7 <italic>Gypsy</italic> and 3 <italic>Copia</italic>. These elements are assigned to 7 clades, including 3 <italic>Athila</italic>, 2 <italic>Retand</italic>, 1 <italic>Ale</italic>, 1 <italic>Ivana</italic>, 1 <italic>Reina</italic>, 1 <italic>SIRE</italic>, and 1 <italic>Tekay</italic>. As for the internal domains, 6 elements contain all the domains necessary for transposition (<italic>GAG</italic>, PROT, INT, RT, RH) for Copia and (<italic>GAG</italic>, PROT, RT, RH, INT) for <italic>Gypsy</italic>. The remaining 4 elements include one element containing the domains <italic>GAG</italic> and PROT, 2 elements containing the domain PROT, and one containing the domain <italic>GAG</italic>. The annotation of MegaLTR&#x2019;s unique results suggests that MegaLTR is able to identify more intact LTR-RTs with a high degree of filtering and annotation. In contrast, EDTA reported a number of elements that do not belong to LTR (<xref ref-type="supplementary-material" rid="SM1">
<bold>Data Sheet 5</bold>
</xref>).</p>
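<p>The position-based comparison described above can be illustrated with a minimal Python sketch. It assumes that each element is represented as a (sequence, start, end) tuple and that two elements match when they lie on the same sequence and share at least 90% of the longer element's length. Both the data layout and the threshold are hypothetical, since the exact matching criterion is not stated here, and the coordinates are toy values rather than the <italic>Arabidopsis thaliana</italic> results.</p>
<preformat>
```python
def overlap_fraction(a, b):
    """Shared bases divided by the longer element's length (1-based, inclusive)."""
    if a[0] != b[0]:
        return 0.0
    shared = max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)
    longest = max(a[2] - a[1] + 1, b[2] - b[1] + 1)
    return shared / longest


def compare_by_position(set_a, set_b, min_fraction=0.9):
    """Split two element lists into matched pairs and tool-specific leftovers."""
    matched, used_b = [], set()
    for a in set_a:
        best = None
        for i, b in enumerate(set_b):
            if i in used_b:
                continue
            frac = overlap_fraction(a, b)
            # Keep the best-overlapping, not yet used, element of set_b.
            if frac >= min_fraction and (best is None or frac > best[1]):
                best = (i, frac)
        if best is not None:
            used_b.add(best[0])
            matched.append((a, set_b[best[0]]))
    matched_a = {pair[0] for pair in matched}
    only_a = [a for a in set_a if a not in matched_a]
    only_b = [b for i, b in enumerate(set_b) if i not in used_b]
    return matched, only_a, only_b


# Toy coordinates (hypothetical), one list per tool.
edta = [("Chr1", 100, 5100), ("Chr1", 9000, 15000), ("Chr2", 500, 6500)]
megaltr = [("Chr1", 105, 5100), ("Chr2", 500, 6500), ("Chr3", 2000, 8000)]

matched, edta_only, megaltr_only = compare_by_position(edta, megaltr)
print(len(matched), len(edta_only), len(megaltr_only))
```
</preformat>
<p>The same partition underlies the counts reported above: 193 shared elements plus 14 EDTA-only elements give the 207 EDTA elements, and 193 shared elements plus 10 MegaLTR-only elements give the 203 MegaLTR elements.</p>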
<p>Based on their structure, MegaLTR classified the intact LTR-RT elements as autonomous (<italic>Gypsy</italic>) or non-autonomous (<italic>Copia, Gypsy</italic>, and <italic>TR-GAG</italic>). In addition, MegaLTR is able to classify unknown elements into superfamilies. EDTA reported 27 unknown elements, while MegaLTR reported only 2. As shown in <xref ref-type="supplementary-material" rid="SM1">
<bold>Figure S2B</bold>
</xref>, MegaLTR and EDTA have 2 unknown elements in common, while EDTA has 25 unique unknown elements. These 25 elements include 12 that did not pass the MegaLTR filtering steps and 13 that were annotated and classified as non-autonomous (<italic>Copia</italic> and <italic>Gypsy</italic>). <xref ref-type="supplementary-material" rid="SM1">
<bold>Data Sheet 5</bold>
</xref> lists the common LTR-RTs, the unique LTR-RTs in EDTA, the unique LTR-RTs in MegaLTR, the unknown elements in EDTA, and the unknown elements in MegaLTR.</p>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Runtime vs. number of CPUs</title>
<p>MegaLTR uses multithreading to reduce execution time. By splitting the genome into its individual scaffolds/chromosomes, without breaking up any single sequence, MegaLTR can analyze multiple sequences simultaneously on multiple CPU cores (threads). In the standalone version, the user can specify the number of threads to use, while the MegaLTR web server currently uses 56 CPU cores for parallel processing. We tested the effect of the number of threads on runtime using the <italic>Brassica rapa</italic> genome, which was selected for its medium genome size (352.9 Mbp) and its number of pseudomolecules/scaffolds (1100). <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3</bold>
</xref> and <xref ref-type="supplementary-material" rid="SM1">
<bold>Data Sheet 2</bold>
</xref> show the runtime for different numbers of threads, from 1 to 30. Analyzing the <italic>Brassica rapa</italic> genome required 1382 minutes with a single thread, 707 minutes with 2 threads, and 102 minutes with 30 threads (a roughly 13.5-fold reduction relative to a single thread), demonstrating the gain achieved through parallel processing in MegaLTR.</p>
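<p>The per-sequence parallelism described above can be sketched in Python as follows. This is an illustration under stated assumptions, not MegaLTR's actual code: the function names are hypothetical, the per-sequence work is replaced by a placeholder that only measures sequence length, and a thread pool from the standard library stands in for whatever scheduling the pipeline really uses.</p>
<preformat>
```python
from concurrent.futures import ThreadPoolExecutor


def read_fasta_records(text):
    """Split multi-FASTA text into (name, sequence) pairs, one per scaffold/chromosome."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def analyze_record(record):
    """Hypothetical stand-in for the per-sequence work; it only measures length."""
    name, sequence = record
    return name, len(sequence)


def analyze_genome(text, threads=4):
    """Process whole sequences concurrently; no individual sequence is split."""
    records = read_fasta_records(text)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order as the input records.
        return list(pool.map(analyze_record, records))


# Toy two-sequence genome (hypothetical data).
toy_genome = ">A01 toy chromosome\nACGTACGT\nACGT\n>Scaffold_1\nTTGCA\n"
print(analyze_genome(toy_genome, threads=2))
```
</preformat>
<p>Because each worker receives a complete sequence, the achievable gain depends on how evenly the sequences are sized. For CPU-bound work in Python, a process pool or calls to external tools would be a more realistic choice than threads.</p>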
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>MegaLTR runtime for the <italic>Brassica rapa</italic> genome using different numbers of threads, ranging from 1 to 30.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1237426-g003.tif"/>
</fig>
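<p>The runtimes reported above for <italic>Brassica rapa</italic> can be converted into speedup and parallel efficiency with a short Python calculation. Only the three runtimes stated in the text are used; the function names are illustrative, and efficiency is defined here in the usual way as speedup divided by the number of threads.</p>
<preformat>
```python
# Threads -> runtime in minutes, as reported in the text for Brassica rapa.
runtimes_min = {1: 1382, 2: 707, 30: 102}


def speedup(runtimes, threads):
    """Single-thread runtime divided by the runtime with the given thread count."""
    return runtimes[1] / runtimes[threads]


def efficiency(runtimes, threads):
    """Speedup per thread; 1.0 would mean perfectly linear scaling."""
    return speedup(runtimes, threads) / threads


for n in sorted(runtimes_min):
    s = speedup(runtimes_min, n)
    e = efficiency(runtimes_min, n)
    print(f"{n:>2} threads: speedup {s:.2f}x, efficiency {e:.0%}")
```
</preformat>
<p>With these values, 2 threads give a speedup of about 1.95 (about 98% efficiency) and 30 threads give a speedup of about 13.55 (about 45% efficiency), so scaling is close to linear at low thread counts and flattens as more threads are added.</p>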
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Generated output</title>
<p>The web server and standalone version of MegaLTR automatically generate a series of tables, FASTA files, and images, some of which are listed in <xref ref-type="supplementary-material" rid="SM1">
<bold>Table S2</bold>
</xref>. These files include tables with the position of each identified LTR-RT within the sequence, the start and end of all identified features, the classification at the autonomous/non-autonomous, superfamily, and lineage levels, the estimated insertion age, LTR-RT-gene chimeras, and LTR-RTs near genes. MegaLTR also generates redundant and non-redundant LTR-RT libraries in FASTA format. The full list of generated results can be found in the MegaLTR online documentation (<ext-link ext-link-type="uri" xlink:href="https://github.com/MoradMMokhtar/MegaLTR">https://github.com/MoradMMokhtar/MegaLTR</ext-link>). Using the <italic>Arabidopsis thaliana</italic> genome, <xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref> shows examples of the statistical analysis of LTR-RT length and insertion age, as well as a visualization of the density of genes and LTR-RTs on the chromosomes.</p>
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>Example of MegaLTR-generated results for <italic>Arabidopsis thaliana</italic>. <bold>(A)</bold> LTR-RT length distribution. <bold>(B)</bold> Boxplot of insertion age. <bold>(C)</bold> Visualization of the density of genes and LTR-RTs on chromosomes.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-14-1237426-g004.tif"/>
</fig>
</sec>
</sec>
<sec id="s4" sec-type="conclusions">
<label>4</label>
<title>Conclusion and future directions</title>
<p>With the increasing availability of plant genome projects, researchers need accurate, robust, and easy-to-use pipelines that can process large amounts of data to study the effects of LTR-RTs on plant genome evolution and function. Offered as a web server, such pipelines would support efforts to include LTR-RTs as a possible element in studies of the gene regulatory system. A comprehensive pipeline is also needed to create non-redundant LTR-RT libraries for species that lack this resource for whole-genome LTR-RT annotation. MegaLTR is a web server and standalone pipeline that detects intact LTR-RTs at the whole-genome level and integrates multiple tools for homology-, structure-, and <italic>de novo</italic>-based identification, classification, and annotation of intact LTR-RTs. MegaLTR classifies intact LTR-RT elements as putative autonomous (<italic>Copia</italic> and <italic>Gypsy</italic>) or non-autonomous (<italic>Copia, Gypsy, LARD, TRIM, TR-GAG</italic> and <italic>BARE-2</italic>), and at the superfamily and lineage levels. It also identifies LTR-RT-gene chimeras, detects LTR-RTs near genes, and provides statistical analysis and visualization of LTR-RTs. For the detection of LTR-RTs, MegaLTR shows high specificity, accuracy, precision, and sensitivity, together with a low FDR. Online servers such as MegaLTR, which provide computational resources for analyzing large amounts of genomic data, are becoming increasingly important for the automated analysis of LTR-RT elements. The current version of MegaLTR focuses on genome-level analysis of LTR-RTs, and work is underway to integrate tools optimized for studying LTR-RTs at the transcriptomic level.
The MegaLTR web server is freely accessible at <ext-link ext-link-type="uri" xlink:href="https://bioinformatics.um6p.ma/MegaLTR">https://bioinformatics.um6p.ma/MegaLTR</ext-link> and the standalone version at <ext-link ext-link-type="uri" xlink:href="https://github.com/MoradMMokhtar/MegaLTR">https://github.com/MoradMMokhtar/MegaLTR</ext-link>.</p>
</sec>
<sec id="s5" sec-type="data-availability">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SM1">
<bold>Supplementary Material.</bold>
</xref> Further inquiries can be directed to the corresponding authors. The MegaLTR web server is freely available at <uri xlink:href="https://bioinformatics.um6p.ma/MegaLTR">https://bioinformatics.um6p.ma/MegaLTR</uri> and the standalone version at <uri xlink:href="https://github.com/MoradMMokhtar/MegaLTR">https://github.com/MoradMMokhtar/MegaLTR</uri>.</p>
</sec>
<sec id="s6" sec-type="author-contributions">
<title>Author contributions</title>
<p>Conceptualization: MM and AA; Methodology: MM and AA; Scripting: MM and AA; Data curation: MM; Writing&#x2013;original draft: MM and AA. All authors reviewed the manuscript, contributed to the article, and approved the submitted version.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>The authors acknowledge the African Supercomputing Center at Mohammed VI Polytechnic University for supercomputing resources (<ext-link ext-link-type="uri" xlink:href="https://ascc.um6p.ma/">https://ascc.um6p.ma/</ext-link>) made available for conducting the research reported in this paper.</p>
</ack>
<sec id="s7" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s8" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s9" sec-type="supplementary-material">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpls.2023.1237426/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpls.2023.1237426/full#supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
<supplementary-material xlink:href="DataSheet_2.docx" id="SM2" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
<supplementary-material xlink:href="DataSheet_3.docx" id="SM3" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
<supplementary-material xlink:href="DataSheet_4.docx" id="SM4" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
<supplementary-material xlink:href="DataSheet_5.xlsx" id="SM5" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Image_1.png" id="SF1" mimetype="image/png"/>
<supplementary-material xlink:href="Image_2.jpeg" id="SF2" mimetype="image/jpeg"/>
<supplementary-material xlink:href="Table_1.xlsx" id="ST1" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
<supplementary-material xlink:href="Table_2.xlsx" id="ST2" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Amselem</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Cornut</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Choisne</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Alaux</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Alfama-Depauw</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Jamilloux</surname> <given-names>V.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>RepetDB: a unified resource for transposable element references</article-title>. <source>Mobile DNA</source> <volume>10</volume>, <fpage>1</fpage>&#x2013;<lpage>8</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s13100-019-0150-y</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bennetzen</surname> <given-names>J. L.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>The contributions of transposable elements to the structure, function, and evolution of plant genomes</article-title>. <source>Annu. Rev. Plant Biol.</source> <volume>65</volume>, <fpage>505</fpage>&#x2013;<lpage>530</lpage>. doi: <pub-id pub-id-type="doi">10.1146/annurev-arplant-050213-035811</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Benson</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Tandem repeats finder: a program to analyze DNA sequences</article-title>. <source>Nucleic Acids Res.</source> <volume>27</volume>, <fpage>573</fpage>&#x2013;<lpage>580</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/27.2.573</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Biryukov</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Ustyantsev</surname> <given-names>K.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>DARTS: an algorithm for domain-associated retrotransposon search in genome assemblies</article-title>. <source>Genes</source> <volume>13</volume>, <fpage>9</fpage>. doi: <pub-id pub-id-type="doi">10.3390/genes13010009</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Camacho</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Coulouris</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Avagyan</surname> <given-names>V.</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Papadopoulos</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Bealer</surname> <given-names>K.</given-names>
</name>
<etal/>
</person-group>. (<year>2009</year>). <article-title>BLAST+: architecture and applications</article-title>. <source>BMC Bioinf.</source> <volume>10</volume>, <fpage>1</fpage>&#x2013;<lpage>9</lpage>. doi: <pub-id pub-id-type="doi">10.1186/1471-2105-10-421</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chaparro</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Gayraud</surname> <given-names>T.</given-names>
</name>
<name>
<surname>de Souza</surname> <given-names>R. F.</given-names>
</name>
<name>
<surname>Domingues</surname> <given-names>D. S.</given-names>
</name>
<name>
<surname>Akaffou</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Laforga Vanzela</surname> <given-names>A. L.</given-names>
</name>
<etal/>
</person-group>. (<year>2015</year>). <article-title>Terminal-repeat retrotransposons with gag domain in plant genomes: a new testimony on the complex world of transposable elements</article-title>. <source>Genome Biol. Evol.</source> <volume>7</volume>, <fpage>493</fpage>&#x2013;<lpage>504</lpage>. doi: <pub-id pub-id-type="doi">10.1093/gbe/evv001</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eddy</surname> <given-names>S. R.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Profile hidden Markov models</article-title>. <source>Bioinf. (Oxford England)</source> <volume>14</volume>, <fpage>755</fpage>&#x2013;<lpage>763</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/14.9.755</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname> <given-names>R. C.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Search and clustering orders of magnitude faster than BLAST</article-title>. <source>Bioinformatics</source> <volume>26</volume>, <fpage>2460</fpage>&#x2013;<lpage>2461</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bioinformatics/btq461</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eickbush</surname> <given-names>T. H.</given-names>
</name>
<name>
<surname>Jamburuthugoda</surname> <given-names>V. K.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>The diversity of retrotransposons and the properties of their reverse transcriptases</article-title>. <source>Virus Res.</source> <volume>134</volume>, <fpage>221</fpage>&#x2013;<lpage>234</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.virusres.2007.12.010</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ellinghaus</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Kurtz</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Willhoeft</surname> <given-names>U.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>LTRharvest, an efficient and flexible software for <italic>de novo</italic> detection of LTR retrotransposons</article-title>. <source>BMC Bioinf.</source> <volume>9</volume>, <fpage>1</fpage>&#x2013;<lpage>14</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/1471-2105-9-18</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feschotte</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Keswani</surname> <given-names>U.</given-names>
</name>
<name>
<surname>Ranganathan</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Guibotsy</surname> <given-names>M. L.</given-names>
</name>
<name>
<surname>Levine</surname> <given-names>D.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Exploring repetitive DNA landscapes using REPCLASS, a tool that automates the classification of transposable elements in eukaryotic genomes</article-title>. <source>Genome Biol. Evol.</source> <volume>1</volume>, <fpage>205</fpage>&#x2013;<lpage>220</lpage>. doi: <pub-id pub-id-type="doi">10.1093/gbe/evp023</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Franco-Zorrilla</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Valli</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Todesco</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Mateos</surname> <given-names>I.</given-names>
</name>
<name>
<surname>Puga</surname> <given-names>M. I.</given-names>
</name>
<name>
<surname>Rubio-Somoza</surname> <given-names>I.</given-names>
</name>
<etal/>
</person-group>. (<year>2007</year>). <article-title>Target mimicry provides a new mechanism for regulation of microRNA activity</article-title>. <source>Nat. Genet.</source> <volume>39</volume>, <page-range>1033&#x2013;1037</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/ng2079</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gao</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Havecker</surname> <given-names>E. R.</given-names>
</name>
<name>
<surname>Baranov</surname> <given-names>P. V.</given-names>
</name>
<name>
<surname>Atkins</surname> <given-names>J. F.</given-names>
</name>
<name>
<surname>Voytas</surname> <given-names>D. F.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Translational recoding signals between gag and pol in diverse LTR retrotransposons</article-title>. <source>RNA</source> <volume>9</volume>, <fpage>1422</fpage>&#x2013;<lpage>1430</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1261/rna.5105503</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grandbastien</surname> <given-names>M.-A.</given-names>
</name>
<name>
<surname>Audeon</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Bonnivard</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Casacuberta</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Chalhoub</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Costa</surname> <given-names>A.-P.</given-names>
</name>
<etal/>
</person-group>. (<year>2005</year>). <article-title>Stress activation and genomic impact of Tnt1 retrotransposons in Solanaceae</article-title>. <source>Cytogenetic Genome Res.</source> <volume>110</volume>, <fpage>229</fpage>&#x2013;<lpage>241</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1159/000084957</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hao</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Lv</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Ge</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Shi</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Weijers</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Yu</surname> <given-names>G.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>RIdeogram: drawing SVG graphics to visualize and map genome-wide data on the idiograms</article-title>. <source>PeerJ Comput. Sci.</source> <volume>6</volume>, <fpage>e251</fpage>. doi: <pub-id pub-id-type="doi">10.7717/peerj-cs.251</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ito</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Kim</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Matsunaga</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Saze</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Matsui</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Endo</surname> <given-names>T. A.</given-names>
</name>
<etal/>
</person-group>. (<year>2016</year>). <article-title>A stress-activated transposon in <italic>Arabidopsis</italic> induces transgenerational abscisic acid insensitivity</article-title>. <source>Sci. Rep.</source> <volume>6</volume>, <fpage>23181</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/srep23181</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jung</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Venkatesh</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Kang</surname> <given-names>M.-Y.</given-names>
</name>
<name>
<surname>Kwon</surname> <given-names>J.-K.</given-names>
</name>
<name>
<surname>Kang</surname> <given-names>B.-C.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A non-LTR retrotransposon activates anthocyanin biosynthesis by regulating a MYB transcription factor in <italic>Capsicum annuum</italic></article-title>. <source>Plant Sci.</source> <volume>287</volume>, <elocation-id>110181</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.plantsci.2019.110181</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jurka</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Kapitonov</surname> <given-names>V. V.</given-names>
</name>
<name>
<surname>Pavlicek</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Klonowski</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Kohany</surname> <given-names>O.</given-names>
</name>
<name>
<surname>Walichiewicz</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Repbase Update, a database of eukaryotic repetitive elements</article-title>. <source>Cytogenetic Genome Res.</source> <volume>110</volume>, <fpage>462</fpage>&#x2013;<lpage>467</lpage>. doi: <pub-id pub-id-type="doi">10.1159/000084979</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kalendar</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Vicient</surname> <given-names>C. M.</given-names>
</name>
<name>
<surname>Peleg</surname> <given-names>O.</given-names>
</name>
<name>
<surname>Anamthawat-Jonsson</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Bolshoy</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Schulman</surname> <given-names>A. H.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Large retrotransposon derivatives: abundant, conserved but nonautonomous retroelements of barley and related genomes</article-title>. <source>Genetics</source> <volume>166</volume>, <fpage>1437</fpage>&#x2013;<lpage>1450</lpage>. doi: <pub-id pub-id-type="doi">10.1534/genetics.166.3.1437</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kimura</surname> <given-names>M.</given-names>
</name>
</person-group> (<year>1980</year>). <article-title>A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences</article-title>. <source>J. Mol. Evol.</source> <volume>16</volume>, <fpage>111</fpage>&#x2013;<lpage>120</lpage>. doi: <pub-id pub-id-type="doi">10.1007/BF01731581</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>The evolution of plant retroviruses: moving to green pastures</article-title>. <source>Trends Plant Sci.</source> <volume>3</volume>, <fpage>371</fpage>&#x2013;<lpage>374</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S1360-1385(98)01304-1</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Mohammed Ismail</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Rho</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Fox</surname> <given-names>G. C.</given-names>
</name>
<name>
<surname>Oh</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2016</year>). <article-title>MGEScan: a Galaxy-based system for identifying retrotransposons in genomes</article-title>. <source>Bioinformatics</source> <volume>32</volume>, <fpage>2502</fpage>&#x2013;<lpage>2504</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btw157</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lerat</surname> <given-names>E.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs</article-title>. <source>Heredity</source> <volume>104</volume>, <fpage>520</fpage>&#x2013;<lpage>533</lpage>. doi: <pub-id pub-id-type="doi">10.1038/hdy.2009.165</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Godzik</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences</article-title>. <source>Bioinformatics</source> <volume>22</volume>, <fpage>1658</fpage>&#x2013;<lpage>1659</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btl158</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lopes</surname> <given-names>F. R.</given-names>
</name>
<name>
<surname>Jjingo</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Silva</surname> <given-names>C. R. D.</given-names>
</name>
<name>
<surname>Andrade</surname> <given-names>A. C.</given-names>
</name>
<name>
<surname>Marraccini</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Teixeira</surname> <given-names>J. B.</given-names>
</name>
<etal/>
</person-group>. (<year>2013</year>). <article-title>Transcriptional activity, chromosomal distribution and expression effects of transposable elements in <italic>Coffea</italic> genomes</article-title>. <source>PloS One</source> <volume>8</volume>, <elocation-id>e78931</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pone.0078931</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Devos</surname> <given-names>K. M.</given-names>
</name>
<name>
<surname>Bennetzen</surname> <given-names>J. L.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Analyses of LTR-retrotransposon structures reveal recent and rapid genomic DNA loss in rice</article-title>. <source>Genome Res.</source> <volume>14</volume>, <fpage>860</fpage>&#x2013;<lpage>869</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1101/gr.1466204</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McCarthy</surname> <given-names>E. M.</given-names>
</name>
<name>
<surname>McDonald</surname> <given-names>J. F.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>LTR_STRUC: a novel search and identification program for LTR retrotransposons</article-title>. <source>Bioinformatics</source> <volume>19</volume>, <fpage>362</fpage>&#x2013;<lpage>367</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btf878</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mokhtar</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Abd-Elhalim</surname> <given-names>H. M.</given-names>
</name>
<name>
<surname>El Allali</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2023</year>a). <article-title>A large-scale assessment of the quality of plant genome assemblies using the LTR assembly index</article-title>. <source>AoB Plants</source> <volume>15</volume> (<issue>3</issue>). doi:&#xa0;<pub-id pub-id-type="doi">10.1093/aobpla/plad015</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mokhtar</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Alsamman</surname> <given-names>A. M.</given-names>
</name>
<name>
<surname>Abd-Elhalim</surname> <given-names>H. M.</given-names>
</name>
<name>
<surname>El Allali</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>CicerSpTEdb: a web-based database for high-resolution genome-wide identification of transposable elements in <italic>Cicer</italic> species</article-title>. <source>PloS One</source> <volume>16</volume>, <elocation-id>e0259540</elocation-id>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0259540</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mokhtar</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Alsamman</surname> <given-names>A. M.</given-names>
</name>
<name>
<surname>El Allali</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2023</year>b). <article-title>PlantLTRdb: an interactive database for 195 plant species LTR-retrotransposons</article-title>. <source>Front. Plant Sci.</source> <volume>14</volume>, <elocation-id>1134627</elocation-id>. doi: <pub-id pub-id-type="doi">10.3389/fpls.2023.1134627</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mokhtar</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>El Allali</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>PltRNAdb: plant transfer RNA database</article-title>. <source>PloS One</source> <volume>17</volume>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pone.0268904</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neumann</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Nov&#xe1;k</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Ho&#x161;t&#xe1;kov&#xe1;</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Macas</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Systematic survey of plant LTR-retrotransposons elucidates phylogenetic relationships of their polyprotein domains and provides a reference for element classification</article-title>. <source>Mobile DNA</source> <volume>10</volume>, <elocation-id>1</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s13100-018-0144-1</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Orozco-Arias</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Lopez-Murillo</surname> <given-names>L. H.</given-names>
</name>
<name>
<surname>Candamil-Cort&#xe9;s</surname> <given-names>M. S.</given-names>
</name>
<name>
<surname>Arias</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Jaimes</surname> <given-names>P. A.</given-names>
</name>
<name>
<surname>Rossi Paschoal</surname> <given-names>A.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes</article-title>. <source>Briefings Bioinf</source>. <volume>24</volume>:<elocation-id>bbac511</elocation-id>. doi: <pub-id pub-id-type="doi">10.1093/bib/bbac511</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ou</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Assessing genome assembly quality using the LTR Assembly Index (LAI)</article-title>. <source>Nucleic Acids Res.</source> <volume>46</volume>, <fpage>e126</fpage>&#x2013;<lpage>e126</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky730</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ou</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>LTR_retriever: A highly accurate and sensitive program for identification of long terminal repeat retrotransposons</article-title>. <source>Plant Physiol.</source> <volume>176</volume>, <fpage>1410</fpage>&#x2013;<lpage>1422</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1104/pp.17.01310</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ou</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>LTR_FINDER_parallel: parallelization of LTR_FINDER enabling rapid identification of long terminal repeat retrotransposons</article-title>. <source>Mobile DNA</source> <volume>10</volume>, <fpage>1</fpage>&#x2013;<lpage>3</lpage>. doi: <pub-id pub-id-type="doi">10.1186/s13100-019-0193-0</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ou</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Su</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Liao</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Chougule</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Agda</surname> <given-names>J. R.</given-names>
</name>
<name>
<surname>Hellinga</surname> <given-names>A. J.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline</article-title>. <source>Genome Biol.</source> <volume>20</volume>, <fpage>1</fpage>&#x2013;<lpage>18</lpage>. doi: <pub-id pub-id-type="doi">10.1186/s13059-019-1905-y</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pereira</surname> <given-names>V.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Automated paleontology of repetitive DNA with REANNOTATE</article-title>. <source>BMC Genomics</source> <volume>9</volume>, <elocation-id>614</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/1471-2164-9-614</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pereira</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Plant abiotic stress challenges from the changing environment</article-title>. <source>Front. Plant Sci.</source> <volume>7</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fpls.2016.01123</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schnable</surname> <given-names>P. S.</given-names>
</name>
<name>
<surname>Ware</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Fulton</surname> <given-names>R. S.</given-names>
</name>
<name>
<surname>Stein</surname> <given-names>J. C.</given-names>
</name>
<name>
<surname>Wei</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Pasternak</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2009</year>). <article-title>The B73 maize genome: complexity, diversity, and dynamics</article-title>. <source>Science</source> <volume>326</volume>, <fpage>1112</fpage>&#x2013;<lpage>1115</lpage>. doi: <pub-id pub-id-type="doi">10.1126/science.1178534</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shi</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>C.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Generic Repeat Finder: a high-sensitivity tool for genome-wide <italic>de novo</italic> repeat detection</article-title>. <source>Plant Physiol.</source> <volume>180</volume>, <fpage>1803</fpage>&#x2013;<lpage>1815</lpage>. doi: <pub-id pub-id-type="doi">10.1104/pp.19.00386</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Smit</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Hubley</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Green</surname> <given-names>P.</given-names>
</name>
</person-group> (<year>2015</year>) <source>RepeatModeler Open-1.0. 2008&#x2013;2015</source> (<publisher-loc>Seattle, USA</publisher-loc>: <publisher-name>Institute for Systems Biology</publisher-name>) (Accessed <access-date>May 1, 2018</access-date>). http://www.repeatmasker.org.</citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Steinbiss</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Willhoeft</surname> <given-names>U.</given-names>
</name>
<name>
<surname>Gremme</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Kurtz</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Fine-grained annotation and classification of <italic>de novo</italic> predicted LTR retrotransposons</article-title>. <source>Nucleic Acids Res.</source> <volume>37</volume>, <fpage>7002</fpage>&#x2013;<lpage>7013</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkp759</pub-id>
</citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Su</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Gu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Peterson</surname> <given-names>T.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>TIR-Learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome</article-title>. <source>Mol. Plant</source> <volume>12</volume>, <fpage>447</fpage>&#x2013;<lpage>460</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.molp.2019.02.008</pub-id>
</citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tanskanen</surname> <given-names>J. A.</given-names>
</name>
<name>
<surname>Sabot</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Vicient</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Schulman</surname> <given-names>A. H.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Life without GAG: The BARE-2 retrotransposon as a parasite&#x2019;s parasite</article-title>. <source>Gene</source> <volume>390</volume>, <fpage>166</fpage>&#x2013;<lpage>174</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.gene.2006.09.009</pub-id>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thompson</surname> <given-names>J. D.</given-names>
</name>
<name>
<surname>Gibson</surname> <given-names>T. J.</given-names>
</name>
<name>
<surname>Higgins</surname> <given-names>D. G.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Multiple sequence alignment using ClustalW and ClustalX</article-title>. <source>Curr. Protoc. Bioinf.</source> <volume>00</volume>, <fpage>2.3.1</fpage>&#x2013;<lpage>2.3.22</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1002/0471250953.bi0203s00</pub-id>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ustyantsev</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Novikova</surname> <given-names>O.</given-names>
</name>
<name>
<surname>Blinov</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Smyshlyaev</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Convergent evolution of ribonuclease H in LTR retrotransposons and retroviruses</article-title>. <source>Mol. Biol. Evol.</source> <volume>32</volume>, <fpage>1197</fpage>&#x2013;<lpage>1207</lpage>. doi: <pub-id pub-id-type="doi">10.1093/molbev/msv008</pub-id>
</citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Valencia</surname> <given-names>J. D.</given-names>
</name>
<name>
<surname>Girgis</surname> <given-names>H. Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>LtrDetector: A tool-suite for detecting long terminal repeat retrotransposons <italic>de-novo</italic></article-title>. <source>BMC Genomics</source> <volume>20</volume>, <fpage>1</fpage>&#x2013;<lpage>14</lpage>. doi: <pub-id pub-id-type="doi">10.1186/s12864-019-5796-9</pub-id>
</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vitte</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Fustier</surname> <given-names>M.-A.</given-names>
</name>
<name>
<surname>Alix</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Tenaillon</surname> <given-names>M. I.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>The bright side of transposons in crop evolution</article-title>. <source>Briefings Funct. Genomics</source> <volume>13</volume>, <fpage>276</fpage>&#x2013;<lpage>295</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/bfgp/elu002</pub-id>
</citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vitte</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Panaud</surname> <given-names>O.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>LTR retrotransposons and flowering plant genome size: emergence of the increase/decrease model</article-title>. <source>Cytogenetic Genome Res.</source> <volume>110</volume>, <fpage>91</fpage>&#x2013;<lpage>107</lpage>. doi: <pub-id pub-id-type="doi">10.1159/000084941</pub-id>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wheeler</surname> <given-names>D. L.</given-names>
</name>
<name>
<surname>Barrett</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Benson</surname> <given-names>D. A.</given-names>
</name>
<name>
<surname>Bryant</surname> <given-names>S. H.</given-names>
</name>
<name>
<surname>Canese</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Chetvernin</surname> <given-names>V.</given-names>
</name>
<etal/>
</person-group>. (<year>2007</year>). <article-title>Database resources of the National Center for Biotechnology Information</article-title>. <source>Nucleic Acids Res.</source> <volume>36</volume>, <fpage>D13</fpage>&#x2013;<lpage>D21</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkm1000</pub-id>
</citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wheeler</surname> <given-names>T. J.</given-names>
</name>
<name>
<surname>Eddy</surname> <given-names>S. R.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>nhmmer: DNA homology search with profile HMMs</article-title>. <source>Bioinformatics</source> <volume>29</volume>, <fpage>2487</fpage>&#x2013;<lpage>2489</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btt403</pub-id>
</citation>
</ref>
<ref id="B53">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wicker</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Matthews</surname> <given-names>D. E.</given-names>
</name>
<name>
<surname>Keller</surname> <given-names>B.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>TREP: a database for Triticeae repetitive elements</article-title>. <source>Trends Plant Sci.</source> <volume>7</volume>, <page-range>561&#x2013;562</page-range>. doi: <pub-id pub-id-type="doi">10.1016/S1360-1385(02)02372-5</pub-id>
</citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wicker</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Sabot</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Hua-Van</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Bennetzen</surname> <given-names>J. L.</given-names>
</name>
<name>
<surname>Capy</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Chalhoub</surname> <given-names>B.</given-names>
</name>
<etal/>
</person-group>. (<year>2007</year>). <article-title>A unified classification system for eukaryotic transposable elements</article-title>. <source>Nat. Rev. Genet.</source> <volume>8</volume>, <fpage>973</fpage>&#x2013;<lpage>982</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nrg2165</pub-id>
</citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Witte</surname> <given-names>C.-P.</given-names>
</name>
<name>
<surname>Le</surname> <given-names>Q. H.</given-names>
</name>
<name>
<surname>Bureau</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Kumar</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Terminal-repeat retrotransposons in miniature (TRIM) are involved in restructuring plant genomes</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>98</volume>, <fpage>13778</fpage>&#x2013;<lpage>13783</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.241341898</pub-id>
</citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xia</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Tong</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Hou</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>An</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>Q.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>The reference genome of tea plant and resequencing of 81 diverse accessions provide insights into its genome evolution and adaptation</article-title>. <source>Mol. Plant</source> <volume>13</volume>, <fpage>1013</fpage>&#x2013;<lpage>1026</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.molp.2020.04.010</pub-id>
</citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xiong</surname> <given-names>W.</given-names>
</name>
<name>
<surname>He</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Lai</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Dooner</surname> <given-names>H. K.</given-names>
</name>
<name>
<surname>Du</surname> <given-names>C.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>111</volume>, <fpage>10263</fpage>&#x2013;<lpage>10268</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.1410068111</pub-id>
</citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons</article-title>. <source>Nucleic Acids Res.</source> <volume>35</volume>, <fpage>W265</fpage>&#x2013;<lpage>W268</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1093/nar/gkm286</pub-id>
</citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>R.-G.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>G.-Y.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>X.-L.</given-names>
</name>
<name>
<surname>Dainat</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Z.-X.</given-names>
</name>
<name>
<surname>Ou</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes</article-title>. <source>Hortic. Res.</source> <volume>9</volume>, <elocation-id>uhac017</elocation-id>. doi: <pub-id pub-id-type="doi">10.1093/hr/uhac017</pub-id>
</citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Ferguson</surname> <given-names>A. A.</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>What makes up plant genomes: The vanishing line between transposable elements and genes</article-title>. <source>Biochim. Biophys. Acta (BBA)-Gene Regul. Mech.</source> <volume>1859</volume>, <fpage>366</fpage>&#x2013;<lpage>380</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.bbagrm.2015.12.005</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>