<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2017.01410</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Complete Genome Analysis of <italic>Thermus parvatiensis</italic> and Comparative Genomics of <italic>Thermus</italic> spp. Provide Insights into Genetic Variability and Evolution of Natural Competence as Strategic Survival Attributes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Tripathi</surname> <given-names>Charu</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/461303/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Mishra</surname> <given-names>Harshita</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/461060/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Khurana</surname> <given-names>Himani</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/461674/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Dwivedi</surname> <given-names>Vatsala</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/461285/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Kamra</surname> <given-names>Komal</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Negi</surname> <given-names>Ram K.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Lal</surname> <given-names>Rup</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/300271/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Zoology, University of Delhi</institution> <country>New Delhi, India</country></aff>
<aff id="aff2"><sup>2</sup><institution>Ciliate Biology Laboratory, Sri Guru Tegh Bahadur Khalsa College, University of Delhi</institution> <country>New Delhi, India</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Frank T. Robb, University of Maryland, Baltimore, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Yutaka Kawarabayasi, Kyushu University, Japan; Jeremy Dodsworth, California State University, San Bernardino, United States</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Rup Lal <email>ruplal&#x00040;gmail.com</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology</p></fn></author-notes>
<pub-date pub-type="epub">
<day>27</day>
<month>07</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="collection">
<year>2017</year>
</pub-date>
<volume>8</volume>
<elocation-id>1410</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>04</month>
<year>2017</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>07</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2017 Tripathi, Mishra, Khurana, Dwivedi, Kamra, Negi and Lal.</copyright-statement>
<copyright-year>2017</copyright-year>
<copyright-holder>Tripathi, Mishra, Khurana, Dwivedi, Kamra, Negi and Lal</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Thermophilic environments represent an interesting niche. Among thermophiles, <italic>Thermus</italic> is one of the most studied genera. In this study, we sequenced the genome of <italic>Thermus parvatiensis</italic> strain RL, a thermophile isolated from Himalayan hot water springs (temperature &#x0003E;96&#x000B0;C), using the PacBio RSII SMRT technique. The small genome (2.01 Mbp) comprises a chromosome (1.87 Mbp) and a plasmid (143 Kbp), designated in this study as pTP143. Annotation revealed a high number of repair genes and a compact genome carrying a highly plastic plasmid with transposases, integrases, mobile elements and hypothetical proteins (44%). We performed a comparative genomic study of the genus <italic>Thermus</italic> with the aim of analyzing the phylogenetic relatedness as well as the niche-specific attributes prevalent among its members. We compared the reference genome RL with 16 <italic>Thermus</italic> genomes to assess their phylogenetic relationships based on 16S rRNA gene sequences, average nucleotide identity (ANI), conserved marker genes (31 and 400), pan genome and tetranucleotide frequency. The core genome comprised 1,177 genes, and many singleton genes were detected in individual genomes, reflecting a conserved core but an adaptive pan repertoire. We demonstrated the presence of metagenomic islands (chromosome: 5; plasmid: 5) by recruiting raw metagenomic data (from the same niche) against the genomic replicons of <italic>T. parvatiensis</italic>. We also dissected the CRISPR loci across all genomes and found this system to be widespread among <italic>Thermus</italic> genomes. 
Additionally, we performed a comparative analysis of competence loci across <italic>Thermus</italic> genomes and found evidence for recent horizontal acquisition of the locus and its continued dispersal among members, suggesting that natural competence is a beneficial survival trait in <italic>Thermus</italic> and that its acquisition reflects ongoing evolution toward optimal fitness.</p></abstract>
<kwd-group>
<kwd>thermophiles</kwd>
<kwd><italic>Thermus parvatiensis</italic></kwd>
<kwd>CRISPR</kwd>
<kwd><italic>Pilus</italic> genes</kwd>
<kwd>natural transformation</kwd>
<kwd>phage resistance</kwd>
</kwd-group>
<contract-num rid="cn001">BT/PR15118/BCE/8/1141/2015</contract-num>
<contract-num rid="cn002">NBAIM/AMAAS/2014-17/PF/9</contract-num>
<contract-sponsor id="cn001">Department of Biotechnology, Ministry of Science and Technology<named-content content-type="fundref-id">10.13039/501100001407</named-content></contract-sponsor>
<contract-sponsor id="cn002">National Bureau of Agriculturally Important Microorganisms<named-content content-type="fundref-id">10.13039/501100005912</named-content></contract-sponsor>
<counts>
<fig-count count="8"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="98"/>
<page-count count="22"/>
<word-count count="14532"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>The genus <italic>Thermus</italic> belongs to the vast group of extreme thermophiles that have attracted sustained biochemical and industrial attention. The discovery of <italic>Thermus aquaticus</italic> in 1969 (Brock and Freeze, <xref ref-type="bibr" rid="B9">1969</xref>) and the subsequent multi-billion-dollar industry built on <italic>Taq</italic> DNA polymerase have revolutionized the field of extremophile research. Not only do extremophiles provide an understanding of life in extreme habitats, but they also serve as model organisms for studying protein structure and function. Members of this genus have been isolated from hot water springs all over the world (Chung et al., <xref ref-type="bibr" rid="B16">2000</xref>; Ming et al., <xref ref-type="bibr" rid="B57">2014</xref>). The proteins encoded by <italic>Thermus</italic> spp. are highly stable and have been used in various industries, DNA polymerases (Carballeira et al., <xref ref-type="bibr" rid="B13">1990</xref>; Engelke et al., <xref ref-type="bibr" rid="B27">1990</xref>; Rao and Saunders, <xref ref-type="bibr" rid="B67">1992</xref>) being chief among them, along with xylanases (Blank et al., <xref ref-type="bibr" rid="B8">2014</xref>), amylases (Shaw et al., <xref ref-type="bibr" rid="B79">1995</xref>), lipases (Kretza et al., <xref ref-type="bibr" rid="B44">2012</xref>) and many other enzymes. The genus is also well known for bioremediation of heavy metals, which lowers toxicity at heavy metal contaminated sites. <italic>T. scotoductus</italic>, for instance, has been shown to reduce Cr(VI) aerobically (Opperman and van Heerden, <xref ref-type="bibr" rid="B64">2006</xref>).</p>
<p>The phylogenetic relationships amongst the members of the genus have been as dynamic as the genetic constitution of its members (Kumwenda et al., <xref ref-type="bibr" rid="B45">2014</xref>). Members of the genus <italic>Meiothermus</italic>, which were earlier classified in the genus <italic>Thermus</italic>, were later moved to a new genus created to accommodate this moderately thermophilic and strictly aerobic group (Nobre et al., <xref ref-type="bibr" rid="B61">1996</xref>). The members of the genus <italic>Thermus</italic> are part of the phylum Deinococcus-Thermus, along with <italic>Deinococcus, Deinobacter, Deinobacterium, Truepera, Marinithermus, Meiothermus, Oceanithermus, Rhabdothermus</italic>, and <italic>Vulcanithermus</italic>. Members of the genus <italic>Thermus</italic> form yellow to orange-yellow colonies and generally have small genomes of less than 2.5 Mb, with extrachromosomal elements commonly present (Henne et al., <xref ref-type="bibr" rid="B36">2004</xref>; Bruggemann and Chen, <xref ref-type="bibr" rid="B11">2006</xref>). Previous genome analyses of the group have revealed the highly plastic nature of the genomes of <italic>Thermus</italic> species, with plasmids and megaplasmids being the centers of such plasticity (Bruggemann and Chen, <xref ref-type="bibr" rid="B11">2006</xref>). Genome-wide rearrangements have been instrumental in shaping the genomes of these thermophiles (Kumwenda et al., <xref ref-type="bibr" rid="B45">2014</xref>). Elements that are considered to belong to the mobilome, i.e., insertion sequence (IS) elements, transposons and prophages, occur widely in the genomes of <italic>Thermus</italic> species (Kumwenda et al., <xref ref-type="bibr" rid="B45">2014</xref>). In addition, thermophilic organisms are known to thrive under viral selection pressure. 
Selective forces continually acting in extreme environments bring about a strong evolutionary streamlining of these genomes (Sabath et al., <xref ref-type="bibr" rid="B73">2013</xref>).</p>
<p>Among the drivers of evolution, genetic recombination, rearrangement, horizontal gene transfer, conjugation, transformation and mutation are key players for this genus (Kumwenda et al., <xref ref-type="bibr" rid="B45">2014</xref>). One of the most likely explanations for this is the highly evolved competence system of <italic>Thermus</italic>, which serves as an efficient apparatus for the uptake of foreign genetic material from the environment (Lorenz and Wackernagel, <xref ref-type="bibr" rid="B52">1994</xref>). In most organisms, competence is not constitutive but is tightly controlled by factors related to the cell cycle (induced/artificial competence). In contrast, some organisms (including <italic>Thermus</italic> species) are constitutively competent (Friedrich et al., <xref ref-type="bibr" rid="B29">2002</xref>; reviewed by Averhoff, <xref ref-type="bibr" rid="B5">2009</xref>). The exact mechanism of uptake of free DNA from the environment varies from species to species. In the case of <italic>Thermus</italic> species, the type IV pilus (T4P) system has been implicated in natural transformation, although the link between piliation and natural transformation remains unclear. Amongst all naturally competent species, <italic>Thermus thermophilus</italic> HB27 has the most efficient (40 kb/s per cell) natural competence system and a robust, non-selective competence machinery (Averhoff, <xref ref-type="bibr" rid="B5">2009</xref>). The development of competence machinery in <italic>Thermus</italic> is of great evolutionary significance and explains the dynamism of <italic>Thermus</italic> genomes. We provide a brief introduction to the genes involved in imparting natural competence, followed by a genus-wide analysis of competence loci in <italic>Thermus</italic>.</p>
<p>The genus <italic>Thermus</italic> comprises 17 validly published species, the genome sequences of which are available in public databases. <italic>T. parvatiensis</italic> strain RL<sup>T</sup> (Dwivedi et al., <xref ref-type="bibr" rid="B23">2015</xref>) was isolated from a hot water spring located in the Himalayan ranges (altitude &#x0007E;1,700 m) at Manikaran, India. The hot spring water has a high temperature (90&#x02013;98&#x000B0;C) (Dwivedi et al., <xref ref-type="bibr" rid="B24">2012</xref>) and circum-neutral pH. The low O<sub>2</sub> potential (4.8 &#x000B1; 0.2 cm<sup>3</sup> STP/L), low dissolved CO<sub>2</sub> (14.7 &#x000B1; 0.1 cm<sup>3</sup> STP/L) and high concentration of arsenic (140 ppb) (Sangwan et al., <xref ref-type="bibr" rid="B75">2015</xref>) prevalent at the niche provide further strong selection pressures. <italic>T. parvatiensis</italic> forms yellow colonies on polypeptone yeast extract agar at 60&#x02013;80&#x000B0;C and demonstrates protease activity (Dwivedi et al., <xref ref-type="bibr" rid="B23">2015</xref>). Previously, strain RL was sequenced using the Roche 454 GS (FLX Titanium) system and Sanger shotgun sequencing. The raw data generated were assembled into 17 contigs (Dwivedi et al., <xref ref-type="bibr" rid="B24">2012</xref>). In order to fill the gaps and generate a complete genome record, we determined the entire genome sequence using the single-molecule real-time (SMRT) sequencing method. Here, we present the complete genome of <italic>T. parvatiensis</italic> and perform a comparative genomic analysis of available <italic>Thermus</italic> genomes. Our study is designed to uncover the phylogenetic relatedness among members based on phylogenomic methods, the core-pan genome structure, and conserved genome features with the help of metagenomic recruitments. 
Further, we analyzed genus-specific evolutionary dynamics, which are facilitated by the highly efficient natural competence system of this genus, and their possible link with the predominance of viral signatures found in these genomes.</p>
</sec>
<sec sec-type="materials and methods" id="s2">
<title>Materials and methods</title>
<sec>
<title>Genome sequencing, assembly, and annotation of <italic>T. parvatiensis</italic> replicons</title>
<p>SMRT genome sequencing was performed using the PacBio RSII system at McGill University and Genome Quebec Innovation Centre, Canada. Genomic DNA was extracted using the CTAB method (Doyle and Doyle, <xref ref-type="bibr" rid="B22">1990</xref>), followed by quality assessment on a gel and quantification with an ND1000 Nanodrop spectrophotometer. Sheared large insert library preparation was followed by generation of raw reads with an average read length of 9,878 nt. A total of 2,488 GB raw data was generated with 224,211 reads encompassing 857,926,800 bases with an average sequencing depth of 428 &#x000D7; (Supplementary Table <xref ref-type="supplementary-material" rid="SM1">1</xref>). <italic>De novo</italic> assembly was performed at Genome Quebec, Canada, using the HGAP assembler (Chin et al., <xref ref-type="bibr" rid="B15">2013</xref>) (coverage cut-off 30 &#x000D7;). The assembly was validated by aligning raw reads onto the finished contigs using the Burrows-Wheeler Aligner version 0.7.9a (Li and Durbin, <xref ref-type="bibr" rid="B51">2009</xref>). Visual inspection of the assembly was performed using Tablet version 1.14.04.10 (Milne et al., <xref ref-type="bibr" rid="B56">2010</xref>). The ends of contigs were searched for overlaps using the formatdb and BLAST functions of Ugene (Okonechnikov et al., <xref ref-type="bibr" rid="B63">2012</xref>). Circularized replicons were uploaded to the RAST server (Aziz et al., <xref ref-type="bibr" rid="B6">2008</xref>) for general genome annotation. RNAmmer version 1.2 (Lagesan et al., <xref ref-type="bibr" rid="B47">2007</xref>) was used to detect rRNA operons. The genome was scanned for phages using the online tools PHAST (Zhou et al., <xref ref-type="bibr" rid="B98">2011</xref>) and PHASTER (Arndt et al., <xref ref-type="bibr" rid="B3">2016</xref>). 
For detailed annotation of the phage regions, the analysis was extended to include probable phages and associated regions using the Phage Search Tool against viral and prophage databases (<ext-link ext-link-type="uri" xlink:href="http://www.phantome.org/Downloads/">http://www.phantome.org/Downloads/</ext-link>). The PHAST tool was also used to assess the completeness of the phage genome, followed by annotation using BLASTx against the ORFs predicted from prophage databases. The online tool Aragorn (Laslett and Canback, <xref ref-type="bibr" rid="B49">2004</xref>) was used to detect tRNAs in the genome. The WebMGA (Wu et al., <xref ref-type="bibr" rid="B91">2011</xref>) server was used for general COG category assignment. The origin of replication on the chromosome was identified with the Ori-Finder server (Gao and Zhang, <xref ref-type="bibr" rid="B30">2008</xref>), which integrates Z-curve analysis, DnaA box location and the genes surrounding the OriC. The <italic>T. parvatiensis</italic> genome was also searched against the DnaA box database to locate the origin of replication; for this, the DnaA box repeat sequence (TGTGGATAA) of <italic>T. thermophilus</italic> (the closest relative of <italic>T. parvatiensis</italic>) was used as the reference to guide the BLAST search.</p>
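<p>As a rough illustration of the motif-guided origin search described above, the following minimal Python sketch locates exact copies of the <italic>T. thermophilus</italic> repeat (TGTGGATAA) and of its reverse complement in a nucleotide string. It is not the Ori-Finder or BLAST procedure used in this study: exact string matching stands in for the similarity search, the function names (<monospace>reverse_complement</monospace>, <monospace>find_motif</monospace>) and the toy sequence are hypothetical, and matches spanning the junction of a circular replicon are ignored.</p>
<preformat>
```python
# Minimal sketch (assumption: exact matching replaces the BLAST-based search).
# All names and the toy sequence below are hypothetical illustrations.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="TGTGGATAA"):
    """Return sorted (position, strand) tuples for exact motif matches.

    Positions are 0-based offsets on the forward strand. A "-" hit means
    the reverse complement of the motif occurs at that offset.
    Matches that wrap around the end of a circular sequence are not reported.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one base so overlapping matches are also found.
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: one forward copy and one reverse-complement copy.
    toy = "ACGT" + "TGTGGATAA" + "GGCC" + "TTATCCACA" + "AT"
    print(find_motif(toy))  # [(4, '+'), (17, '-')]
```
</preformat>
<p>In such a sketch, clusters of hits on either strand would only mark candidate regions; they would still need to be weighed against the Z-curve and gene-context evidence.</p>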
</sec>
<sec>
<title>Phylogenomic assessments</title>
<p>For comparative analyses, sequenced genomes of the genus <italic>Thermus</italic> (17 genomes) were downloaded from the NCBI GenBank database. The genomes included in this study are <italic>T. thermophilus</italic> HB27 (Henne et al., <xref ref-type="bibr" rid="B36">2004</xref>), <italic>T. thermophilus</italic> HB8, <italic>T. parvatiensis</italic> RL (Dwivedi et al., <xref ref-type="bibr" rid="B24">2012</xref>, <xref ref-type="bibr" rid="B23">2015</xref>), <italic>T. scotoductus</italic> SA-01 (Gounder et al., <xref ref-type="bibr" rid="B31">2011</xref>), <italic>T. oshimai</italic> JL-2 (Murugapiran et al., <xref ref-type="bibr" rid="B60">2013</xref>), <italic>Thermus</italic> species CCB_US3_UF1 (Teh et al., <xref ref-type="bibr" rid="B83">2012</xref>), <italic>T. aquaticus</italic> Y51MC23, <italic>T. antranikianii</italic> HN3-7, <italic>T. filiformis</italic> Wai33 A1, <italic>T. thermophilus</italic> JL-18 (Murugapiran et al., <xref ref-type="bibr" rid="B60">2013</xref>), <italic>T. thermophilus</italic> SG0.5JP17-16, <italic>T. islandicus</italic> PRI 3838, <italic>T. caliditerrae</italic> YIM 77777, <italic>T. igniterrae</italic> RF-4, <italic>T. amyloliquefaciens</italic> YIM 77409 (Yu et al., <xref ref-type="bibr" rid="B94">2015</xref>; Zhou et al., <xref ref-type="bibr" rid="B97">2016</xref>), <italic>T. tengchongensis</italic> YIM 77401 (Mefferd et al., <xref ref-type="bibr" rid="B55">2016</xref>), and <italic>T. brockianus</italic> GE-1. Accession numbers and other general features of the genomes included in this study are listed in Table <xref ref-type="table" rid="T1">1</xref>. Among the genomes selected for comparison, 10 were complete genomes and seven were draft genomes. All complete genomes were found to harbor 1&#x02013;4 plasmid(s), which were downloaded as separate sequences.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>General genome features of organisms belonging to the genus <italic>Thermus</italic>.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="left"><bold>Strain</bold></th>
<th valign="top" align="left"><bold>NCBI Accession No.</bold></th>
<th valign="top" align="left"><bold>Source of Isolation</bold></th>
<th valign="top" align="left"><bold>Genome Size (bp)</bold></th>
<th valign="top" align="left"><bold>Plasmid(s) (size, bp)</bold></th>
<th valign="top" align="center"><bold>G&#x0002B;C (%)</bold></th>
<th valign="top" align="center"><bold>Predicted CDS</bold></th>
<th valign="top" align="center"><bold>tRNA</bold></th>
<th valign="top" align="center"><bold>rRNA operons</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>T. parvatiensis</italic></td>
<td valign="top" align="left">RL<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP014141">CP014141</ext-link></td>
<td valign="top" align="left">Hot spring, India</td>
<td valign="top" align="left"><underline>2,016,098</underline></td>
<td valign="top" align="left">pTP143 (143,277)</td>
<td valign="top" align="center">68.5</td>
<td valign="top" align="center">2,383</td>
<td valign="top" align="center">54</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic></td>
<td valign="top" align="left">HB27<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="AE017221">AE017221</ext-link></td>
<td valign="top" align="left">Hot spring, Japan</td>
<td valign="top" align="left">2,127,482</td>
<td valign="top" align="left">pTT27 (232,605)</td>
<td valign="top" align="center"><bold>69.4</bold></td>
<td valign="top" align="center">2,244</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic></td>
<td valign="top" align="left">HB8<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="AP008226">AP008226</ext-link></td>
<td valign="top" align="left">Hot spring, Japan</td>
<td valign="top" align="left">2,197,207</td>
<td valign="top" align="left">pTT27 (256,992), pTT8 (9,322), pVV8 (81,151)</td>
<td valign="top" align="center"><bold>69.4</bold></td>
<td valign="top" align="center">2,268</td>
<td valign="top" align="center">48</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic></td>
<td valign="top" align="left">JL-18<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP003252">CP003252</ext-link></td>
<td valign="top" align="left">Great Boiling Spring, USA</td>
<td valign="top" align="left">2,311,212</td>
<td valign="top" align="left">pTTJL1801 (265,886), pTTJL1802 (142,731)</td>
<td valign="top" align="center">69.0</td>
<td valign="top" align="center">2,424</td>
<td valign="top" align="center">52</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic></td>
<td valign="top" align="left">SG0.5JP17-16<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP002777">CP002777</ext-link></td>
<td valign="top" align="left">Hot Spring</td>
<td valign="top" align="left">2,303,227</td>
<td valign="top" align="left">pTHTHE1601 (<bold>440,026</bold>)</td>
<td valign="top" align="center">68.6</td>
<td valign="top" align="center">2,405</td>
<td valign="top" align="center">53</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. scotoductus</italic></td>
<td valign="top" align="left">SA-01<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP001962">CP001962</ext-link></td>
<td valign="top" align="left">Fissure water, South Africa</td>
<td valign="top" align="left">2,355,186</td>
<td valign="top" align="left">pTSC8 (<underline>8,383</underline>)</td>
<td valign="top" align="center">64.9</td>
<td valign="top" align="center">2,514</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. oshimai</italic></td>
<td valign="top" align="left">JL-2<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP003249">CP003249</ext-link></td>
<td valign="top" align="left">Great Boiling Spring, USA</td>
<td valign="top" align="left">2,401,329</td>
<td valign="top" align="left">pTHEOS01 (271,713), pTHEOS02 (57,223)</td>
<td valign="top" align="center">68.6</td>
<td valign="top" align="center">2,521</td>
<td valign="top" align="center">59</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T</italic>. sp.</td>
<td valign="top" align="left">CCB_US3_UF1<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP003126">CP003126</ext-link></td>
<td valign="top" align="left">Hot spring, Malaysia</td>
<td valign="top" align="left">2,263,488</td>
<td valign="top" align="left">pTCCB09 (19,716)</td>
<td valign="top" align="center">68.6</td>
<td valign="top" align="center">2,228</td>
<td valign="top" align="center">48</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. aquaticus</italic></td>
<td valign="top" align="left">Y51MC23<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP010822">CP010822</ext-link></td>
<td valign="top" align="left">Hot spring, USA</td>
<td valign="top" align="left">2,338,641</td>
<td valign="top" align="left">pTA14 (14,448), pTA16 (16,597), pTA69 (69,906), pTA78 (78,727)</td>
<td valign="top" align="center">68.0</td>
<td valign="top" align="center">2,436</td>
<td valign="top" align="center">55</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. brockianus</italic></td>
<td valign="top" align="left">GE-1<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP016312">CP016312</ext-link></td>
<td valign="top" align="left">Kamchatka, Russia</td>
<td valign="top" align="left">2,388,273</td>
<td valign="top" align="left">pTB1 (342,792), pTB2 (10,299)</td>
<td valign="top" align="center">66.9</td>
<td valign="top" align="center">2,789</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. antranikianii</italic></td>
<td valign="top" align="left">DSM 12462 (HN3-7)</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="AUIW01000000">AUIW01000000</ext-link></td>
<td valign="top" align="left">Hot spring, Iceland</td>
<td valign="top" align="left">2,163,625</td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center"><underline>64.8</underline></td>
<td valign="top" align="center">2,321</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. filiformis</italic></td>
<td valign="top" align="left">ATCC 43280 (Wai33 A1)</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="JPSL02000000">JPSL02000000</ext-link></td>
<td valign="top" align="left">Hot spring, New Zealand</td>
<td valign="top" align="left">2,386,081</td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center">69.0</td>
<td valign="top" align="center">2,338</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. islandicus</italic></td>
<td valign="top" align="left">DSM 21543 (PRI 3838)</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="ATXJ01000000">ATXJ01000000</ext-link></td>
<td valign="top" align="left">Hot spring, Iceland</td>
<td valign="top" align="left">2,263,010</td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center">68.4</td>
<td valign="top" align="center">2,470</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. igniterrae</italic></td>
<td valign="top" align="left">ATCC 700962 (RF-4)</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="AQWU01000000">AQWU01000000</ext-link></td>
<td valign="top" align="left">Hot spring, Iceland</td>
<td valign="top" align="left">2,225,983</td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center">68.8</td>
<td valign="top" align="center">2,379</td>
<td valign="top" align="center">43</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. caliditerrae</italic></td>
<td valign="top" align="left">YIM 77777</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="JQNC01000000">JQNC01000000</ext-link></td>
<td valign="top" align="left">Hot spring, China</td>
<td valign="top" align="left">2,218,114</td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center">67.2</td>
<td valign="top" align="center">2,327</td>
<td valign="top" align="center">50</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. amyloliquefaciens</italic></td>
<td valign="top" align="left">YIM 77409</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="JQMV00000000">JQMV00000000</ext-link></td>
<td valign="top" align="left">Hot spring, China</td>
<td valign="top" align="left">2,160,855</td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center">67.4</td>
<td valign="top" align="center">2,313</td>
<td valign="top" align="center">48</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. tengchongensis</italic></td>
<td valign="top" align="left">YIM 77401</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="JQLK01000000">JQLK01000000</ext-link></td>
<td valign="top" align="left">Geothermally heated soil, China</td>
<td valign="top" align="left"><bold>2,562,314</bold></td>
<td valign="top" align="left">ND</td>
<td valign="top" align="center">66.4</td>
<td valign="top" align="center">2,750</td>
<td valign="top" align="center">47</td>
<td valign="top" align="center">2</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Smallest genome/plasmid sequences and GC content are underlined whereas the largest are in bold</italic>.</p>
<fn id="TN1">
<label>&#x0002A;</label>
<p><italic>Complete genome; ND-Not determined</italic>.</p></fn>
</table-wrap-foot>
</table-wrap>
<p>Phylogenetic analysis was first performed on 16S rRNA gene sequences, which were retrieved from the respective genomes using the RNAmmer version 1.2 server (Lagesen et al., <xref ref-type="bibr" rid="B47">2007</xref>). Multiple sequence alignment was performed using Muscle (Edgar, <xref ref-type="bibr" rid="B25">2004</xref>), and unaligned sequences were trimmed from the edges. A phylogenetic tree was constructed using the maximum-likelihood (ML) algorithm (Felsenstein, <xref ref-type="bibr" rid="B28">1993</xref>) implemented in MEGA version 6 (Tamura et al., <xref ref-type="bibr" rid="B82">2013</xref>). Although the 16S rRNA gene is a well-established marker for tracing phylogeny, dependence on a single gene may lead to biased phylogenetic inferences. Hence, evolutionary relationships were also reconstructed using multiple conserved marker genes extracted from the genomes. For this, 31 conserved bacterial single copy genes were extracted from each genome using the AmphoraNet server (Kerepesi et al., <xref ref-type="bibr" rid="B40">2014</xref>). The marker gene sequences of each genome were concatenated and aligned using Muscle (Edgar, <xref ref-type="bibr" rid="B25">2004</xref>). Further, 400 conserved bacterial marker genes were retrieved from each genome using PhyloPhlAn (Segata et al., <xref ref-type="bibr" rid="B77">2013</xref>). For these three sequence-based analyses, ML trees were rendered with 1,000 bootstrap replicates. To trace phylogeny using whole genome data, well-established phylogenomic approaches were employed. Average nucleotide identity (ANI) values were calculated using the BLASTALL algorithm (ANIb) of JSpecies v 1.2.1 (Richter and Rosello-Mora, <xref ref-type="bibr" rid="B68">2009</xref>). A two-way matrix containing pairwise ANI scores was used to perform hierarchical clustering using Pearson correlation (average linkage).
A similar dendrogram was generated using a two-way matrix of tetranucleotide frequencies calculated using regression analysis by JSpecies v 1.2.1. To evaluate phylogeny on the basis of the variable component of the genome, pan genome phylogeny was reconstructed by hierarchical clustering of a binary gene presence-absence (1/0) matrix generated by BPGA (Chaudhari et al., <xref ref-type="bibr" rid="B14">2016</xref>). This matrix records the presence or absence of every gene of the total gene complement (pan genome) in each <italic>Thermus</italic> species. To resolve ambiguous relationships at the sub-species level, pairwise digital DNA-DNA hybridization (dDDH) values were calculated using the genome to genome distance calculator (ggdc.dsmz.de) (Auch et al., <xref ref-type="bibr" rid="B4">2010</xref>).</p>
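<p>The hierarchical clustering of a pairwise score matrix described in this section can be illustrated with a minimal sketch. It assumes that the distance between two genomes is taken as 1 minus the Pearson correlation of their rows in the pairwise matrix, and that clusters are joined by average linkage. The function names, genome labels and matrix values below are hypothetical and purely illustrative; they are not the software or the data used in this study.</p>
<preformat>
```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_linkage(labels, matrix):
    """Agglomerative clustering with distance = 1 - Pearson r between rows.

    Returns the merges in order as (members_a, members_b, distance).
    """
    n = len(labels)
    dist = [[1.0 - pearson(matrix[i], matrix[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average linkage: mean of all leaf-to-leaf distances
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(pairs) / len(pairs)
                if best is None or best[0] > avg:
                    best = (avg, a, b)
        avg, a, b = best
        merges.append((sorted(labels[i] for i in clusters[a]),
                       sorted(labels[i] for i in clusters[b]), avg))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Hypothetical pairwise identity matrix for four placeholder genomes
labels = ["g1", "g2", "g3", "g4"]
ani = [
    [100.0, 97.0, 85.0, 84.0],
    [97.0, 100.0, 86.0, 85.0],
    [85.0, 86.0, 100.0, 96.0],
    [84.0, 85.0, 96.0, 100.0],
]

for members_a, members_b, distance in average_linkage(labels, ani):
    print(members_a, members_b, round(distance, 3))
```
</preformat>
<p>With this toy matrix, g1 and g2 are joined first, g3 and g4 second, and the two pairs are joined last.</p>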
</sec>
<sec>
<title>Analysis of genome flexibility</title>
<p>Genome sequences were uploaded to the RAST server (Aziz et al., <xref ref-type="bibr" rid="B6">2008</xref>) and coding sequences were extracted from the RAST predictions. Coding sequences (amino acids) were compared using the formatdb and BLASTALL programs available in the BLAST package version 2.2.26 (Altschul et al., <xref ref-type="bibr" rid="B1">1990</xref>). Genomic islands were predicted using IslandViewer 3 (Dhillon et al., <xref ref-type="bibr" rid="B21">2015</xref>). Dot-plots and synteny maps were constructed to uncover the extent of rearrangements (duplications, deletions, insertions) occurring as a function of genome distance. Dot-plots were generated using BLASTN (Wheeler and Bhagwat, <xref ref-type="bibr" rid="B90">2007</xref>) with <italic>T. parvatiensis</italic> as the reference. Synteny maps were constructed by identifying conserved locally collinear blocks (LCBs) among genomes, followed by whole genome alignments using progressiveMauve version 20150226 (Darling et al., <xref ref-type="bibr" rid="B18">2010</xref>) at three spaced seed patterns and a high seed weight (seed weight &#x0003D; 15) for sensitive alignment of closely related genomes. Horizontally acquired regions on the megaplasmid pTP143 were detected by BLAST-based comparison with all sequenced <italic>Thermus</italic> plasmids. These regions were confirmed using Alien Hunter (Vernikos and Parkhill, <xref ref-type="bibr" rid="B86">2006</xref>) at default thresholds. Further, syntenic regions on all <italic>Thermus</italic> plasmids were mapped using progressiveMauve (Darling et al., <xref ref-type="bibr" rid="B18">2010</xref>) for visualization.</p>
<p>Core and pan genome analysis was performed using BPGA (Chaudhari et al., <xref ref-type="bibr" rid="B14">2016</xref>). Usearch (Edgar, <xref ref-type="bibr" rid="B26">2010</xref>), which is the default clustering algorithm of BPGA, was employed for orthologous gene identification and clustering. The core genome plot was rendered by plotting the total number of shared genes after each subsequent addition of a genome against the number of genomes. The pan genome plot was rendered by plotting the total number of distinct gene families identified after the addition of each genome against the number of genomes. To avoid bias from the order in which genomes were added, median values of 20 random permutations were used for rendering these plots. Representative (seed) sequences of both the core and pan genome were used for function based analysis.</p>
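<p>The construction of the core and pan genome curves can be sketched as follows. This is a minimal illustration assuming that each genome is represented as a set of gene-family identifiers; the function name, the seed parameter and the toy input are hypothetical and do not reproduce the BPGA implementation.</p>
<preformat>
```python
import random
from statistics import median


def core_pan_curves(presence, n_perm=20, seed=0):
    """presence: dict mapping genome name to a set of gene-family ids.

    Returns (core_curve, pan_curve): the median number of shared and of
    distinct gene families after adding 1..N genomes in random order.
    """
    rng = random.Random(seed)
    genomes = sorted(presence)
    core_runs, pan_runs = [], []
    for _ in range(n_perm):
        order = genomes[:]
        rng.shuffle(order)
        core = set(presence[order[0]])
        pan = set(presence[order[0]])
        core_counts, pan_counts = [len(core)], [len(pan)]
        for genome in order[1:]:
            core = core.intersection(presence[genome])  # shared by all so far
            pan = pan.union(presence[genome])            # seen in any so far
            core_counts.append(len(core))
            pan_counts.append(len(pan))
        core_runs.append(core_counts)
        pan_runs.append(pan_counts)
    steps = range(len(genomes))
    core_curve = [median(run[k] for run in core_runs) for k in steps]
    pan_curve = [median(run[k] for run in pan_runs) for k in steps]
    return core_curve, pan_curve


# Hypothetical toy input: three genomes, gene families labeled by integers
toy = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 2, 6}}
core_curve, pan_curve = core_pan_curves(toy)
print(core_curve, pan_curve)
```
</preformat>
<p>The last point of each curve does not depend on the permutation: it is the size of the intersection (core) and of the union (pan) of all genomes.</p>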
<p>Metagenomic sequence data recruitment was performed to gain insights into the strain specific flexible gene repertoire harbored by <italic>T. parvatiensis</italic>, and to investigate whether this flexibility is strain specific or extends to other members of the genus as well. For this purpose, water samples were collected from the hot water spring at Manikaran, India, at two locations, namely MNW1 and MNW2 (intra-site distance: 100 m; 32&#x000B0;01&#x02032;34.8&#x02033;N, 077&#x000B0;20&#x02032;50.3&#x02033;E). DNA was extracted from the water samples (10 L each, filtered through 0.45 &#x003BC;m filters) using the PowerMax&#x000AE; Water DNA isolation kit (MoBio Laboratories Inc., Carlsbad, CA, USA) as per the manufacturer&#x00027;s instructions. Metagenomic data were generated using Illumina GAII technology with an insert size of 170 bp (DDBJ/EMBL/GenBank accession number <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJEB19501">PRJEB19501</ext-link>). To delineate metagenomic islands (hereafter referred to as MGIs), metagenomic raw reads from hot spring water (MNW1 and MNW2) were recruited onto the chromosome and plasmid of <italic>T. parvatiensis</italic> using nucmer, available in the MUMmer package (Kurtz et al., <xref ref-type="bibr" rid="B46">2004</xref>). Coverage plots were generated using mummerplot, also available in the MUMmer package. Regions with little or no mapping after tiling of the metagenomic reads at a coverage cut-off of 80% were identified as MGIs (Steffen et al., <xref ref-type="bibr" rid="B80">2012</xref>). Mapping coverage was determined using the coords file generated by nucmer (identity cut-off: 80%). BLASTp was used to identify the presence of these regions in other <italic>Thermus</italic> genomes.</p>
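<p>The idea of identifying regions with no recruited reads can be sketched as below. The sketch assumes that read alignments have already been parsed from a coordinates file into (start, end, percent identity) tuples on the reference; the function name, the minimum gap length and the example intervals are hypothetical, and the code is an illustration rather than the procedure used here.</p>
<preformat>
```python
def uncovered_regions(ref_len, hits, min_identity=80.0, min_gap=1000):
    """Return reference intervals not covered by any retained alignment.

    hits: iterable of (start, end, percent_identity) in 1-based inclusive
    reference coordinates. Alignments below min_identity are ignored.
    Only gaps of at least min_gap bases are reported (assumed parameter).
    """
    kept = sorted((min(s, e), max(s, e))
                  for s, e, identity in hits if identity >= min_identity)
    gaps = []
    cursor = 1  # first reference base not yet known to be covered
    for start, end in kept:
        if start - cursor >= min_gap:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if ref_len - cursor + 1 >= min_gap:
        gaps.append((cursor, ref_len))
    return gaps


# Hypothetical alignments on a 10,000 bp reference
example_hits = [
    (1, 2000, 95.0),
    (2500, 4000, 85.0),
    (4100, 6000, 70.0),   # discarded: identity below the cut-off
    (8000, 10000, 99.0),
]
print(uncovered_regions(10000, example_hits))
```
</preformat>
<p>In this toy case the only reported interval is 4001 to 7999; the 499-base gap between positions 2001 and 2499 is shorter than the assumed minimum length and is therefore ignored.</p>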
</sec>
<sec>
<title>Detection of genus specific survival strategies</title>
<sec>
<title>CRISPR analysis</title>
<p>CRISPR arrays were extracted from genomes using the CRISPRFinder (Grissa et al., <xref ref-type="bibr" rid="B32">2007</xref>) online server, which performs BLAST against dbCRISPR (CRISPR database; last updated on 2017-01-02). CRISPRFinder further classifies the identified CRISPR arrays as true or false based on whether or not they are associated with CRISPR associated genes (<italic>Cas</italic>). <italic>Cas</italic> genes were annotated using CRISPRone (Zhang and Ye, <xref ref-type="bibr" rid="B95">2017</xref>). CRISPRs lacking <italic>Cas</italic> genes in the vicinity were designated as false/questionable CRISPRs. Only true CRISPRs were selected for analyses. CRISPR arrays have two components, repeats and spacers, which were analyzed to study evolution and probable viral diversity, respectively. Classification and clustering of CRISPR repeats and repeat-based <italic>Cas</italic> gene predictions were undertaken using CRISPRmap, a comprehensive cluster analysis method (based on Markov clustering) which clusters conserved sequence families and potential structure motifs (Lange et al., <xref ref-type="bibr" rid="B48">2013</xref>). Repeats were classified based on 40 conserved sequence families and 33 probable structural motifs. Further, 24 families and 18 structural motifs were considered for the construction of repeat cluster maps. To predict the viruses most frequently associated with <italic>Thermus</italic> genomes, spacer sequences from all genomes were extracted and searched using BLAST against the viral GenBank database of NCBI (Deng et al., <xref ref-type="bibr" rid="B20">2007</xref>) with a threshold e-value of 1. For better stringency, among all matches, only those having 100% identity over more than 20 nucleotides were considered as valid hits.</p>
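<p>The stringency filter on spacer matches can be sketched as follows. The sketch assumes that the BLAST results are available in the standard 12-column tabular format (query id, subject id, percent identity, alignment length, and so on); the function name and the example records, including all identifiers, are hypothetical.</p>
<preformat>
```python
def valid_spacer_hits(tabular_lines, min_len=20):
    """Keep hits with 100% identity over more than min_len nucleotides.

    tabular_lines: lines of tab-separated BLAST output whose first four
    columns are query id, subject id, percent identity, alignment length.
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity == 100.0 and length > min_len:
            hits.append((query_id, subject_id, length))
    return hits


# Hypothetical records, for illustration only
example = [
    "spacer_1\tvirus_A\t100.00\t28\t0\t0\t1\t28\t500\t527\t1e-08\t52.8",
    "spacer_2\tvirus_B\t100.00\t18\t0\t0\t1\t18\t40\t57\t0.5\t33.0",
    "spacer_3\tvirus_A\t96.43\t28\t1\t0\t1\t28\t900\t927\t2e-06\t46.4",
]
print(valid_spacer_hits(example))
```
</preformat>
<p>Only the first record passes: the second is a perfect match but too short, and the third is long enough but contains a mismatch.</p>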
</sec>
<sec>
<title>Comparison of competence imparting genes</title>
<p>Genes involved in imparting competence (16 genes) to <italic>T. thermophilus</italic> HB27 were used as reference for extracting competence associated genes from individual genomes using BLAST. <italic>PilA1-A4</italic> genes were aligned using Hirschberg (KAlign) algorithm (Lassmann and Sonnhammer, <xref ref-type="bibr" rid="B50">2005</xref>). Visual alignment consensus was built at 70% threshold. Relationships among <italic>PilA1-A4</italic> genes were inferred by PhyML (Guindon et al., <xref ref-type="bibr" rid="B33">2009</xref>) maximum-likelihood method using HKY85 substitution model (Hasegawa et al., <xref ref-type="bibr" rid="B35">1985</xref>). Median size estimations were made using boxplot function in R (<ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/mirrors.html">https://cran.r-project.org/mirrors.html</ext-link>).</p>
</sec>
</sec>
</sec>
<sec id="s3">
<title>Results and discussion</title>
<sec>
<title>Genome sequencing, assembly, and annotation of <italic>T. parvatiensis</italic> replicons</title>
<sec>
<title>Genome assembly, finishing, and annotation</title>
<p>The genome of <italic>T. parvatiensis</italic> strain RL was initially assembled into three contigs of 1,886,121 bp, 159 Kbp and 21 Kbp (2,066,435 bp in total) with a G&#x0002B;C content of 68.5%. The 21 Kbp contig mapped onto the chromosome (BLASTn and mapping against the previously generated assembly; Dwivedi et al., <xref ref-type="bibr" rid="B24">2012</xref>). The entire 21 Kbp region seems to represent an integrated plasmid or a large genomic island incorporated into the genome, since most of the genes annotated in it encode hypothetical proteins or transposable elements. This was supported by the difference in mean G&#x0002B;C content of this region (67.3%) compared with the rest of the genome (68.5%). To circularize the largest contig, its ends were aligned against each other and an overlapping region of 13,300 nt was removed. Similarly, to circularize the 159 Kbp contig, an overlapping region of 15,853 bp was removed from its ends. Finally, two replicons were reconstructed: a chromosome (1,872,821 bp) and a megaplasmid (143,277 bp) (Figure <xref ref-type="fig" rid="F1">1</xref>) (total size: 2,016,098 bp) (Table <xref ref-type="table" rid="T2">2</xref>). The chromosomal origin of replication was located at positions 158,478&#x02013;158,778. A total of 12 DNA boxes with the consensus sequence TGTGGATAA were identified within the 301 nt <italic>OriC</italic> region. A total of 2,383 coding sequences were predicted. The genome was found to harbor two rRNA operons and 54 tRNAs and tmRNAs. COG functional category assignment placed a large number of genes in amino acid transport and metabolism (11.04%), general function prediction (12.95%), energy production and conversion (7.54%) and translation, ribosomal structure and biogenesis (7.40%). A number of genes were classified into the unknown function category (7.78%).
A large proportion of the genome is thus devoted to genes needed by the organism for essential cellular processes. <italic>T. parvatiensis</italic> thrives at a high arsenic concentration (140 ppb); we therefore investigated the presence of arsenic resistance mechanisms in this thermophilic organism. The arsenate reductase gene <italic>arsC</italic> (1 copy), the arsenic efflux pump gene <italic>arsB</italic> (2 copies) and the <italic>arsR</italic> transcriptional regulator (4 copies) were identified. These genes belong to the ars (arsenic resistance) operon responsible for the efflux of As(III) out of the cells (Yang and Rosen, <xref ref-type="bibr" rid="B92">2016</xref>). Genes for the oxidation of arsenic (aox operon) were not identified. The mechanism of arsenic detoxification in <italic>T. parvatiensis</italic> thus appears to involve extrusion of arsenic out of the cells rather than oxidation.</p>
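<p>The replicon sizes reported above are mutually consistent, as the short arithmetic check below shows. All input values are taken from the text; the exact size of the 159 Kbp contig is not stated and is derived here from the plasmid size and the removed overlap, so it should be read as an inferred value.</p>
<preformat>
```python
# Sizes in bp as reported in the text
largest_contig = 1_886_121
chromosome_overlap = 13_300   # overlap removed to circularize the chromosome
plasmid = 143_277
plasmid_overlap = 15_853      # overlap removed to circularize the plasmid

chromosome = largest_contig - chromosome_overlap
plasmid_contig = plasmid + plasmid_overlap  # inferred size of the 159 Kbp contig
total = chromosome + plasmid

print(chromosome)      # 1872821
print(plasmid_contig)  # 159130
print(total)           # 2016098
```
</preformat>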
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Replicon maps of <italic>Thermus parvatiensis</italic> strain RL <bold>(A)</bold> ORFs on the chromosome have been mapped on both strands, the origin of replication is marked with a red arrow. From outside to inside: genes on the negative strand, genes on the positive strand, GC percentage and GC skew. <bold>(B)</bold> Detailed map of <italic>T. parvatiensis</italic> megaplasmid pTP143 marked with prominent categories of genes in different colors. Genes representing DNA repair genes (green), mobile element genes (red), transcriptional regulators (magenta) and gene clusters (purple; denitrification gene cluster, cobalamin biosynthetic gene cluster, carotenoid synthesis gene cluster) have been specifically highlighted.</p></caption>
<graphic xlink:href="fmicb-08-01410-g0001.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Genomic features of the chromosome and plasmid of <italic>T. parvatiensis</italic> strain RL.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="left"><bold>Chromosome</bold></th>
<th valign="top" align="left"><bold>Plasmid</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Accession number</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP014141">CP014141</ext-link></td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="CP014142">CP014142</ext-link></td>
</tr>
<tr>
<td valign="top" align="left">Size (bp)</td>
<td valign="top" align="left">1,872,821</td>
<td valign="top" align="left">143,277</td>
</tr>
<tr>
<td valign="top" align="left">G&#x0002B;C content (%)</td>
<td valign="top" align="left">68.5</td>
<td valign="top" align="left">68.4</td>
</tr>
<tr>
<td valign="top" align="left">CDS</td>
<td valign="top" align="left">2326</td>
<td valign="top" align="left">57</td>
</tr>
<tr>
<td valign="top" align="left">Coding density (%)</td>
<td valign="top" align="left">94.12</td>
<td valign="top" align="left">87.65</td>
</tr>
<tr>
<td valign="top" align="left">tRNAs</td>
<td valign="top" align="left">54</td>
<td valign="top" align="left">0</td>
</tr>
<tr>
<td valign="top" align="left">rRNA operons</td>
<td valign="top" align="left">2</td>
<td valign="top" align="left">0</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><italic>T. parvatiensis</italic> strain RL uniquely harbored three integrated phages in its genome (Supplementary Figure <xref ref-type="supplementary-material" rid="SM10">1</xref>): two on the chromosome and one on the plasmid. This was in contrast to all its close phylogenetic neighbors considered in this study, namely <italic>T. thermophilus</italic> HB8, <italic>T. thermophilus</italic> HB27, <italic>T. thermophilus</italic> JL18, and <italic>T. thermophilus</italic> SG0.5JP17-16, in which no phages could be identified. The first phage (20.3 Kbp) on the chromosome carried genes for phage structural proteins such as tail assembly protein and coat protein, along with three heat shock proteins that can be directly related to the environment, i.e., hot spring water (surface water temperature &#x0003E;95&#x000B0;C). The phage region was associated with two hybrid histidine kinases, which are implicated in two-component regulatory systems (Khorchid and Ikura, <xref ref-type="bibr" rid="B41">2006</xref>). Annotation of the second chromosomal phage (23.2 Kbp) revealed a probable integron with attL and attR sites along with an integrase encoding gene and a flanking tRNA. The gene cassette of this integron had 23 hypothetical proteins and 16 viral proteins. These results also indicate that this integron might denote a super-integron with 37 ORFs captured. Phage regions are known to play a role in horizontal gene transfer by specialized as well as generalized transduction (Touchon et al., <xref ref-type="bibr" rid="B85">2017</xref>). The presence of two-component system and heat shock protein genes on the integrated phage regions suggests that the dispersal of these genes might be an active phenomenon in the population.</p>
</sec>
<sec>
<title>Plasmid pTP143</title>
<p>Plasmids, ranging from small plasmids to large megaplasmids of up to 440 Kbp, are known to be present in <italic>Thermus</italic> genomes (Table <xref ref-type="table" rid="T1">1</xref>). The plasmids of <italic>Thermus</italic> are known to be centers of genome plasticity, harboring mobile elements, transposons and a number of biosynthetic clusters (Henne et al., <xref ref-type="bibr" rid="B36">2004</xref>; Bruggemann and Chen, <xref ref-type="bibr" rid="B11">2006</xref>). The megaplasmid pTP143 of <italic>T. parvatiensis</italic> contained 181 coding sequences. A large number of these, however, encoded integrases, transposases, mobile elements and hypothetical proteins (81 genes, constituting 44% of the total plasmid genes) (Figure <xref ref-type="fig" rid="F1">1</xref>), which reflects the plastic nature of the plasmid. A lower coding density (87.65%) was observed for pTP143 than for the chromosome (94.12%). Genetic analysis of the <italic>T. parvatiensis</italic> megaplasmid pTP143 revealed a cobalamin biosynthetic cluster, a denitrification cluster and a carotenoid biosynthesis cluster (responsible for imparting the yellow pigment to the organism). A thermophilic lifestyle demands a robust DNA repair system, and adaptation to high temperature is commonly associated with an elevated number of DNA repair genes (Bruggemann and Chen, <xref ref-type="bibr" rid="B11">2006</xref>). Consistent with this, a <italic>recQ</italic> helicase, reverse gyrases, photolyase <italic>phrB</italic>, and the <italic>sbcC</italic> and <italic>sbcD</italic> nucleases (implicated in deleting hairpin structures) were found on the megaplasmid. Genes related to stress response, <italic>surE</italic> (involved in nucleic acid pool maintenance; Proudfoot et al., <xref ref-type="bibr" rid="B66">2004</xref>) and cytochrome P450 (Kelly and Kelly, <xref ref-type="bibr" rid="B39">2013</xref>) were also identified.
A number of transcription factors known to modulate responses to stress conditions were found on the plasmid. These included the transcriptional regulator <italic>IclR</italic>, involved in regulation of responses to quorum sensing and toxic stress (Molina-Henares et al., <xref ref-type="bibr" rid="B59">2005</xref>). A <italic>Crp/Fnr</italic> transcriptional regulator, known to be responsive to environmental changes (K&#x000F6;rner et al., <xref ref-type="bibr" rid="B43">2003</xref>; Zhou et al., <xref ref-type="bibr" rid="B96">2012</xref>) such as oxidative stress, carbon dioxide concentrations and heavy metal exposure, was annotated. <italic>Crp/Fnr</italic> regulators act by regulating the expression of genes involved in alleviating the respective stress conditions. <italic>MerR</italic>, a heavy metal responsive transcriptional regulator (Brown et al., <xref ref-type="bibr" rid="B10">2003</xref>) that activates promoters of genes in response to heavy metal influx, was also annotated. <italic>T. parvatiensis</italic> thus harbors a plastic plasmid rich in mobile and hypothetical genes. Because it carries genes for DNA repair, stress response and transcriptional regulation, we believe that the megaplasmid plays an indispensable role in the thermophilic survival of <italic>T. parvatiensis</italic>. Beyond its contribution to survival, the megaplasmid potentially plays a crucial role in sensing and modulating temperature stress through the transcriptional regulators it harbors.</p>
</sec>
</sec>
<sec>
<title>Phylogenomic assessments</title>
<p>The novel phylogenetic position of strain RL has already been discussed based on multi locus gene analysis (Dwivedi et al., <xref ref-type="bibr" rid="B23">2015</xref>). Here we describe the phylogeny within the genus <italic>Thermus</italic> using the traditional 16S rRNA gene, 31 bacterial single copy genes and 400 conserved bacterial marker genes. To strengthen the analyses, we have used whole genome patterns established by ANI scores, tetra-nucleotide scores, pan genome and dDDH values. The phylogenetic tree based on 16S rRNA gene sequences placed strain RL in a single monophyletic clade with the <italic>T. thermophilus</italic> strains HB27, HB8, JL-18, and SG0.5JP17-16, branching closest to SG0.5JP17-16 (Figure <xref ref-type="fig" rid="F2">2</xref>). This was expected from the over 99% identity of the 16S rRNA gene sequence of <italic>T. parvatiensis</italic> with members of <italic>T. thermophilus</italic>. A single gene, however, cannot provide the required phylogenetic resolution; hence, phylogenetic relationships were further investigated on the basis of conserved marker genes. For this, 31 bacterial single copy genes and 400 bacterial conserved marker genes were used. The phylogenetic tree constructed using 31 essential single copy genes placed <italic>T. parvatiensis</italic> in the same clade as <italic>T. thermophilus</italic> (Figure <xref ref-type="fig" rid="F2">2</xref>). However, the phylogenetic tree constructed using 400 conserved bacterial marker genes placed <italic>T. parvatiensis</italic> at an outlier position with respect to the <italic>T. thermophilus</italic> group with strong bootstrap support (100%) (Figure <xref ref-type="fig" rid="F2">2</xref>). Correlations of genome distances derived from ANI scores placed <italic>T. parvatiensis</italic> along with SG0.5JP17-16. However, these two strains did not fall into the <italic>T. thermophilus</italic> clade, but clustered with strain CCB_US3_UF1 and <italic>T. igniterrae</italic> instead (Figure <xref ref-type="fig" rid="F2">2</xref>). The novel species status of strain RL was also reflected in its ANI scores with <italic>T. thermophilus</italic> members (95.03&#x02013;95.57%), which fall on the borderline for species delineation based on the ANI cut-off (95&#x02013;96%) (Konstantinidis and Tiedje, <xref ref-type="bibr" rid="B42">2005</xref>) (Supplementary Table <xref ref-type="supplementary-material" rid="SM2">2</xref>). This is in contrast to the high ANI scores among members of the <italic>T. thermophilus</italic> group (&#x0003E;96%). Tetra-nucleotide based correlations also placed <italic>T. parvatiensis</italic> as an outlier, lying just outside the tight <italic>T. thermophilus</italic> group (Figure <xref ref-type="fig" rid="F2">2</xref>). The same was reflected in the dendrogram based on the pan gene presence-absence (1/0) matrix (Figure <xref ref-type="fig" rid="F2">2</xref>). Digital DDH values (Supplementary Table <xref ref-type="supplementary-material" rid="SM3">3</xref>) clearly separated <italic>T. parvatiensis</italic> from <italic>T. thermophilus</italic>. <italic>T. parvatiensis</italic> vs. <italic>T. thermophilus</italic> dDDH values were in the range 61.0&#x02013;64.6%, which is below the 70% DDH cut-off for species delineation. This was in contrast to the high intra-species scores among the <italic>T. thermophilus</italic> group (68.9&#x02013;89%).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Inference of evolutionary relationships among <italic>Thermus</italic> spp. based on phylogenetic and phylogenomic methods. Phylogenetic analysis based on <bold>(A)</bold> 16S rRNA gene sequences; <bold>(B)</bold> 31 single copy genes; and <bold>(C)</bold> 400 conserved bacterial marker genes, of the species under study using the maximum likelihood method. Bars represent the number of substitutions per nucleotide position. Percentage bootstrap values (&#x02265;70%) are shown next to the nodes. Phylogenomic dendrograms showing hierarchical clustering of species under study constructed using <bold>(D)</bold> whole genome distance matrix based on ANI scores; <bold>(E)</bold> tetranucleotide frequencies; and <bold>(F)</bold> pan gene presence-absence matrix. Gradation of colors from black to yellow in <bold>(D)</bold> depicts increasing genome distance on the basis of the two-way ANI matrix. Black denotes minimum distance and yellow denotes maximum distance. Organisms have been grouped together into clades on the basis of minimum distance (black). Blue shade (in <bold>A,B</bold>) depicts clustering of <italic>T. parvatiensis</italic> within the <italic>T. thermophilus</italic> group. Light brown shade (in <bold>C&#x02013;F</bold>) depicts the position of <italic>T. parvatiensis</italic> separately from other closely clustered <italic>Thermus</italic> members (shaded gray in <bold>C&#x02013;F</bold>).</p></caption>
<graphic xlink:href="fmicb-08-01410-g0002.tif"/>
</fig>
<p>The above analyses suggest that species diversification in the genus <italic>Thermus</italic> has taken place by acquisition, deletion and rearrangement of regions, which cannot be reflected in the 16S rRNA gene. Conserved genes (31 and 400) are better able to reflect the phylogenetic relationships among the members. However, a drawback of these methods is that they fail to take into account the intragenomic heterogeneity that is quite high among <italic>Thermus</italic> members due to extensive gene shuffling. For this reason, whole genome based methods such as ANI, tetranucleotide frequency and DDH values should be regarded as more accurate phylogenomic methods for estimating phylogenetic relationships within this genus. We have been able to resolve the phylogeny of <italic>T. parvatiensis</italic> by whole genome methods. In spite of having a high percentage similarity with the <italic>T. thermophilus</italic> group based on 16S rRNA gene sequences, <italic>T. parvatiensis</italic> represents a different species based on whole genome methods, which are more reliable than gene based methods. Thus, <italic>T. parvatiensis</italic>, in the course of evolution, has accumulated genome wide differences that have led to its divergence from the <italic>T. thermophilus</italic> group, and represents a genetically unique species.</p>
</sec>
<sec>
<title>Analysis of genome flexibility</title>
<sec>
<title>Genome organization</title>
<p>All species belonging to the genus <italic>Thermus</italic>, as described in this study, have been isolated from thermophilic environments (mostly hot spring waters) from all over the world, i.e., from USA (Murugapiran et al., <xref ref-type="bibr" rid="B60">2013</xref>), Japan (Henne et al., <xref ref-type="bibr" rid="B36">2004</xref>), India (Dwivedi et al., <xref ref-type="bibr" rid="B24">2012</xref>), South Africa (Gounder et al., <xref ref-type="bibr" rid="B31">2011</xref>), China (Yu et al., <xref ref-type="bibr" rid="B94">2015</xref>; Zhou et al., <xref ref-type="bibr" rid="B97">2016</xref>), Malaysia (Teh et al., <xref ref-type="bibr" rid="B83">2012</xref>), New Zealand (Hudson et al., <xref ref-type="bibr" rid="B37">1987</xref>; Mefferd et al., <xref ref-type="bibr" rid="B55">2016</xref>), and Iceland (Chung et al., <xref ref-type="bibr" rid="B16">2000</xref>). This illustrates that the genus is spread across the globe, thriving in the most extreme environments. Being confined to stressed niches, these G&#x0002B;C rich (64.8&#x02013;69.4%) organisms possess small genomes, ranging from 2.01 Mb in <italic>T. parvatiensis</italic> to 2.56 Mb in <italic>T. tengchongensis</italic> YIM 77401 (mean genome size: 2.25 Mb). Considering the powerful evolutionary forces that have been constantly shaping their genomes in geographically diverse niches, the genome size of <italic>Thermus</italic> shows low variability (Table <xref ref-type="table" rid="T1">1</xref>). Already known to shed and rearrange genes that are not uniquely essential for survival, the members of the genus have been successful in maintaining their genome sizes close to the average genome size of the genus. Genomic islands make up between 0 and 5.85% of these genomes (Supplementary Table <xref ref-type="supplementary-material" rid="SM4">4</xref>).</p>
<p>To demonstrate the extent of genomic shuffling, synteny maps of the 10 complete genomes were generated (Figure <xref ref-type="fig" rid="F3">3</xref>). <italic>T. parvatiensis</italic> and SG0.5JP17-16 showed conserved synteny with each other, without inversions or rearrangements. Synteny was also conserved among members of the <italic>T. thermophilus</italic> group; however, large blocks of inversions were observed in relation to strains CCB_US3_UF1, <italic>T. scotoductus, T. aquaticus, T. oshimai</italic>, and <italic>T. brockianus</italic>. These demonstrate the extensive genome wide rearrangements occurring at the genus level (Figure <xref ref-type="fig" rid="F3">3</xref>). The same observation was reflected in dot-plot comparisons of <italic>T. parvatiensis</italic> with other members (Supplementary Figure <xref ref-type="supplementary-material" rid="SM11">2</xref>).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Organizational (synteny) comparisons of <italic>T. parvatiensis</italic> with nine representatives of <italic>Thermus</italic>. Only members with complete genomes have been considered for this analysis. Synteny maps showing <bold>(A)</bold> comparison of the chromosome of <italic>T. parvatiensis</italic> against the chromosomes of other genomes; and <bold>(B)</bold> comparison of plasmid pTP143 against plasmid sequences (concatenated wherever number of plasmids &#x0003E;1) of other <italic>Thermus</italic> members. Boxes of different colors represent locally collinear blocks (LCBs) (or locally conserved regions) connected via lines of the same color to their corresponding positions on other genomes. For each genome, the LCBs above and below the reference line (indicated by black triangle) denote the orientation of the LCBs with respect to the reference sequence (LCBs below the black reference line denote inversions). Black lines (in <bold>B</bold>) below LCBs represent the position of coding sequences. Red arrows on plasmid pTP143 (in <bold>B</bold>) mark the regions that could not be mapped on other <italic>Thermus</italic> plasmids.</p></caption>
<graphic xlink:href="fmicb-08-01410-g0003.tif"/>
</fig>
<p>The genes on <italic>Thermus</italic> plasmids are known to undergo extensive rearrangement. In some cases, they have been observed to shift from the plasmid to the chromosome and become stabilized there. In other cases, large plasmids carrying mostly non-essential but potentially beneficial gene clusters have been identified. The former case has been observed in <italic>T. scotoductus</italic> and CCB_US3_UF1, both of which have small plasmids because most of the genes were incorporated into the chromosome, leaving reduced plasmids. pTSC8 (8,383 bp), for example, has discarded many non-essential genes such as cobalamin biosynthesis pathway genes, plasmid stability genes and chromosome partitioning genes, but has retained genes for aerobic and anaerobic respiration, attaining a much more compact form (Gounder et al., <xref ref-type="bibr" rid="B31">2011</xref>). The latter situation has been observed for plasmid pVV8 (81,151 bp), which was found to be enriched in phosphonate metabolism genes that are uncommon in <italic>Thermus</italic> genomes (Ohtani et al., <xref ref-type="bibr" rid="B62">2012</xref>). To evaluate this trend in pTP143 and other plasmids, we surveyed genetic clusters commonly observed on <italic>Thermus</italic> plasmids. We observed the presence of advantageous traits such as cobalamin biosynthesis on pTP143 and pTHEOS01 (<italic>T. oshimai</italic> plasmid) and nitrate reduction on pTP143, pTHEOS01 and the <italic>T. thermophilus</italic> JL-18 plasmid. To assess possibly laterally acquired regions on pTP143, we performed a comparative mapping of plasmids based on BLASTn comparisons (identity cut-off: 80%, coverage cut-off: 80%, e-value: 1e-30) across all the <italic>Thermus</italic> plasmids. The plasmid pTP143 of <italic>T. parvatiensis</italic> showed highest identity to the plasmids pTHTHE1601 (identity: 99.3%; query cover: 79%), pTTJL1801 (identity: 97.5%; query cover: 78%), pTT27 of HB27 (identity: 99.0%; query cover: 47%) and pTT27 of HB8 (identity: 98.9%; query cover: 49%). A region on pTP143 that showed no homology to any of the other plasmids was analyzed as a putatively horizontally transferred region (Supplementary Table <xref ref-type="supplementary-material" rid="SM5">5</xref>). This region harbored 3 genes for mobile element proteins, 9 hypothetical protein coding genes, and <italic>sbcC</italic> and <italic>sbcD</italic> (hairpin structure resolving nucleases) (Figure <xref ref-type="fig" rid="F4">4</xref>). The genes present on this locus were identified using BLASTp (identity cut-off: 70%, e-value cut-off: 1.00E-15) on the chromosomes of <italic>T. aquaticus, T. brockianus</italic>, HB8, SG0.5JP17-16, JL-18, CCB_US3_UF1 and even on the chromosome of <italic>T. parvatiensis</italic>.</p>
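<p>The identity, coverage and e-value thresholds quoted above can be applied as a simple filter over tabular BLASTn hits. The following is a minimal sketch only: the Hit record, its field names and the example values are hypothetical and are not taken from this study, and it assumes that identity and coverage are percentages and that all three cut-offs must hold simultaneously.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLASTn hit (hypothetical record; field names are assumptions)."""
    subject: str
    identity: float   # percent identity
    coverage: float   # percent query coverage
    evalue: float


def passes(hit, min_identity=80.0, min_coverage=80.0, max_evalue=1e-30):
    """Return True when a hit meets all three cut-offs."""
    return (hit.identity >= min_identity
            and hit.coverage >= min_coverage
            and max_evalue >= hit.evalue)


# Made-up hits for illustration only.
hits = [
    Hit("plasmid_A", 99.3, 85.0, 1e-120),
    Hit("plasmid_B", 97.5, 60.0, 1e-80),   # fails the coverage cut-off
    Hit("plasmid_C", 78.0, 90.0, 1e-50),   # fails the identity cut-off
]
kept = [h.subject for h in hits if passes(h)]
```

<p>In this toy input only the first hit is retained; regions of the query with no retained hit on any plasmid would be the candidates for lateral acquisition.</p>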
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p><bold>(A)</bold> Detailed map of plasmid pTP143 with genes assigned to COG categories and depicted in a color-coded manner. Rings from outside to inside represent positions of genomic islands (GI) (red); coding sequences (brown); GC content (black); GC skew (purple) and COG category assignment (multiple colors according to the color key). <bold>(B)</bold> GIs found on the megaplasmid pTP143 (linear view) have been highlighted. These regions contain hypothetical protein coding genes, some plasmid specific genes, a toxin-antitoxin system and repair genes. Most of the genes on the GIs were placed into COG category &#x0201C;L&#x0201D; (replication and repair).</p></caption>
<graphic xlink:href="fmicb-08-01410-g0004.tif"/>
</fig>
<p>Genes on <italic>Thermus</italic> plasmids are mobile and tend to shift from the plasmid to the chromosome, where they become stabilized. In <italic>Thermus</italic> species, plasmids perhaps contribute more to genome flexibility by either acquiring or shedding genes. Incoming genes/pathways may first be incorporated on the plasmid and later be stabilized either on the plasmid itself or through integration into the chromosome. The presence of a genomic island with multiple mobile elements on plasmid pTP143 suggests that this region may be prone to mobilization. In the process, some other genes may also be shifted from the plasmid to the chromosome. Across <italic>Thermus</italic> genomes, this process has led either to stabilization of the megaplasmid, with the megaplasmid acquiring a large share of genes (SG0.5JP17-16), or to streamlining of the plasmid (<italic>T. scotoductus</italic>).</p>
</sec>
<sec>
<title>Conserved and variable gene repertoire of <italic>Thermus</italic> group</title>
<p>The core genome denotes the conserved functions, whereas the pan genome denotes the entire genetic potential of a group (Tettelin et al., <xref ref-type="bibr" rid="B84">2005</xref>). The core genome of the genus <italic>Thermus</italic> comprised 1,177 genes. Functional annotation of the core genome placed a high number of genes into categories coding for amino acid metabolism and transport (12.15%), translation (10.07%), energy production and conversion (7.75%) and coenzyme metabolism (6.07%), designating these as the conserved functions of the genus. The pan genome was estimated at 5,188 genes and included accessory genes and unique genes (singletons). Accessory genes are those whose orthologs are present in two or more genomes, but not in all genomes. Singletons are genes that are unique to just one of the genomes compared. The variability in the accessory genome reflects the flexibility of the genome structure. Accessory gene number varied from 661 to 1,166 (mean: 946). High numbers of accessory genes were observed in <italic>T. tengchongensis</italic> (1,166), <italic>T. brockianus</italic> (1,094), JL-18 (1,093), <italic>T. oshimai</italic> (1,053), <italic>T. scotoductus</italic> (1,032), SG0.5JP17-16 (1,027), and <italic>T. aquaticus</italic> (1,003) (Figure <xref ref-type="fig" rid="F5">5B</xref>). The number of unique genes (singletons) varied from 30 to 229 (mean: 98). A large number of singletons were identified in <italic>T. filiformis</italic> (229), <italic>T. islandicus</italic> (212), <italic>T. tengchongensis</italic> (175), and <italic>T. aquaticus</italic> (154) (Figure <xref ref-type="fig" rid="F5">5B</xref>). Of the genomes with a higher-than-mean number of accessory genes, 55.5% also had a high number of singletons. These genomes were <italic>T. aquaticus, T. brockianus, T. oshimai, T. scotoductus</italic>, and <italic>T. tengchongensis</italic>. The accessory and unique components of the genus were enriched in genes belonging to carbohydrate metabolism and transport (8.17 and 8.76% respectively), replication and repair (8.17 and 8.76% respectively), inorganic ion transport and metabolism (5.85 and 5.69% respectively) and signal transduction (5.71 and 5.40% respectively), which reflect the diverse functions harbored by these organisms. The pan genome of the genus was classified as &#x0201C;open&#x0201D; because no plateau was observed (Figure <xref ref-type="fig" rid="F5">5A</xref>) after addition of all genomes to the pan genome plot. Addition of more genomes to the group is therefore expected to expand the pan repertoire (Rouli et al., <xref ref-type="bibr" rid="B71">2015</xref>). The genus has thus maintained a conserved core genome, but an expansive and diverse pan genome. The flexibility of the genomes can be explained by a high influx of genes into these organisms via horizontal gene transfer. Although a number of features might be recruited and incorporated into the genome, only those that confer a survival benefit on the organism are likely to be retained; the rest are discarded by the highly active rearrangement events that continually take place in these genomes. Over time, extensive rearrangements have led to the establishment of genomic features that benefit survival of the organism, while dispensable elements were either shed or transferred to the plasmid. Overall, genome-wide differences and anticipatory adaptation of the gene repertoire to the requirements of a niche (niche specialization) can be considered driving forces in the evolution of the genus <italic>Thermus</italic>.</p>
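<p>The definitions of core, accessory and singleton genes given above amount to counting, for each gene family, the number of genomes in which it occurs. The sketch below illustrates that partition on a toy presence/absence table; the genome names and family identifiers are hypothetical, and the function is not the pipeline used in this study.</p>

```python
from collections import Counter


def partition_pan_genome(genomes):
    """Split gene families into core, accessory and singleton sets.

    `genomes` maps a genome name to the set of gene-family identifiers
    it carries. Assumes at least two genomes are compared.
    """
    counts = Counter(fam for fams in genomes.values() for fam in fams)
    n = len(genomes)
    core = {f for f, c in counts.items() if c == n}        # in every genome
    singletons = {f for f, c in counts.items() if c == 1}  # in exactly one
    accessory = set(counts) - core - singletons            # in 2 .. n-1
    return core, accessory, singletons


# Toy input with hypothetical identifiers.
toy = {
    "genome_1": {"famA", "famB", "famC"},
    "genome_2": {"famA", "famB", "famD"},
    "genome_3": {"famA", "famE"},
}
core, accessory, singletons = partition_pan_genome(toy)
pan_size = len(core) + len(accessory) + len(singletons)
```

<p>Here famA is core, famB is accessory and the remaining three families are singletons, giving a pan genome of five families. Whether a pan genome is open is then judged by recomputing the pan size as genomes are added and checking whether it reaches a plateau.</p>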
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Core and pan genome analysis of the genus <italic>Thermus</italic>. <bold>(A)</bold> Plot of pan and core genome. The plot represents a stabilized core structure but an open pan-genome; <bold>(B)</bold> Graph showing the number of core genes, accessory genes, unique genes and exclusively absent genes in all genomes under study.</p></caption>
<graphic xlink:href="fmicb-08-01410-g0005.tif"/>
</fig>
<p>To identify strain-specific, potentially exploitable attributes of <italic>T. parvatiensis</italic>, we identified its metagenomic islands (MGIs). The term MGI encompasses those regions of a genome that are identified by mapping metagenomic data from an environment against the genome of an organism isolated from the same niche (Pa&#x00161;i&#x00107; et al., <xref ref-type="bibr" rid="B65">2009</xref>). These regions stand out as &#x0201C;gaps&#x0201D; to which few or no reads map, thus highlighting the strain-specific potential that the organism has accumulated relative to its environmental counterparts. Such an analysis was performed for <italic>T. parvatiensis</italic> by aligning raw metagenomic data from hot spring water (Manikaran, India) onto the replicons (Figure <xref ref-type="fig" rid="F6">6</xref>). The <italic>T. parvatiensis</italic> chromosome showed relatively high read coverage (4 &#x000D7;), with five regions with no read coverage (MGIs) (Figure <xref ref-type="fig" rid="F6">6</xref>). Plasmid pTP143, however, had thinner read coverage (1.9 &#x000D7;) and five MGIs. Thus, plasmid pTP143 seems to have accumulated more strain-specific variation, which indicates high flexibility of the plasmid. These data further suggest that the chromosome in <italic>Thermus</italic> is more or less stable in terms of the genes it harbors, whereas much of the gene influx and rearrangement occurs via the plasmid. The chromosomal MGIs harbored genes for arginine biosynthesis, iron-sulfur cluster assembly proteins, transcriptional regulators, two-component system genes and hypothetical genes (Supplementary Table <xref ref-type="supplementary-material" rid="SM6">6</xref>). An MGI (4,811 bp) on the plasmid was found to harbor genes specifically involved in the response to environmental stress in general and oxidative stress in particular. These genes included a radical SAM (S-adenosylmethionine) domain heme biosynthesis protein (heme is a cofactor for hemoproteins that function in the electron transport chain), cytochrome c552, which is particularly involved in electron transport at low aeration, and peptide methionine sulfoxide reductase (<italic>MsrA</italic>), known to protect against reactive oxygen and nitrogen species (Weissbach et al., <xref ref-type="bibr" rid="B89">2002</xref>). Cobalamin biosynthesis genes were largely detected on the plasmid MGIs, including uroporphyrinogen-III methyltransferase, <italic>BluB</italic>, adenosylcobalamine-phosphate synthase and <italic>CbiG</italic>. Other prominently represented genes included plasmid stability genes (<italic>ParB, Soj</italic>, and <italic>StbB</italic>) and DNA repair genes (reverse gyrase, <italic>sbcC</italic> and <italic>sbcD</italic>). These genes implicate the low-aeration oxidative response, plasmid stability and DNA repair as strain-specific features. To gain insight into the prevalence of these specialized regions in other genomes, we performed BLASTn searches (identity &#x0003E; 95%, query coverage &#x0003E; 95%, e-value &#x0003C;1.00E-30) of the MGI regions against other <italic>Thermus</italic> genomes. Whereas genes on MGI1 of the chromosome did not show significant identity with other members of the genus, MGI2-5 of the chromosome had homologous counterparts in JL-18, SG0.5JP17-16, HB27, and <italic>T. brockianus</italic>. Plasmid MGIs 1, 3, 4, and 5 could similarly be identified in JL-18, SG0.5JP17-16, HB8, and HB27; however, plasmid MGI2 of pTP143 was unique, showing no significant similarity to other members. The genes annotated on these MGIs were thus conserved within the <italic>T. thermophilus</italic> group, indicating close adaptive relatedness. Chromosomal MGI1 and plasmid MGI2 were identified as strain-specific MGIs for <italic>T. parvatiensis</italic>. These attributes suggest the conservation of features that are not directly linked to the niche, but are retained in the genome as anticipated survival benefits.</p>
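<p>Locating MGIs as coverage gaps can be illustrated by scanning a per-base read-depth profile for runs of zero depth. The sketch below is an assumption-laden illustration rather than the procedure used here: the function name, the min_length parameter and the toy depth vector are hypothetical, and only strictly zero depth is treated as a gap.</p>

```python
def zero_coverage_regions(depth, min_length=1):
    """Return (start, end) half-open intervals where read depth is zero.

    `depth` is a per-base list of read depths along one replicon;
    `min_length` (an assumed parameter) discards very short gaps.
    """
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0 and start is None:
            start = pos                      # a gap opens here
        elif d != 0 and start is not None:
            if pos - start >= min_length:    # gap closed; keep if long enough
                regions.append((start, pos))
            start = None
    # Handle a gap that runs to the end of the replicon.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


# Toy depth profile for illustration only.
depth = [4, 5, 0, 0, 0, 3, 0, 2, 0, 0]
gaps = zero_coverage_regions(depth, min_length=2)
```

<p>With a minimum length of 2, the single uncovered base at position 6 is ignored and two gaps are reported. In practice the depth profile would come from the read alignment, and a minimum gap length would be needed to separate genuine islands from sporadic uncovered bases at low sequencing depth.</p>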
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Depiction of metagenomic islands recovered by recruitment of raw reads obtained from the metagenomic sequencing of hot spring water (Manikaran, India) onto <italic>T. parvatiensis</italic> plasmid <bold>(A)</bold> and chromosome <bold>(B)</bold>. MGIs on the plasmid pTP143 (5 MGIs) are highlighted in gray; MGIs on the chromosome of <italic>T. parvatiensis</italic> (5 MGIs) are marked with arrows. Reads mapped to the reference (identity &#x02265; 80%) are represented as blue (MNW1) and red (MNW2) dots.</p></caption>
<graphic xlink:href="fmicb-08-01410-g0006.tif"/>
</fig>
</sec>
</sec>
<sec>
<title>Genus specific survival strategies</title>
<sec>
<title>Abundance of CRISPR arrays</title>
<p>CRISPRs among <italic>Thermus</italic> species were analyzed to gain insight into the prevalence of viral defense systems within the genus. All genomes harbored CRISPR loci, except <italic>T. antranikianii</italic>. Nine CRISPR arrays were identified in <italic>T. filiformis</italic>, HB8, HB27 and <italic>T. igniterrae</italic>, followed by <italic>T. aquaticus</italic> which had 7 CRISPR arrays (Table <xref ref-type="table" rid="T3">3</xref>). <italic>T. parvatiensis</italic>, on the other hand, was found to harbor only 1 true CRISPR array. One questionable array was detected in <italic>T. islandicus</italic>. The number of CRISPR arrays varied from 0 to 9, with DR length varying from 25 to 37 bp across all genomes analyzed (mean CRISPR array count per genome: 5.529). The number of spacers harbored within CRISPR arrays reflects the frequency of viral invasions. <italic>T. oshimai</italic> was found to carry the highest number of spacers (134 spacers). A high frequency of viral attacks in these thermophiles was indicated by the high mean number of spacers harbored by each genome (mean: 72.7 spacers per genome). Five <italic>Thermus</italic> genomes (29.4%) harbored &#x0003E;100 CRISPR spacers. Within <italic>T. oshimai</italic> itself, two CRISPR arrays harboring large numbers of spacers were uncovered. One of the CRISPR arrays harbored 88 copies of a single DR consensus (CGGTCCATCCCCACGGGCGTGGGGACTAC; DR length: 29 bp), with an equally high number of spacers. The other array (DR consensus &#x0003D; CTTTGAACCGTACCTATAAGGGTTTGAAAC; DR length: 30 bp) had acquired 67 spacers. In contrast, within the same genus, an array containing only 3 spacers was also observed. This indicates high variation among CRISPR elements within the genus. <italic>Cas</italic> genes were extracted from true CRISPRs and annotated (Table <xref ref-type="table" rid="T3">3</xref>). The Cas system in <italic>Thermus</italic> was found to be composed of genes belonging to types I, III, and IV (Supplementary Figure <xref ref-type="supplementary-material" rid="SM12">3</xref>). The universal genes <italic>cas1</italic> and <italic>cas2</italic> were identified on CRISPR loci. In addition, the class I effector <italic>cas3</italic>, which is responsible for helicase and DNase activity, was also annotated (Table <xref ref-type="table" rid="T3">3</xref>). Thus, cleavage of foreign genetic elements entering <italic>Thermus</italic> is likely brought about by type I, III, and IV systems. Repeats were analyzed on the basis of sequence and structure (Figure <xref ref-type="fig" rid="F7">7</xref>). Sequence families 1 and 18 were predominant (found in 7 and 6 genomes respectively) (Supplementary Table <xref ref-type="supplementary-material" rid="SM8">8</xref>). Among structural motifs (based on the classification of motifs into 33 groups), motifs 25 and 5 were most represented (8 and 6 genomes respectively). Motif 6, motif 20, motif 23, and motif 31 were least represented (1 genome each) (Supplementary Tables <xref ref-type="supplementary-material" rid="SM7">7</xref>, <xref ref-type="supplementary-material" rid="SM8">8</xref>). Additionally, on the basis of repeat-cas binding projections, probable <italic>cas</italic> genes harbored by <italic>Thermus</italic> genomes were predicted and found to belong to types I, III, and IV. Separate cluster trees were constructed for structure motifs (based on the classification of structural motifs into 18 groups) to show the placement of <italic>Thermus</italic> repeats among all consensus repeats present in the database. A majority of repeats (10 repeats) mapped onto motif 1 and occupied a close phylogenetic position within the cluster tree (Supplementary Figure <xref ref-type="supplementary-material" rid="SM13">4</xref>). To identify the phages most frequently attacking <italic>Thermus</italic> genomes, we analyzed spacer matches against a viral sequence database (Supplementary Table <xref ref-type="supplementary-material" rid="SM9">9</xref>). Positive inferences were based only on results that satisfied the stringency criteria. Phages of several families were inferred to infect <italic>Thermus</italic> species (Figure <xref ref-type="fig" rid="F7">7B</xref>). Among these, the families <italic>Sphaerolipoviridae, Siphoviridae, Myoviridae</italic>, and <italic>Herpesviridae</italic> were the most prominently represented. Previously, <italic>Siphoviridae, Myoviridae, Inoviridae</italic>, and <italic>Tectiviridae</italic> have been reported from <italic>Thermus</italic> species (Yu et al., <xref ref-type="bibr" rid="B93">2006</xref>). However, our analysis revealed the dominance of <italic>Sphaerolipoviridae</italic> (28.1%), with the known thermophilic phages P23-77 and IN93 being prominently detected. A detailed list of viruses whose invasion memory is incorporated within <italic>Thermus</italic> CRISPRs is given in Supplementary Table <xref ref-type="supplementary-material" rid="SM9">9</xref>. Our analysis thus reveals an abundance of defense mechanisms and frequent viral encounters in this genus. Even though a large number of spacers (1,223) were analyzed, only 62 (5.07%) could be assigned significant matches in the viral database.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Summary of CRISPR elements found across all <italic>Thermus</italic> genomes under this study.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Species</bold></th>
<th valign="top" align="center"><bold>Confirmed CRISPRs</bold></th>
<th valign="top" align="center"><bold>DR length</bold></th>
<th valign="top" align="left"><bold>Spacers</bold></th>
<th valign="top" align="left"><bold><italic>Cas</italic> types</bold></th>
<th valign="top" align="left"><bold><italic>Cas</italic> genes</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>T. parvatiensis</italic></td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">37</td>
<td valign="top" align="left">9</td>
<td valign="top" align="left">Other</td>
<td valign="top" align="left"><italic>cas6, cas2</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic> HB27</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">32-37</td>
<td valign="top" align="left">6&#x0002B;3&#x0002B;15&#x0002B;7&#x0002B;7&#x0002B;6&#x0002B;6&#x0002B;13&#x0002B;9 &#x0003D; 72</td>
<td valign="top" align="left">I, III, IV</td>
<td valign="top" align="left"><italic>csa3, cas2, cas10, csm2gr11, csm3gr7, csx1, cmr3gr5, cmr4gr7, cmr5gr11, DinG, cas5, cas8c, cas7b, cas1, cas4, WYL, cas8b5, cas7, cas3, cas6</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic> HB8</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">29-37</td>
<td valign="top" align="left">14&#x0002B;4&#x0002B;3&#x0002B;12&#x0002B;9&#x0002B;12&#x0002B;23&#x0002B;20&#x0002B;12 &#x0003D; 109</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>csx1, cas1, cas2, cas10, csm2gr11, csm3gr7, csm3gr5, cmr4gr7, cmr5gr11, WYL, cas3HD, cas8e, cse2gr11, cas7, cas5, cas6e, cas1, cas8b5, cas3, cas6</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic> JL-18</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">29-37</td>
<td valign="top" align="left">21&#x0002B;23&#x0002B;7&#x0002B;25&#x0002B;19&#x0002B;5 &#x0003D; 100</td>
<td valign="top" align="left">I</td>
<td valign="top" align="left"><italic>WYL, cas3HD, cas8e, cse2gr11, cas7, cas5, cas6e, cas1, cas2, cas4, cas8b5, cas3, cas6, csa3, cas8U1, csb2gr5, casR</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. thermophilus</italic> SG0.5JP17-16</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">28-37</td>
<td valign="top" align="left">3&#x0002B;13&#x0002B;6&#x0002B;10&#x0002B;6&#x0002B;18 &#x0003D; 56</td>
<td valign="top" align="left">I</td>
<td valign="top" align="left"><italic>casR, cas7, csb2gr5, cas3, cas8U1, cas4, csa3, cas2, cas1, WYL, cas8b5, cas5, cas6, cas8c, cas7b</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. scotoductus</italic></td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">29-30</td>
<td valign="top" align="left">9&#x0002B;36&#x0002B;42 &#x0003D; 87</td>
<td valign="top" align="left">I</td>
<td valign="top" align="left"><italic>cas2, cas1, cas6e, cas5, cas7, cse2gr11, cas8e, cas3HD, WYL</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. oshimai</italic></td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">29-36</td>
<td valign="top" align="left">14&#x0002B;6&#x0002B;88&#x0002B;30&#x0002B;4 &#x0003D; 134</td>
<td valign="top" align="left">I, III, IV</td>
<td valign="top" align="left"><italic>cas6, csx1, csm3gr7, csm4gr5, csm2gr11, cas10, cas2, csx1, WYL, cas3HD, cas8e, cse2gr11, cas7, cas6e, cas1, csa3, DinG, cas3</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T</italic>. sp. CCB_US3_UF1</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">28-36</td>
<td valign="top" align="left">3&#x0002B;17&#x0002B;14&#x0002B;23&#x0002B;18&#x0002B;9&#x0002B;12 &#x0003D; 96</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>csm3gr7, csm5gr11, cmr4gr7, cmr3gr5, cas10, csx1, cas4, cas3HD, cas6, cas8b1, cas7, csm2gr11, csm4gr5, cas1</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. aquaticus</italic></td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">26-36</td>
<td valign="top" align="left">7&#x0002B;5&#x0002B;22&#x0002B;3&#x0002B;6&#x0002B;12&#x0002B;3 &#x0003D; 58</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>cas2, cas1, csx1, csa3, cas5, cas7, cas8b1, cas3HD, cas4, cas10, csm2gr11, csm3gr7, csm4gr5, cas6, csm5gr11, csm4gr7</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. brockianus</italic></td>
<td valign="top" align="center">8</td>
<td valign="top" align="center">29-37</td>
<td valign="top" align="left">9&#x0002B;14&#x0002B;7&#x0002B;10&#x0002B;12&#x0002B;8&#x0002B;25&#x0002B;27 &#x0003D; 112</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>cas6, cas10, csm2gr11, csm3gr7, csm4gr5, csx1, cas2, cas1, cmr3gr5, cmr4gr7, cmr5gr11, cas1, cas4, cas3, cas3HD, cas8b1, cas7b, cas5, cas8c, cas8e, cse2gr11, cas7, cas5, cas6e</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. antranikianii</italic></td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="left">0</td>
<td valign="top" align="left">&#x02014;</td>
<td valign="top" align="left">&#x02014;</td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. filiformis</italic></td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">25-37</td>
<td valign="top" align="left">3&#x0002B;24&#x0002B;4&#x0002B;5&#x0002B;6&#x0002B;22&#x0002B;5&#x0002B;12&#x0002B;7 &#x0003D; 88</td>
<td valign="top" align="left">I, III, IV</td>
<td valign="top" align="left"><italic>cas4, WYL, DinG, cas3HD, cas8e, cse2gr11, cas7, cas5, cas6e, csx1, cas10, cmr3gr5, csm3gr7, cmr4gr7, cmr4gr11, cas2, cas6, cmr2gr11, csm4gr5, casR, cas2, cas1, cas3D, cas8b1, cas7b</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. islandicus</italic></td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="left">0</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>cas1, csx1, cas6, cas3, cas7, cas8b5, WYL, cas8b1, cas7b, cas5, cas2, csa3, csm3gr7, csm4gr5, csm2gr11, cas10</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. igniterrae</italic></td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">28-36</td>
<td valign="top" align="left">5&#x0002B;23&#x0002B;3&#x0002B;11&#x0002B;3&#x0002B;4&#x0002B;15&#x0002B;17&#x0002B;28 &#x0003D; 109</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>csm3gr7, csm4gr5, csm2gr11, cas10, csx1, WYL, cas3HD, cas8e, cse2gr11, cas7, cas5, cas6e, cas1</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. caliditerrae</italic></td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">35-36</td>
<td valign="top" align="left">21&#x0002B;18&#x0002B;9&#x0002B;10 &#x0003D; 58</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>csm3gr7, cmr5gr11, cmr3gr5, cas10, cas6, csx1, csm2gr11, csm4gr5, cas2, cas1, cas3</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. amyloliquefaciens</italic></td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">29-36</td>
<td valign="top" align="left">15&#x0002B;5&#x0002B;9&#x0002B;6&#x0002B;18 &#x0003D; 53</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>csx1, csm4gr5, csm3gr7, csm2gr11, cas10, cas6, csx1, cas2, csa3, cas1, cas6e, cas5, cas7, cse2gr11, cas8e, cas3HD, WYL</italic></td>
</tr>
<tr>
<td valign="top" align="left"><italic>T. tengchongensis</italic></td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">30-37</td>
<td valign="top" align="left">67&#x0002B;3&#x0002B;12&#x0002B;9 &#x0003D; 91</td>
<td valign="top" align="left">I, III</td>
<td valign="top" align="left"><italic>csa3, csm2gr11, csa3, csa3, casR, cas7, cas8b1, cas3HD, cas4, cas3, cas5, cas8b5, WYL</italic></td>
</tr>
</tbody>
</table>
</table-wrap>
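<p>The per-genome spacer totals in Table 3 are sums over the individual arrays, and the number of confirmed arrays is the number of terms in each sum. As a small worked illustration, the sketch below recomputes both for three rows copied from the table; the variable names are arbitrary and the three-genome mean is shown only to illustrate the arithmetic, not as a genus-wide statistic.</p>

```python
from statistics import mean

# Per-array spacer counts copied from three rows of Table 3.
spacers_per_array = {
    "T. thermophilus HB27": [6, 3, 15, 7, 7, 6, 6, 13, 9],
    "T. scotoductus": [9, 36, 42],
    "T. caliditerrae": [21, 18, 9, 10],
}

# Total spacers per genome and number of confirmed arrays per genome.
totals = {name: sum(counts) for name, counts in spacers_per_array.items()}
arrays = {name: len(counts) for name, counts in spacers_per_array.items()}

# Mean spacers per genome over these three rows only.
mean_spacers = mean(totals.values())
```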
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>CRISPR repeat <bold>(A)</bold> and spacer <bold>(B)</bold> analyses. <bold>(A)</bold> Dendrogram depicting clustering of CRISPR array repeats from all genomes of <italic>Thermus</italic> spp. against CRISPR repeats database (4,719 consensus repeats). The classification of repeats is based upon the division of all repeats into 24 conserved sequence families and 18 conserved structural motifs. The position of branches representing repeats from <italic>Thermus</italic> spp. is marked by red arrows and depicted by a number which corresponds to a genome, as follows: 1. <italic>T. aquaticus</italic>, 2. <italic>T. amyloliquefaciens</italic>, 3. <italic>T. brockianus</italic>, 4. <italic>T. caliditerrae</italic>, 5. <italic>Thermus</italic> sp. CCB_US3_UF1, 6. <italic>T. filiformis</italic>, 7. <italic>T. thermophilus</italic> HB8, 8. <italic>T. thermophilus</italic> HB27, 9. <italic>T. igniterrae</italic>, 10. <italic>T. thermophilus</italic> JL-18, 11. <italic>T. oshimai</italic>, 12. <italic>T. scotoductus</italic>, 13. <italic>T. tengchongensis</italic>, 14. <italic>T. parvatiensis</italic>, 15. <italic>T. thermophilus</italic> SG0.5JP17-16. Concentric circles are color coded and represent the following from inside out respectively: Structure motifs, sequence families, CAS subtypes, taxonomic groups, superclasses. Black and blue color branches represent bacterial and archaeal sequences respectively. <bold>(B)</bold> Pie diagram representing the relative abundance of viral families infecting <italic>Thermus</italic> genomes, predicted on the basis of CRISPR spacer analysis.</p></caption>
<graphic xlink:href="fmicb-08-01410-g0007.tif"/>
</fig>
<p>CRISPRs constitute the characteristic prokaryotic and archaeal adaptive as well as inheritable immune system composed of short repeat sequences (direct repeats/DRs) interspaced with short segments of nucleotides known as spacers. Spacers represent the memory of past invasions by foreign genetic elements like viruses (Barrangou et al., <xref ref-type="bibr" rid="B7">2007</xref>) or plasmids (Marraffini and Sontheimer, <xref ref-type="bibr" rid="B54">2008</xref>). Spacers are incorporated into CRISPR loci whenever a bacteriophage infects the organism. This way, a CRISPR array can be considered as a library of past viral invasions faced by an organism. CRISPR arrays are associated with CRISPR associated genes (<italic>Cas</italic> genes), which are present in the vicinity of CRISPR arrays. Together with virus specific spacers, <italic>Cas</italic> genes encode an arsenal of proteins and RNAs, which in conjugation, destroy the foreign element, the next time it invades (Mojica et al., <xref ref-type="bibr" rid="B58">2005</xref>; Barrangou et al., <xref ref-type="bibr" rid="B7">2007</xref>). The CRISPR system is thus, a defense mechanism against bacteriophage invasions on the bacterial genome. CRISPR arrays that lack the requisite <italic>Cas</italic> genes in their vicinity are known as questionable or false CRISPRs. Predominance of viruses in thermophilic niches and consequently prevalence of CRISPRs in thermophilic genomes is known (Anderson et al., <xref ref-type="bibr" rid="B2">2011</xref>; Weinberger et al., <xref ref-type="bibr" rid="B88">2012b</xref>). We analyzed the CRISPR loci from all <italic>Thermus</italic> genomes in this study in order to investigate the probable viruses infecting the genus and to uncover the organization of <italic>cas</italic> genes. Our examination divulged the ubiquity of CRISPR arrays within the genus <italic>Thermus</italic>, reflecting a resilient viral defense system. 
The predominance of CRISPRs among the <italic>Thermus</italic> group suggests presence and activity of phages in thermophilic environments. A wide scenario of phage invasion in <italic>T. oshimai</italic> was particularly denoted by high number of spacers present in this species. The uneven distribution of CRISPR arrays within a group can be explained by the hypothesis that probably the cost of harboring CRISPR elements in particular bacteria outweighs the benefits harnessed (Weinberger et al., <xref ref-type="bibr" rid="B87">2012a</xref>). <italic>Cas</italic> genes, which are momentous to the functioning of CRISPRs, are divided into two classes (class I and II), depending on their mechanism of action. Class I CRISPR-Cas systems act by employing a number of Cas proteins which bring about the required action in a cascade of events. Class II systems, on the other hand, rely on single effector proteins for binding and cleavage of the target site. Based on the specific proteins involved, class I is further divided into types I, III and IV and class II is divided into types II and V. Widespread presence of type I (15 genomes) and type III (12 genomes) was observed in <italic>Thermus</italic>. Type IV systems were detected only in 3 genomes. Type IV systems have been known to be rare (&#x0003C;2%) in both bacteria and archaea (Makarova et al., <xref ref-type="bibr" rid="B53">2015</xref>). Evolutionary relationships were uncovered on the basis of repeat sequences by clustering repeat sequences into conserved sequence families and structural motifs (Lange et al., <xref ref-type="bibr" rid="B48">2013</xref>). CRISPR DRs transcribe repeat RNA sequences which serve as Cas protein binding templates. Repeat sequences show significant conservation in their sequence as well as hairpin structure forming motifs (Lange et al., <xref ref-type="bibr" rid="B48">2013</xref>). Sequence conservation has been used as a criterion for clustering of DRs into families. 
Similarly, structural motif grouping is based on RNA loop structures. Using this analysis, we were able to identify the distribution of sequence families and structural families within this genus. Interestingly, motifs 20 and 31, which were identified in <italic>T. filiformis</italic> and HB8, are known to include members from both the bacterial and archaeal domains. Viral diversity analysis delineated the most probable viral predators for the <italic>Thermus</italic> group. A high proportion of spacers, however, remained unassigned, implying a large viral genosphere that is as yet unexplored.</p>
</sec>
<sec>
<title>Genes imparting competence</title>
<p>Genes implicated in natural transformability of HB27 include <italic>PilA1, PilA2, PilA3, PilA4, ComZ, CinA, DprA, ComEA, ComEC, PilF, PilC, PilM-N, PilN-O-W, ComF, PilQ</italic>, and <italic>PilD</italic>. Apart from HB27, all of the above genes were harbored by <italic>T. oshimai</italic>, CCB_US3_UF1, <italic>T. islandicus</italic>, JL-18, and SG0.5JP17-16. In this study, all <italic>Thermus</italic> genomes showed a homogeneous profile for these genes, with the exception of <italic>PilA1-PilA4</italic> and <italic>ComZ</italic>. Among competence genes, the <italic>PilA</italic> gene plays a decisive role in efficient translocation (Schwarzenlander et al., <xref ref-type="bibr" rid="B76">2009</xref>). <italic>PilA1-A4</italic> genes and <italic>ComZ</italic> are present as a cluster in the genomes that harbor them; however, the arrangement and organization of the genes differ across genotypes. The <italic>PilA-ComZ</italic> locus was found in all <italic>Thermus</italic> genomes except <italic>T. igniterrae</italic> and <italic>T. antranikianii</italic>. Either these strains have not yet acquired the cluster or the cluster was missed during sequencing (draft genomes). The <italic>PilA-ComZ</italic> operon contains <italic>PilA1, A2, A3, A4</italic>, and <italic>ComZ</italic> as principal genes. The operon, however, also contains other genes with pilin/putative pilin/pseudopilin domains and genes coding for hypothetical proteins. The <italic>PilA-ComZ</italic> locus in <italic>Thermus</italic> was found to comprise 9&#x02013;12 genes, of which 4&#x02013;7 were pseudopilin genes with significant similarity to the <italic>PilA</italic> genes of strain HB27 (Figure <xref ref-type="fig" rid="F8">8</xref>). Apart from pseudopilin genes, the locus comprises genes coding for hypothetical proteins, chromosome segregation protein, apolipoprotein and DUF820 superfamily nucleases. 
The <italic>PilA1</italic> gene (present in 14 genomes) was found to be duplicated in 7 genomes. The two copies of <italic>PilA1</italic> were, however, not identical and showed identity ranging from 67 to 81%. By convention, we have named the <italic>PilA1</italic> of HB27 (471 bp) as copy 1. Copy 2 of <italic>PilA1</italic> (present in 7 strains) is a smaller gene ranging from 272 to 373 bp (mean &#x0003D; 329 bp). In <italic>T. caliditerrae</italic>, only a fragmented copy (135 bp) of <italic>PilA1</italic> was recovered. <italic>PilA2</italic> (582&#x02013;614 bp) was recovered in 7 strains as a complete copy. In the case of <italic>T. scotoductus</italic>, identity at the C- and N-termini and non-identity in the middle of the gene sequence was observed. A truncated version of <italic>PilA2</italic> was identified in <italic>T. tengchongensis</italic>. <italic>PilA3</italic> was recovered in a complete state (685&#x02013;746 bp) in nine strains. In <italic>T. scotoductus</italic>, a truncated <italic>PilA3</italic> was detected. The most deviant forms among <italic>PilA</italic> genes were observed for <italic>PilA4</italic>. Patterns of alignment among all <italic>PilA</italic> genes best illustrate this observation (Supplementary Figure <xref ref-type="supplementary-material" rid="SM14">5</xref>). The alignment features are most consistent (&#x0003E;70%) in the case of <italic>PilA1</italic> through <italic>PilA3</italic> genes. In the case of <italic>PilA4</italic>, similarity among genes is observed only at the C- and N-termini of the genes. <italic>PilA4</italic> of HB27 is 396 bp in size; however, in the other 13 strains harboring <italic>PilA4</italic>, only a small part (87&#x02013;121 bp) showed significant similarity (&#x0003E;80%, e-value: 1E-15) to <italic>PilA4</italic> of HB27. Additionally, truncated versions of <italic>PilA4</italic> were observed in <italic>T. aquaticus, T. caliditerrae</italic>, and <italic>T. amyloliquefaciens</italic>.</p>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Organization of putatively horizontally transferred <italic>PilA-ComZ</italic> locus genes from all <italic>Thermus</italic> genomes in this study. Blue, Pilus family proteins; Gray, Nucleases; Tan, Hypothetical/Conserved proteins; Maroon, Transposase/Conjugative element; Rhomboid, unformed CDS; Navy blue, Toxin-antitoxin elements; Black triangles, tRNAs.</p></caption>
<graphic xlink:href="fmicb-08-01410-g0008.tif"/>
</fig>
<p>A complete <italic>PilA-ComZ</italic> locus was identified in HB27, <italic>T. oshimai</italic>, CCB_US3_UF1, JL-18, SG0.5JP17-16, and <italic>T. islandicus</italic>. However, in other genomes, we could identify the presence of &#x0201C;genetic scars.&#x0201D; Genetic scars are truncated genes or pseudogenes (without start/stop codons) that nevertheless show significant identity to functional <italic>PilA</italic> gene regions of HB27 (taken as a reference for all comparisons here). Two genetic scars were identified in <italic>T. scotoductus</italic> (<italic>A3</italic> and <italic>A2</italic>), two in <italic>T. caliditerrae</italic> (<italic>A1</italic> and <italic>A4</italic>), one in <italic>T. amyloliquefaciens</italic> (<italic>A4</italic>), one in <italic>T. aquaticus</italic> (<italic>A4</italic>) and one in <italic>T. tengchongensis</italic> (<italic>A2</italic>). Interestingly, an IS4 family transposase, which is known to be found widely across the <italic>Thermus</italic> genomes, was found incorporated in the <italic>PilA</italic>-<italic>ComZ</italic> locus of <italic>T. aquaticus</italic>. This may be evidence of recent transposition activity at the locus. Further evidence is the presence of a conjugative protein in the <italic>PilA</italic>-<italic>ComZ</italic> locus of HB27. Interestingly, a toxin-antitoxin (TA) gene system was found to be incorporated into the competence locus of <italic>T. caliditerrae</italic> and showed similarity to a TA pair in <italic>Rhodothermus marinus</italic>. Horizontally transferred regions are generally found to be associated with tRNA genes (Darmon and Leach, <xref ref-type="bibr" rid="B19">2014</xref>). As expected, tRNA genes were located in close proximity to the <italic>PilA-ComZ</italic> loci of 10 genomes in this study. 
Of the ten genomes in which tRNA genes were associated with the competence locus, the locus is still not fully formed in seven, lending support to the hypothesis of recent horizontal acquisition of the locus. Genetic relatedness of <italic>Thermus</italic> strains on the basis of <italic>PilA1</italic>-<italic>A4</italic> genes was reconstructed to infer the evolutionary history of <italic>PilA</italic> genes on the basis of their sequence development (Supplementary Figure <xref ref-type="supplementary-material" rid="SM15">6</xref>). <italic>PilA1</italic> copy1 and copy2 gene trees highlighted strains CCB_US3_UF1 and JL-18 as outliers in both trees and <italic>T. parvatiensis</italic> as an additional outlier for the <italic>PilA1</italic> copy1 gene. CCB_US3_UF1 and JL-18 retained their outlier positions in all other trees. In the case of <italic>PilA2</italic> and <italic>PilA3</italic> sequences, <italic>T. scotoductus</italic> emerged as an outlier. For <italic>PilA4</italic>, <italic>T. parvatiensis</italic>, CCB_US3_UF1, <italic>T. scotoductus</italic> and JL-18 lie on the outer branches of the dendrogram.</p>
<p>Genes responsible for imparting competence in the genus <italic>Thermus</italic> can be divided into two groups: the first group is responsible for uptake of DNA and the second for transport of DNA into the cell. Some of these genes belong to the T4P family of proteins, which form a complex spanning the cell envelope: the S-layer, the outer membrane (OM), the periplasm, which contains secondary cell wall polymers (SCWP) and peptidoglycan (PG), and the inner membrane (IM). Sixteen genes are known to play a role in natural transformation in HB27, whose transformation machinery is the most extensively studied. PilQ is a secretin which forms a macromolecular homopolymeric complex on the outer membrane (Burkhardt et al., <xref ref-type="bibr" rid="B12">2011</xref>) and binds to the incoming DNA. Once DNA reaches the periplasmic space, the pseudopilus proteins (PilA1 through PilA4) come into play, forming subunits of the DNA translocator complex present in the periplasm. PilA1-PilA4 pilins form a shaft-like structure in the periplasm which is attached to the outer membrane through PilQ (Burkhardt et al., <xref ref-type="bibr" rid="B12">2011</xref>) and to the inner membrane through a motor ATPase, PilF. PilF uses the energy of ATP hydrolysis to draw DNA toward the inner membrane (Rose et al., <xref ref-type="bibr" rid="B69">2011</xref>; Collins et al., <xref ref-type="bibr" rid="B17">2013</xref>). Double stranded DNA further binds to the inner membrane protein ComEA. An inner membrane channel protein, ComEC (co-transcribed along with ComEA), has recently been shown to modulate the expression of <italic>PilA4</italic> and <italic>PilN</italic> in relation to environmental cues like nutrient limitation and low temperature (Salzer et al., <xref ref-type="bibr" rid="B74">2016</xref>). The DNA passing through the IM is single stranded. Hence, a nuclease must act on the double stranded DNA to make it single stranded. 
This nuclease has not been identified yet.</p>
<p>A stable arrangement of the <italic>PilA-ComZ</italic> locus could be observed in <italic>T. oshimai</italic>, CCB_US3_UF1, JL-18, SG0.5JP17-16, and <italic>T. islandicus</italic>. The <italic>PilA</italic>-<italic>ComZ</italic> locus represents a small horizontally acquired region (genomic islet) on the chromosome. It can be distinguished by a high number of hypothetical protein coding genes, gene duplications and an association with tRNA genes (Darmon and Leach, <xref ref-type="bibr" rid="B19">2014</xref>). Gene duplication events observed in the case of <italic>PilA1</italic> genes demonstrate species radiation forces and amenability of the genomes to evolutionary forces (Roth et al., <xref ref-type="bibr" rid="B70">2007</xref>). The presence of a transposase in <italic>T. aquaticus</italic>, a conjugative element in HB27, a TA element in <italic>T. caliditerrae</italic> and truncated <italic>PilA</italic> genes (genetic scars) at the locus indicates recent horizontal origins. Truncated <italic>PilA4</italic> genes observed across this genus suggest that the gene has either not fully developed or has undergone degradation, as indicated by genetic scars that show similarity to the N-/C-termini of <italic>PilA4</italic> of HB27, but not to the complete gene. The close association of a TA system with the <italic>PilA</italic>-<italic>ComZ</italic> cluster of <italic>T. caliditerrae</italic> reflects recent acquisition of this cluster. TA systems have been found to be associated with genomic islands and other mobile elements. Association with a TA system promotes the maintenance of a horizontally acquired island and its stabilization in the host genome (Rowe-Magnus et al., <xref ref-type="bibr" rid="B72">2003</xref>; Iqbal et al., <xref ref-type="bibr" rid="B38">2015</xref>). Generally, horizontally transferred loci are marked with pseudogenes as there is a strong selection pressure against these regions (Hao and Golding, <xref ref-type="bibr" rid="B34">2010</xref>). 
Pseudogenes are non-functional versions of previously functional genes, which are in the process of being lost from the genome (Hao and Golding, <xref ref-type="bibr" rid="B34">2010</xref>). Pseudogenes are known to be associated with recently laterally acquired regions or failed HGT events (Hao and Golding, <xref ref-type="bibr" rid="B34">2010</xref>). The predominance of pseudogenes can be due to the high rates of gene turnover in laterally acquired regions. Some of the truncated genes may get stabilized and some eliminated in due course of time. The evidence thus provided leads us to believe that the competence machinery in <italic>Thermus</italic> is of horizontal origin and, in the course of evolution, may get stabilized or eliminated. The presence of a highly efficient transformation system, however, does not ensure the incorporation of incoming DNA into the host genetic material. Natural transformation in native conditions is activated during environmental challenges such as starvation, wherein DNA is taken up from the environment (Seitz and Blokesch, <xref ref-type="bibr" rid="B78">2013</xref>). Most of the nucleic acid taken up during this process is used to fulfill nutritional requirements. In this process, some amount of DNA may get incorporated into the host genome, thus diversifying the host pan-genome repertoire and expanding the already diverse arsenal of the <italic>Thermus</italic> group.</p>
</sec>
<sec>
<title>Choice between natural competence and viral resistance</title>
<p>The pilus structure in <italic>Thermus</italic> imparted by the T4P genes plays a role not only in twitching motility and natural competence, but also in bacteriophage infection. <italic>PilA</italic> mutants of strains HB27 and HB8 have been shown not only to lose twitching motility and natural competence, but also to become resistant to phage infection (Tamakoshi et al., <xref ref-type="bibr" rid="B81">2011</xref>). On the basis of our comparative analysis for the <italic>Thermus</italic> group, we propose a link between pilus gene diversification and CRISPR abundance. Our data suggest continued acquisition and evolution of pilus gene structure among the analyzed <italic>Thermus</italic> genomes. Along with this, a CRISPR system with a high number of spacers suggests a robust immune machinery against bacteriophages. In the case of <italic>T. islandicus</italic>, only one questionable CRISPR array was observed, and no CRISPR array was observed in the case of <italic>T. antranikianii</italic>. Interestingly, a <italic>PilA</italic>-<italic>ComZ</italic> locus was also absent in <italic>T. antranikianii</italic>. Taken together, the two observations suggest that <italic>T. antranikianii</italic> might be resistant to a large proportion of viruses owing to the lack of pilus system genes, which are implicated in phage entry, thus avoiding the need to harbor a CRISPR system. Other <italic>Thermus</italic> members have, on the other hand, chosen pilus mediated natural transformation as an important evolutionary trait, even though it makes them more susceptible to phage attacks, leading them to harbor CRISPR defense systems more frequently. Thus, natural transformation may be regarded as an overall benefit imparting trait in the small thermophilic genomes of <italic>Thermus</italic>. Natural transformation has long played a role in the survival of these organisms. 
Therefore, the dispersal of this system is a rather favorable phenomenon within this genus, even though it imposes the additional cost of harboring CRISPR machinery.</p>
</sec>
</sec>
</sec>
<sec sec-type="conclusions" id="s4">
<title>Conclusion</title>
<p>Organisms belonging to the genus <italic>Thermus</italic> have occupied a significant position and have broadened present knowledge about thermophilic survival. <italic>T. parvatiensis</italic>, in accordance with its affiliation to the genus, has maintained a small genome and a plastic plasmid. Plasmids of <italic>Thermus</italic> are hotspots for genome dynamism, acting as centers of influx as well as efflux of genes and pathways in this genus. A dynamic pan genome along with strain specific gene reservoirs signifies acquisition and conservation of favorable attributes. One of the factors contributing to this dynamism is the active natural transformation system of this genus. The natural competence machinery in <italic>Thermus</italic> has proved to be an overall advantageous trait for two reasons: coping with nutrient limitation and generating genetic variability. It has, however, made these organisms susceptible to viral grazing, leading to the development of a viral defense arsenal, known as CRISPRs. The efficacy of the choices made has led to proficient sustenance of this genus in the face of adversity and beyond.</p>
</sec>
<sec id="s5">
<title>Ethics statement</title>
<p>This article does not contain any studies with human participants or animals performed by any of the authors.</p>
</sec>
<sec id="s6">
<title>Author contributions</title>
<p>RL conceived the study and supervised manuscript preparation. CT performed the analysis except CRISPR analysis. HM performed CRISPR analysis. HK prepared all tables and figures. VD, RN and KK helped in data interpretation and drafting of the manuscript.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</sec>
</body>
<back>
<ack><p>CT and HK gratefully acknowledge Council of Scientific and Industrial Research and Indian Council of Medical Research respectively for providing research fellowships.</p>
</ack>
<sec sec-type="supplementary-material" id="s7">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="http://journal.frontiersin.org/article/10.3389/fmicb.2017.01410/full#supplementary-material">http://journal.frontiersin.org/article/10.3389/fmicb.2017.01410/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table1.PDF" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table2.PDF" id="SM2" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table3.PDF" id="SM3" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table4.PDF" id="SM4" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table5.PDF" id="SM5" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table6.PDF" id="SM6" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table7.PDF" id="SM7" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table8.PDF" id="SM8" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table9.PDF" id="SM9" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image1.jpg" id="SM10" mimetype="image/jpeg" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image2.jpg" id="SM11" mimetype="image/jpeg" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image3.jpg" id="SM12" mimetype="image/jpeg" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image4.jpg" id="SM13" mimetype="image/jpeg" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image5.jpg" id="SM14" mimetype="image/jpeg" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image6.jpg" id="SM15" mimetype="image/jpeg" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altschul</surname> <given-names>S. F.</given-names></name> <name><surname>Gish</surname> <given-names>W.</given-names></name> <name><surname>Miller</surname> <given-names>W.</given-names></name> <name><surname>Myers</surname> <given-names>E. W.</given-names></name> <name><surname>Lipman</surname> <given-names>D. J.</given-names></name></person-group> (<year>1990</year>). <article-title>Basic local alignment search tool</article-title>. <source>J. Mol. Biol.</source> <volume>215</volume>, <fpage>403</fpage>&#x02013;<lpage>410</lpage>. <pub-id pub-id-type="doi">10.1016/S0022-2836(05)80360-2</pub-id><pub-id pub-id-type="pmid">2231712</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Anderson</surname> <given-names>R. E.</given-names></name> <name><surname>Brazelton</surname> <given-names>W. J.</given-names></name> <name><surname>Baross</surname> <given-names>J. A.</given-names></name></person-group> (<year>2011</year>). <article-title>Using CRISPRs as a metagenomic tool to identify microbial hosts of a diffuse flow hydrothermal vent</article-title>. <source>FEMS Microbiol. Ecol.</source> <volume>77</volume>, <fpage>120</fpage>&#x02013;<lpage>133</lpage>. <pub-id pub-id-type="doi">10.1111/j.1574-6941.2011.01090.x</pub-id><pub-id pub-id-type="pmid">21410492</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arndt</surname> <given-names>D.</given-names></name> <name><surname>Grant</surname> <given-names>J. R.</given-names></name> <name><surname>Marcu</surname> <given-names>A.</given-names></name> <name><surname>Sajed</surname> <given-names>T.</given-names></name> <name><surname>Pon</surname> <given-names>A.</given-names></name> <name><surname>Liang</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>PHASTER: a better, faster version of the PHAST phage search tool</article-title>. <source>Nucleic Acids Res.</source> <volume>44</volume>, <fpage>W16</fpage>&#x02013;<lpage>W21</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkw387</pub-id><pub-id pub-id-type="pmid">27141966</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Auch</surname> <given-names>A. F.</given-names></name> <name><surname>Von Jan</surname> <given-names>M.</given-names></name> <name><surname>Klenk</surname> <given-names>H. P.</given-names></name> <name><surname>G&#x000F6;ker</surname> <given-names>M.</given-names></name></person-group> (<year>2010</year>). <article-title>Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison</article-title>. <source>Stand. Genomic Sci.</source> <volume>2</volume>, <fpage>117</fpage>&#x02013;<lpage>134</lpage>. <pub-id pub-id-type="doi">10.4056/sigs.531120</pub-id><pub-id pub-id-type="pmid">21304684</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Averhoff</surname> <given-names>B.</given-names></name></person-group> (<year>2009</year>). <article-title>Shuffling genes around in hot environments: the unique DNA transporter of <italic>Thermus thermophilus</italic></article-title>. <source>FEMS Microbiol. Rev.</source> <volume>33</volume>, <fpage>611</fpage>&#x02013;<lpage>626</lpage>. <pub-id pub-id-type="doi">10.1111/j.1574-6976.2008.00160.x</pub-id><pub-id pub-id-type="pmid">19207744</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aziz</surname> <given-names>R. K.</given-names></name> <name><surname>Bartels</surname> <given-names>D.</given-names></name> <name><surname>Best</surname> <given-names>A. A.</given-names></name> <name><surname>DeJongh</surname> <given-names>M.</given-names></name> <name><surname>Disz</surname> <given-names>T.</given-names></name> <name><surname>Edwards</surname> <given-names>R. A.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>The RAST server: rapid annotations using subsystems technology</article-title>. <source>BMC Genomics.</source> <volume>9</volume>:<fpage>75</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-9-75</pub-id><pub-id pub-id-type="pmid">18261238</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barrangou</surname> <given-names>R.</given-names></name> <name><surname>Fremaux</surname> <given-names>C.</given-names></name> <name><surname>Deveau</surname> <given-names>H.</given-names></name> <name><surname>Richards</surname> <given-names>M.</given-names></name> <name><surname>Boyaval</surname> <given-names>P.</given-names></name> <name><surname>Moineau</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2007</year>). <article-title>CRISPR provides acquired resistance against viruses in prokaryotes</article-title>. <source>Science</source> <volume>315</volume>, <fpage>1709</fpage>&#x02013;<lpage>1712</lpage>. <pub-id pub-id-type="doi">10.1126/science.1138140</pub-id><pub-id pub-id-type="pmid">17379808</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blank</surname> <given-names>S.</given-names></name> <name><surname>Schr&#x000F6;der</surname> <given-names>C.</given-names></name> <name><surname>Schirrmacher</surname> <given-names>G.</given-names></name> <name><surname>Reisinger</surname> <given-names>C.</given-names></name> <name><surname>Antranikian</surname> <given-names>G.</given-names></name></person-group> (<year>2014</year>). <article-title>Biochemical characterization of a recombinant xylanase from <italic>Thermus brockianus</italic>, suitable for biofuel production</article-title>. <source>JSM Biotechnol. Biomed. Eng.</source> <volume>2</volume>:<fpage>1027</fpage>.</citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brock</surname> <given-names>T. D.</given-names></name> <name><surname>Freeze</surname> <given-names>H.</given-names></name></person-group> (<year>1969</year>). <article-title><italic>Thermus aquaticus</italic> gen. n. and sp. <italic>n</italic>., a nonsporulating extreme thermophile</article-title>. <source>J. Bacteriol.</source> <volume>98</volume>, <fpage>289</fpage>&#x02013;<lpage>297</lpage>. <pub-id pub-id-type="pmid">5781580</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname> <given-names>N. L.</given-names></name> <name><surname>Stoyanov</surname> <given-names>J. V.</given-names></name> <name><surname>Kidd</surname> <given-names>S. P.</given-names></name> <name><surname>Hobman</surname> <given-names>J. L.</given-names></name></person-group> (<year>2003</year>). <article-title>The MerR family of transcriptional regulators</article-title>. <source>FEMS Microbiol. Rev.</source> <volume>27</volume>, <fpage>145</fpage>&#x02013;<lpage>163</lpage>. <pub-id pub-id-type="doi">10.1016/S0168-6445(03)00051-2</pub-id><pub-id pub-id-type="pmid">12829265</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bruggemann</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>C.</given-names></name></person-group> (<year>2006</year>). <article-title>Comparative genomics of <italic>Thermus thermophilus</italic>: plasticity of the megaplasmid and its contribution to a thermophilic lifestyle</article-title>. <source>J. Biotechnol.</source> <volume>124</volume>, <fpage>654</fpage>&#x02013;<lpage>661</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbiotec.2006.03.043</pub-id><pub-id pub-id-type="pmid">16713647</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Burkhardt</surname> <given-names>J.</given-names></name> <name><surname>Vonck</surname> <given-names>J.</given-names></name> <name><surname>Averhoff</surname> <given-names>B.</given-names></name></person-group> (<year>2011</year>). <article-title>Structure and function of PilQ, a secretin of the DNA transporter from the thermophilic bacterium <italic>Thermus thermophilus</italic> HB27</article-title>. <source>J. Biol. Chem.</source> <volume>286</volume>, <fpage>9977</fpage>&#x02013;<lpage>9988</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M110.212688</pub-id><pub-id pub-id-type="pmid">21285351</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Carballeira</surname> <given-names>N.</given-names></name> <name><surname>Nazabal</surname> <given-names>M.</given-names></name> <name><surname>Brito</surname> <given-names>J.</given-names></name> <name><surname>Garcia</surname> <given-names>O.</given-names></name></person-group> (<year>1990</year>). <article-title>Purification of a thermostable DNA polymerase from <italic>Thermus thermophilus</italic> HB8, useful in the polymerase chain reaction</article-title>. <source>Biotechniques</source> <volume>9</volume>, <fpage>276</fpage>&#x02013;<lpage>281</lpage>. <pub-id pub-id-type="pmid">2223065</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chaudhari</surname> <given-names>N. M.</given-names></name> <name><surname>Gupta</surname> <given-names>V. K.</given-names></name> <name><surname>Dutta</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>BPGA- an ultra-fast pan-genome analysis pipeline</article-title>. <source>Sci. Rep.</source> <volume>6</volume>:<fpage>24373</fpage>. <pub-id pub-id-type="doi">10.1038/srep24373</pub-id><pub-id pub-id-type="pmid">27071527</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chin</surname> <given-names>C.</given-names></name> <name><surname>Alexander</surname> <given-names>D. H.</given-names></name> <name><surname>Marks</surname> <given-names>P.</given-names></name> <name><surname>Klammer</surname> <given-names>A. A.</given-names></name> <name><surname>Drake</surname> <given-names>J.</given-names></name> <name><surname>Heiner</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data</article-title>. <source>Nat. Methods</source> <volume>10</volume>, <fpage>563</fpage>&#x02013;<lpage>569</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.2474</pub-id><pub-id pub-id-type="pmid">23644548</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chung</surname> <given-names>A. P.</given-names></name> <name><surname>Rainey</surname> <given-names>F. A.</given-names></name> <name><surname>Valente</surname> <given-names>M.</given-names></name> <name><surname>Nobre</surname> <given-names>M. F.</given-names></name> <name><surname>daCosta</surname> <given-names>M. S.</given-names></name></person-group> (<year>2000</year>). <article-title><italic>Thermus igniterrae</italic> sp. nov. and <italic>Thermus antranikianii</italic> sp. nov., two new species from Iceland</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>50</volume>, <fpage>209</fpage>&#x02013;<lpage>217</lpage>. <pub-id pub-id-type="doi">10.1099/00207713-50-1-209</pub-id><pub-id pub-id-type="pmid">10826806</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Collins</surname> <given-names>R. F.</given-names></name> <name><surname>Hassan</surname> <given-names>D.</given-names></name> <name><surname>Karuppiah</surname> <given-names>V.</given-names></name> <name><surname>Thistlethwaite</surname> <given-names>A.</given-names></name> <name><surname>Derrick</surname> <given-names>J. P.</given-names></name></person-group> (<year>2013</year>). <article-title>Structure and mechanism of the PilF DNA transformation ATPase from <italic>Thermus thermophilus</italic></article-title>. <source>Biochem. J.</source> <volume>450</volume>, <fpage>417</fpage>&#x02013;<lpage>425</lpage>. <pub-id pub-id-type="doi">10.1042/BJ20121599</pub-id><pub-id pub-id-type="pmid">23252471</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Darling</surname> <given-names>A. C. E.</given-names></name> <name><surname>Mau</surname> <given-names>B.</given-names></name> <name><surname>Perna</surname> <given-names>N. T.</given-names></name></person-group> (<year>2010</year>). <article-title>progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement</article-title>. <source>PLoS ONE</source> <volume>5</volume>:<fpage>e11147</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0011147</pub-id><pub-id pub-id-type="pmid">20593022</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Darmon</surname> <given-names>E.</given-names></name> <name><surname>Leach</surname> <given-names>D. R. F.</given-names></name></person-group> (<year>2014</year>). <article-title>Bacterial genome instability</article-title>. <source>Microbiol. Mol. Biol. Rev.</source> <volume>78</volume>, <fpage>1</fpage>&#x02013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1128/MMBR.00035-13</pub-id><pub-id pub-id-type="pmid">24600039</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Deng</surname> <given-names>W.</given-names></name> <name><surname>Nickle</surname> <given-names>D. C.</given-names></name> <name><surname>Learn</surname> <given-names>G. H.</given-names></name> <name><surname>Maust</surname> <given-names>B.</given-names></name> <name><surname>Mullins</surname> <given-names>J. I.</given-names></name></person-group> (<year>2007</year>). <article-title>ViroBLAST: a stand-alone BLAST web server for flexible queries of multiple databases and user&#x00027;s datasets</article-title>. <source>Bioinformatics</source> <volume>23</volume>, <fpage>2334</fpage>&#x02013;<lpage>2336</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btm331</pub-id><pub-id pub-id-type="pmid">17586542</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dhillon</surname> <given-names>B. K.</given-names></name> <name><surname>Laird</surname> <given-names>M. R.</given-names></name> <name><surname>Shay</surname> <given-names>J. A.</given-names></name> <name><surname>Winsor</surname> <given-names>G. L.</given-names></name> <name><surname>Lo</surname> <given-names>R.</given-names></name> <name><surname>Nizam</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>IslandViewer 3: more flexible, interactive genomic island discovery, visualization and analysis</article-title>. <source>Nucleic Acids Res.</source> <volume>43</volume>, <fpage>W104</fpage>&#x02013;<lpage>W108</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkv401</pub-id><pub-id pub-id-type="pmid">25916842</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doyle</surname> <given-names>J. J.</given-names></name> <name><surname>Doyle</surname> <given-names>J. L.</given-names></name></person-group> (<year>1990</year>). <article-title>Isolation of plant DNA from fresh tissue</article-title>. <source>Focus</source> <volume>12</volume>, <fpage>13</fpage>&#x02013;<lpage>15</lpage>.</citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dwivedi</surname> <given-names>V.</given-names></name> <name><surname>Kumari</surname> <given-names>K.</given-names></name> <name><surname>Gupta</surname> <given-names>S. K.</given-names></name> <name><surname>Kumari</surname> <given-names>R.</given-names></name> <name><surname>Tripathi</surname> <given-names>C.</given-names></name> <name><surname>Lata</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title><italic>Thermus parvatiensis</italic> RL<sup>T</sup> sp. nov., isolated from a hot water spring, located atop the Himalayan ranges at Manikaran, India</article-title>. <source>Indian J. Microbiol.</source> <volume>55</volume>:<fpage>357</fpage>. <pub-id pub-id-type="doi">10.1007/s12088-015-0538-4</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dwivedi</surname> <given-names>V.</given-names></name> <name><surname>Sangwan</surname> <given-names>N.</given-names></name> <name><surname>Nigam</surname> <given-names>A.</given-names></name> <name><surname>Garg</surname> <given-names>N.</given-names></name> <name><surname>Niharika</surname> <given-names>N.</given-names></name> <name><surname>Khurana</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Draft genome sequence of <italic>Thermus</italic> sp. RL isolated from hot water spring located atop the Himalayan ranges at Manikaran, India</article-title>. <source>J. Bacteriol.</source> <volume>194</volume>:<fpage>3534</fpage>. <pub-id pub-id-type="doi">10.1128/JB.00604-12</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname> <given-names>R. C.</given-names></name></person-group> (<year>2004</year>). <article-title>MUSCLE: multiple sequence alignment with high accuracy and throughput</article-title>. <source>Nucleic Acids Res.</source> <volume>32</volume>, <fpage>1792</fpage>&#x02013;<lpage>1797</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id><pub-id pub-id-type="pmid">15034147</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname> <given-names>R. C.</given-names></name></person-group> (<year>2010</year>). <article-title>Search and clustering orders of magnitude faster than BLAST</article-title>. <source>Bioinformatics</source> <volume>26</volume>, <fpage>2460</fpage>&#x02013;<lpage>2461</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btq461</pub-id><pub-id pub-id-type="pmid">20709691</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Engelke</surname> <given-names>D. R.</given-names></name> <name><surname>Krikos</surname> <given-names>A.</given-names></name> <name><surname>Bruck</surname> <given-names>M. E.</given-names></name> <name><surname>Ginsburg</surname> <given-names>D.</given-names></name></person-group> (<year>1990</year>). <article-title>Purification of <italic>Thermus aquaticus</italic> DNA polymerase expressed in <italic>Escherichia coli</italic></article-title>. <source>Anal. Biochem.</source> <volume>191</volume>, <fpage>396</fpage>&#x02013;<lpage>400</lpage>. <pub-id pub-id-type="doi">10.1016/0003-2697(90)90238-5</pub-id><pub-id pub-id-type="pmid">2085185</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Felsenstein</surname> <given-names>J.</given-names></name></person-group> (<year>1993</year>). <source>PHYLIP (Phylogeny Inference Package), Version 3.5c</source>. <publisher-loc>Distributed by the author. Seattle, WA</publisher-loc>: <publisher-name>Department of Genome Sciences, University of Washington</publisher-name>.</citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friedrich</surname> <given-names>A.</given-names></name> <name><surname>Prust</surname> <given-names>C.</given-names></name> <name><surname>Hartsch</surname> <given-names>T.</given-names></name> <name><surname>Henne</surname> <given-names>A.</given-names></name> <name><surname>Averhoff</surname> <given-names>B.</given-names></name></person-group> (<year>2002</year>). <article-title>Molecular analyses of the natural transformation machinery and identification of pilus structures in the extremely thermophilic bacterium <italic>Thermus thermophilus</italic> strain HB27</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>68</volume>, <fpage>745</fpage>&#x02013;<lpage>755</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.68.2.745-755.2002</pub-id><pub-id pub-id-type="pmid">11823215</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>F.</given-names></name> <name><surname>Zhang</surname> <given-names>C. T.</given-names></name></person-group> (<year>2008</year>). <article-title>Ori-Finder: a web based system for finding <italic>oriC</italic>s in unannotated bacterial genomes</article-title>. <source>BMC Bioinformatics</source> <volume>9</volume>:<fpage>79</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-9-79</pub-id><pub-id pub-id-type="pmid">18237442</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gounder</surname> <given-names>K.</given-names></name> <name><surname>Brzuszkiewicz</surname> <given-names>E.</given-names></name> <name><surname>Liesegang</surname> <given-names>H.</given-names></name> <name><surname>Wollherr</surname> <given-names>A.</given-names></name> <name><surname>Daniel</surname> <given-names>R.</given-names></name> <name><surname>Gottschalk</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Sequence of the hyperplastic genome of the naturally competent <italic>Thermus scotoductus</italic> SA-01</article-title>. <source>BMC Genomics</source> <volume>12</volume>:<fpage>577</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-12-577</pub-id><pub-id pub-id-type="pmid">22115438</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grissa</surname> <given-names>I.</given-names></name> <name><surname>Vergnaud</surname> <given-names>G.</given-names></name> <name><surname>Pourcel</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats</article-title>. <source>Nucleic Acids Res.</source> <volume>35</volume>, <fpage>W52</fpage>&#x02013;<lpage>W57</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkm360</pub-id><pub-id pub-id-type="pmid">17537822</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guindon</surname> <given-names>S.</given-names></name> <name><surname>Delsuc</surname> <given-names>F.</given-names></name> <name><surname>Dufayard</surname> <given-names>J. F.</given-names></name> <name><surname>Gascuel</surname> <given-names>O.</given-names></name></person-group> (<year>2009</year>). <article-title>Estimating maximum likelihood phylogenies with PhyML</article-title>. <source>Methods Mol. Biol.</source> <volume>537</volume>, <fpage>113</fpage>&#x02013;<lpage>137</lpage>. <pub-id pub-id-type="doi">10.1007/978-1-59745-251-9_6</pub-id><pub-id pub-id-type="pmid">19378142</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hao</surname> <given-names>W.</given-names></name> <name><surname>Golding</surname> <given-names>G. B.</given-names></name></person-group> (<year>2010</year>). <article-title>Inferring bacterial genome flux while considering truncated genes</article-title>. <source>Genetics</source> <volume>186</volume>, <fpage>411</fpage>&#x02013;<lpage>426</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.110.118448</pub-id><pub-id pub-id-type="pmid">20551435</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hasegawa</surname> <given-names>M.</given-names></name> <name><surname>Kishino</surname> <given-names>H.</given-names></name> <name><surname>Yano</surname> <given-names>T.</given-names></name></person-group> (<year>1985</year>). <article-title>Dating of the human-ape splitting by a molecular clock of mitochondrial DNA</article-title>. <source>J. Mol. Evol.</source> <volume>22</volume>, <fpage>160</fpage>&#x02013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1007/BF02101694</pub-id><pub-id pub-id-type="pmid">3934395</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Henne</surname> <given-names>A.</given-names></name> <name><surname>Bruggemann</surname> <given-names>H.</given-names></name> <name><surname>Raasch</surname> <given-names>C.</given-names></name> <name><surname>Wiezer</surname> <given-names>A.</given-names></name> <name><surname>Hartsch</surname> <given-names>T.</given-names></name> <name><surname>Liesegang</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2004</year>). <article-title>The genome sequence of the extreme thermophile <italic>Thermus thermophilus</italic></article-title>. <source>Nat. Biotechnol.</source> <volume>22</volume>, <fpage>547</fpage>&#x02013;<lpage>553</lpage>. <pub-id pub-id-type="doi">10.1038/nbt956</pub-id><pub-id pub-id-type="pmid">15064768</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hudson</surname> <given-names>J. A.</given-names></name> <name><surname>Morgan</surname> <given-names>H. W.</given-names></name> <name><surname>Daniel</surname> <given-names>R. M.</given-names></name></person-group> (<year>1987</year>). <article-title><italic>Thermus filiformis</italic> sp. nov., a filamentous caldoactive bacterium</article-title>. <source>Int. J. Syst. Bacteriol.</source> <volume>37</volume>, <fpage>431</fpage>&#x02013;<lpage>436</lpage>. <pub-id pub-id-type="doi">10.1099/00207713-37-4-431</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Iqbal</surname> <given-names>N.</given-names></name> <name><surname>Gu&#x000E9;rout</surname> <given-names>A. M.</given-names></name> <name><surname>Krin</surname> <given-names>E.</given-names></name> <name><surname>Le Roux</surname> <given-names>F.</given-names></name> <name><surname>Mazel</surname> <given-names>D.</given-names></name></person-group> (<year>2015</year>). <article-title>Comprehensive functional analysis of the 18 <italic>Vibrio cholerae</italic> N16961 Toxin-Antitoxin systems substantiates their role in stabilizing the superintegron</article-title>. <source>J. Bacteriol.</source> <volume>197</volume>, <fpage>2150</fpage>&#x02013;<lpage>2159</lpage>. <pub-id pub-id-type="doi">10.1128/JB.00108-15</pub-id><pub-id pub-id-type="pmid">25897030</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kelly</surname> <given-names>S. L.</given-names></name> <name><surname>Kelly</surname> <given-names>D. E.</given-names></name></person-group> (<year>2013</year>). <article-title>Microbial cytochromes P450: biodiversity and biotechnology. Where do cytochromes P450 come from, what do they do and what can they do for us?</article-title> <source>Philos. Trans. R. Soc. Lond. B. Biol. Sci.</source> <volume>368</volume>:<fpage>20120476</fpage>. <pub-id pub-id-type="doi">10.1098/rstb.2012.0476</pub-id><pub-id pub-id-type="pmid">23297358</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kerepesi</surname> <given-names>C.</given-names></name> <name><surname>B&#x000E1;nky</surname> <given-names>D.</given-names></name> <name><surname>Grolmusz</surname> <given-names>V.</given-names></name></person-group> (<year>2014</year>). <article-title>AmphoraNet: the webserver implementation of the AMPHORA2 metagenomic workflow suite</article-title>. <source>Gene</source> <volume>533</volume>, <fpage>538</fpage>&#x02013;<lpage>540</lpage>. <pub-id pub-id-type="doi">10.1016/j.gene.2013.10.015</pub-id><pub-id pub-id-type="pmid">24144838</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khorchid</surname> <given-names>A.</given-names></name> <name><surname>Ikura</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>Bacterial histidine kinase as a signal sensor and transducer</article-title>. <source>Int. J. Biochem. Cell. Biol.</source> <volume>38</volume>, <fpage>307</fpage>&#x02013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1016/j.biocel.2005.08.018</pub-id><pub-id pub-id-type="pmid">16242988</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Konstantinidis</surname> <given-names>K. T.</given-names></name> <name><surname>Tiedje</surname> <given-names>J. M.</given-names></name></person-group> (<year>2005</year>). <article-title>Towards a genome-based taxonomy for prokaryotes</article-title>. <source>J. Bacteriol.</source> <volume>187</volume>, <fpage>6258</fpage>&#x02013;<lpage>6264</lpage>. <pub-id pub-id-type="doi">10.1128/JB.187.18.6258-6264.2005</pub-id><pub-id pub-id-type="pmid">16159757</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>K&#x000F6;rner</surname> <given-names>H.</given-names></name> <name><surname>Sofia</surname> <given-names>H. J.</given-names></name> <name><surname>Zumft</surname> <given-names>W. G.</given-names></name></person-group> (<year>2003</year>). <article-title>Phylogeny of the bacterial superfamily of Crp-Fnr transcription regulators: exploiting the metabolic spectrum by controlling alternative gene programs</article-title>. <source>FEMS Microbiol. Rev.</source> <volume>27</volume>, <fpage>559</fpage>&#x02013;<lpage>592</lpage>. <pub-id pub-id-type="doi">10.1016/S0168-6445(03)00066-4</pub-id><pub-id pub-id-type="pmid">14638413</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kretza</surname> <given-names>E.</given-names></name> <name><surname>Papaneophytou</surname> <given-names>C. P.</given-names></name> <name><surname>Papi</surname> <given-names>R. M.</given-names></name> <name><surname>Karidi</surname> <given-names>K.</given-names></name> <name><surname>Kiparissides</surname> <given-names>C.</given-names></name> <name><surname>Kyriakidis</surname> <given-names>D. A.</given-names></name></person-group> (<year>2012</year>). <article-title>Lipase activity in <italic>Thermus thermophilus</italic> HB8: purification and characterization of the extracellular enzyme</article-title>. <source>Biotechnol. Bioprocess Eng.</source> <volume>17</volume>, <fpage>512</fpage>&#x02013;<lpage>525</lpage>. <pub-id pub-id-type="doi">10.1007/s12257-011-0481-0</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kumwenda</surname> <given-names>B.</given-names></name> <name><surname>Litthauer</surname> <given-names>D.</given-names></name> <name><surname>Reva</surname> <given-names>O.</given-names></name></person-group> (<year>2014</year>). <article-title>Analysis of genomic rearrangements, horizontal gene transfer and role of plasmids in the evolution of industrial important <italic>Thermus</italic> species</article-title>. <source>BMC Genomics</source> <volume>15</volume>:<fpage>813</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-15-813</pub-id><pub-id pub-id-type="pmid">25257245</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kurtz</surname> <given-names>S.</given-names></name> <name><surname>Phillippy</surname> <given-names>A.</given-names></name> <name><surname>Delcher</surname> <given-names>A. L.</given-names></name> <name><surname>Smoot</surname> <given-names>M.</given-names></name> <name><surname>Shumway</surname> <given-names>M.</given-names></name> <name><surname>Antonescu</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2004</year>). <article-title>Versatile and open software for comparing large genomes</article-title>. <source>Genome Biol.</source> <volume>5</volume>:<fpage>R12</fpage>. <pub-id pub-id-type="doi">10.1186/gb-2004-5-2-r12</pub-id><pub-id pub-id-type="pmid">14759262</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lagesen</surname> <given-names>K.</given-names></name> <name><surname>Hallin</surname> <given-names>P.</given-names></name> <name><surname>R&#x000F8;dland</surname> <given-names>E. A.</given-names></name> <name><surname>Staerfeldt</surname> <given-names>H. H.</given-names></name> <name><surname>Rognes</surname> <given-names>T.</given-names></name> <name><surname>Ussery</surname> <given-names>D. W.</given-names></name></person-group> (<year>2007</year>). <article-title>RNAmmer: consistent and rapid annotation of ribosomal RNA genes</article-title>. <source>Nucleic Acids Res.</source> <volume>35</volume>, <fpage>3100</fpage>&#x02013;<lpage>3108</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkm160</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lange</surname> <given-names>S. J.</given-names></name> <name><surname>Alkhnbashi</surname> <given-names>O. S.</given-names></name> <name><surname>Rose</surname> <given-names>D.</given-names></name> <name><surname>Will</surname> <given-names>S.</given-names></name> <name><surname>Backofen</surname> <given-names>R.</given-names></name></person-group> (<year>2013</year>). <article-title>CRISPRmap: an automated classification of repeat conservation in prokaryotic adaptive immune systems</article-title>. <source>Nucleic Acids Res.</source> <volume>41</volume>, <fpage>8034</fpage>&#x02013;<lpage>8044</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkt606</pub-id><pub-id pub-id-type="pmid">23863837</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Laslett</surname> <given-names>D.</given-names></name> <name><surname>Canback</surname> <given-names>B.</given-names></name></person-group> (<year>2004</year>). <article-title>ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences</article-title>. <source>Nucleic Acids Res.</source> <volume>32</volume>, <fpage>11</fpage>&#x02013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkh152</pub-id><pub-id pub-id-type="pmid">14704338</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lassmann</surname> <given-names>T.</given-names></name> <name><surname>Sonnhammer</surname> <given-names>E. L. L.</given-names></name></person-group> (<year>2005</year>). <article-title>KAlign - an accurate and fast multiple sequence alignment algorithm</article-title>. <source>BMC Bioinformatics</source> <volume>6</volume>:<fpage>298</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-6-298</pub-id><pub-id pub-id-type="pmid">16343337</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Durbin</surname> <given-names>R.</given-names></name></person-group> (<year>2009</year>). <article-title>Fast and accurate short read alignment with Burrows-Wheeler Transform</article-title>. <source>Bioinformatics</source> <volume>25</volume>, <fpage>1754</fpage>&#x02013;<lpage>1760</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id><pub-id pub-id-type="pmid">19451168</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lorenz</surname> <given-names>M. G.</given-names></name> <name><surname>Wackernagel</surname> <given-names>W.</given-names></name></person-group> (<year>1994</year>). <article-title>Bacterial gene transfer by natural genetic transformation in the environment</article-title>. <source>Microbiol. Rev.</source> <volume>58</volume>, <fpage>563</fpage>&#x02013;<lpage>602</lpage>. <pub-id pub-id-type="pmid">7968924</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Makarova</surname> <given-names>K. S.</given-names></name> <name><surname>Wolf</surname> <given-names>Y. I.</given-names></name> <name><surname>Alkhnbashi</surname> <given-names>O. S.</given-names></name> <name><surname>Costa</surname> <given-names>F.</given-names></name> <name><surname>Shah</surname> <given-names>S. A.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>An updated evolutionary classification of CRISPR-Cas systems</article-title>. <source>Nat. Rev. Microbiol.</source> <volume>13</volume>, <fpage>722</fpage>&#x02013;<lpage>736</lpage>. <pub-id pub-id-type="doi">10.1038/nrmicro3569</pub-id><pub-id pub-id-type="pmid">26411297</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marraffini</surname> <given-names>L. A.</given-names></name> <name><surname>Sontheimer</surname> <given-names>E. J.</given-names></name></person-group> (<year>2008</year>). <article-title>CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA</article-title>. <source>Science</source> <volume>322</volume>, <fpage>1843</fpage>&#x02013;<lpage>1845</lpage>. <pub-id pub-id-type="doi">10.1126/science.1165771</pub-id><pub-id pub-id-type="pmid">19095942</pub-id></citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mefferd</surname> <given-names>C. C.</given-names></name> <name><surname>Zhou</surname> <given-names>E. M.</given-names></name> <name><surname>Yu</surname> <given-names>T. T.</given-names></name> <name><surname>Ming</surname> <given-names>H.</given-names></name> <name><surname>Murugapiran</surname> <given-names>S. K.</given-names></name> <name><surname>Huntemann</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>High-quality draft genomes from <italic>Thermus caliditerrae</italic> YIM 77777 and <italic>T. tengchongensis</italic> YIM 77401, isolates from Tengchong, China</article-title>. <source>Genome Announc.</source> <volume>4</volume>:<fpage>e00312</fpage>-<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1128/genomeA.00312-16</pub-id><pub-id pub-id-type="pmid">27125486</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milne</surname> <given-names>I.</given-names></name> <name><surname>Bayer</surname> <given-names>M.</given-names></name> <name><surname>Cardle</surname> <given-names>L.</given-names></name> <name><surname>Shaw</surname> <given-names>P.</given-names></name> <name><surname>Stephen</surname> <given-names>G.</given-names></name> <name><surname>Wright</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2010</year>). <article-title>Tablet-next generation sequence assembly visualization</article-title>. <source>Bioinformatics</source> <volume>26</volume>, <fpage>401</fpage>&#x02013;<lpage>402</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp666</pub-id><pub-id pub-id-type="pmid">19965881</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ming</surname> <given-names>H.</given-names></name> <name><surname>Yin</surname> <given-names>Y. R.</given-names></name> <name><surname>Li</surname> <given-names>S.</given-names></name> <name><surname>Nie</surname> <given-names>G. X.</given-names></name> <name><surname>Yu</surname> <given-names>T. T.</given-names></name> <name><surname>Zhou</surname> <given-names>E. M.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title><italic>Thermus caliditerrae</italic> sp. nov., a novel thermophilic species isolated from a geothermal area</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>64</volume>, <fpage>650</fpage>&#x02013;<lpage>656</lpage>. <pub-id pub-id-type="doi">10.1099/ijs.0.056838-0</pub-id><pub-id pub-id-type="pmid">24158953</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mojica</surname> <given-names>F. J.</given-names></name> <name><surname>D&#x000ED;ez-Villase&#x000F1;or</surname> <given-names>C.</given-names></name> <name><surname>Garc&#x000ED;a-Mart&#x000ED;nez</surname> <given-names>J.</given-names></name> <name><surname>Soria</surname> <given-names>E.</given-names></name></person-group> (<year>2005</year>). <article-title>Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements</article-title>. <source>J. Mol. Evol.</source> <volume>60</volume>, <fpage>174</fpage>&#x02013;<lpage>182</lpage>. <pub-id pub-id-type="doi">10.1007/s00239-004-0046-3</pub-id><pub-id pub-id-type="pmid">15791728</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Molina-Henares</surname> <given-names>A. J.</given-names></name> <name><surname>Krell</surname> <given-names>T.</given-names></name> <name><surname>Guazzaroni</surname> <given-names>M. E.</given-names></name> <name><surname>Segura</surname> <given-names>A.</given-names></name> <name><surname>Ramos</surname> <given-names>J. L.</given-names></name></person-group> (<year>2005</year>). <article-title>Members of the IclR family of bacterial transcriptional regulators function as activators and/or repressors</article-title>. <source>FEMS Microbiol. Rev.</source> <volume>30</volume>, <fpage>157</fpage>&#x02013;<lpage>186</lpage>. <pub-id pub-id-type="doi">10.1111/j.1574-6976.2005.00008.x</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Murugapiran</surname> <given-names>S. K.</given-names></name> <name><surname>Huntemann</surname> <given-names>M.</given-names></name> <name><surname>Wei</surname> <given-names>C.-L.</given-names></name> <name><surname>Han</surname> <given-names>J.</given-names></name> <name><surname>Detter</surname> <given-names>J. C.</given-names></name> <name><surname>Han</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title><italic>Thermus oshimai</italic> JL-2 and <italic>T. thermophilus</italic> JL-18 genome analysis illuminates pathways for carbon, nitrogen, and sulfur cycling</article-title>. <source>Stand. Genomic Sci.</source> <volume>7</volume>, <fpage>449</fpage>&#x02013;<lpage>468</lpage>. <pub-id pub-id-type="doi">10.4056/sigs.3667269</pub-id><pub-id pub-id-type="pmid">24019992</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nobre</surname> <given-names>M. F.</given-names></name> <name><surname>Tr&#x000FC;per</surname> <given-names>H. G.</given-names></name> <name><surname>da Costa</surname> <given-names>M. S.</given-names></name></person-group> (<year>1996</year>). <article-title>Transfer of <italic>Thermus ruber</italic> (Loginova et al. 1984), <italic>Thermus silvanus</italic> (Tenreiro et al. 1995), and <italic>Thermus chliarophilus</italic> (Tenreiro et al. 1995) to <italic>Meiothermus</italic> gen. nov. as <italic>Meiothermus ruber</italic> comb. nov., <italic>Meiothermus silvanus</italic> comb. nov., and <italic>Meiothermus chliarophilus</italic> comb. nov., respectively, and emendation of the genus <italic>Thermus</italic></article-title>. <source>Int. J. Syst. Bacteriol.</source> <volume>46</volume>, <fpage>604</fpage>&#x02013;<lpage>606</lpage>. <pub-id pub-id-type="doi">10.1099/00207713-46-2-604</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ohtani</surname> <given-names>N.</given-names></name> <name><surname>Tomita</surname> <given-names>M.</given-names></name> <name><surname>Itaya</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>The third plasmid pVV8 from <italic>Thermus thermophilus</italic> HB8: isolation, characterization, and sequence determination</article-title>. <source>Extremophiles</source> <volume>16</volume>, <fpage>237</fpage>&#x02013;<lpage>244</lpage>. <pub-id pub-id-type="doi">10.1007/s00792-011-0424-x</pub-id><pub-id pub-id-type="pmid">22212656</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Okonechnikov</surname> <given-names>K.</given-names></name> <name><surname>Golosova</surname> <given-names>O.</given-names></name> <name><surname>Fursov</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Unipro UGENE: a unified bioinformatics toolkit</article-title>. <source>Bioinformatics</source> <volume>28</volume>, <fpage>1166</fpage>&#x02013;<lpage>1167</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts091</pub-id><pub-id pub-id-type="pmid">22368248</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Opperman</surname> <given-names>D. J.</given-names></name> <name><surname>van Heerden</surname> <given-names>E.</given-names></name></person-group> (<year>2007</year>). <article-title>Aerobic Cr(VI) reduction by <italic>Thermus scotoductus</italic> strain SA-01</article-title>. <source>J. Appl. Microbiol.</source> <volume>103</volume>, <fpage>1907</fpage>&#x02013;<lpage>1913</lpage>. <pub-id pub-id-type="doi">10.1111/j.1365-2672.2007.03429.x</pub-id><pub-id pub-id-type="pmid">17953600</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pa&#x00161;i&#x00107;</surname> <given-names>L.</given-names></name> <name><surname>Rodriguez-Mueller</surname> <given-names>B.</given-names></name> <name><surname>Martin-Cuadrado</surname> <given-names>A. B.</given-names></name> <name><surname>Mira</surname> <given-names>A.</given-names></name> <name><surname>Rohwer</surname> <given-names>F.</given-names></name> <name><surname>Rodriguez-Valera</surname> <given-names>F.</given-names></name></person-group> (<year>2009</year>). <article-title>Metagenomic islands of hyperhalophiles: the case of <italic>Salinibacter ruber</italic></article-title>. <source>BMC Genomics</source> <volume>10</volume>:<fpage>570</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-10-570</pub-id><pub-id pub-id-type="pmid">19951421</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Proudfoot</surname> <given-names>M.</given-names></name> <name><surname>Kuznetsova</surname> <given-names>E.</given-names></name> <name><surname>Brown</surname> <given-names>G.</given-names></name> <name><surname>Rao</surname> <given-names>N. N.</given-names></name> <name><surname>Kitagawa</surname> <given-names>M.</given-names></name> <name><surname>Mori</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2004</year>). <article-title>General enzymatic screens identify three new nucleotidases in <italic>Escherichia coli</italic>. Biochemical characterization of SurE, YfbR, and YjjG</article-title>. <source>J. Biol. Chem.</source> <volume>279</volume>, <fpage>54687</fpage>&#x02013;<lpage>54694</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M411023200</pub-id><pub-id pub-id-type="pmid">15489502</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rao</surname> <given-names>V. B.</given-names></name> <name><surname>Saunders</surname> <given-names>N. B.</given-names></name></person-group> (<year>1992</year>). <article-title>A rapid polymerase-chain-reaction-directed sequencing strategy using a thermostable DNA polymerase from <italic>Thermus flavus</italic></article-title>. <source>Gene</source> <volume>113</volume>, <fpage>17</fpage>&#x02013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1016/0378-1119(92)90665-C</pub-id><pub-id pub-id-type="pmid">1563631</pub-id></citation></ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Richter</surname> <given-names>M.</given-names></name> <name><surname>Rosello-Mora</surname> <given-names>R.</given-names></name></person-group> (<year>2009</year>). <article-title>Shifting the genomic gold standard for the prokaryotic species definition</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>106</volume>, <fpage>19126</fpage>&#x02013;<lpage>19131</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0906412106</pub-id><pub-id pub-id-type="pmid">19855009</pub-id></citation></ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rose</surname> <given-names>I.</given-names></name> <name><surname>Biukovi&#x00107;</surname> <given-names>G.</given-names></name> <name><surname>Aderhold</surname> <given-names>P.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>V.</given-names></name> <name><surname>Gr&#x000FC;ber</surname> <given-names>G.</given-names></name> <name><surname>Averhoff</surname> <given-names>B.</given-names></name></person-group> (<year>2011</year>). <article-title>Identification and characterization of a unique, zinc-containing transport ATPase essential for natural transformation in <italic>Thermus thermophilus</italic> HB27</article-title>. <source>Extremophiles</source> <volume>15</volume>, <fpage>191</fpage>&#x02013;<lpage>202</lpage>. <pub-id pub-id-type="doi">10.1007/s00792-010-0343-2</pub-id><pub-id pub-id-type="pmid">21210168</pub-id></citation></ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roth</surname> <given-names>C.</given-names></name> <name><surname>Rastogi</surname> <given-names>S.</given-names></name> <name><surname>Arvestad</surname> <given-names>L.</given-names></name> <name><surname>Dittmar</surname> <given-names>K.</given-names></name> <name><surname>Light</surname> <given-names>S.</given-names></name> <name><surname>Ekman</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2007</year>). <article-title>Evolution after gene duplication: models, mechanisms, sequences, systems, and organisms</article-title>. <source>J. Exp. Zool.</source> <volume>308B</volume>, <fpage>58</fpage>&#x02013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1002/jez.b.21124</pub-id></citation></ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rouli</surname> <given-names>L.</given-names></name> <name><surname>Merhej</surname> <given-names>V.</given-names></name> <name><surname>Fournier</surname> <given-names>P. E.</given-names></name> <name><surname>Raoult</surname> <given-names>D.</given-names></name></person-group> (<year>2015</year>). <article-title>The bacterial pangenome as a new tool for analysing pathogenic bacteria</article-title>. <source>New Microbes New Infect.</source> <volume>7</volume>, <fpage>72</fpage>&#x02013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1016/j.nmni.2015.06.005</pub-id><pub-id pub-id-type="pmid">26442149</pub-id></citation></ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rowe-Magnus</surname> <given-names>D. A.</given-names></name> <name><surname>Guerout</surname> <given-names>A. M.</given-names></name> <name><surname>Biskri</surname> <given-names>L.</given-names></name> <name><surname>Bouige</surname> <given-names>P.</given-names></name> <name><surname>Mazel</surname> <given-names>D.</given-names></name></person-group> (<year>2003</year>). <article-title>Comparative analysis of superintegrons: engineering extensive genetic diversity in the Vibrionaceae</article-title>. <source>Genome Res.</source> <volume>13</volume>, <fpage>428</fpage>&#x02013;<lpage>442</lpage>. <pub-id pub-id-type="doi">10.1101/gr.617103</pub-id><pub-id pub-id-type="pmid">12618374</pub-id></citation></ref>
<ref id="B73">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sabath</surname> <given-names>N.</given-names></name> <name><surname>Ferrada</surname> <given-names>E.</given-names></name> <name><surname>Barve</surname> <given-names>A.</given-names></name> <name><surname>Wagner</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>Growth temperature and genome size in bacteria are negatively correlated, suggesting genomic streamlining during thermal adaptation</article-title>. <source>Genome Biol. Evol.</source> <volume>5</volume>, <fpage>966</fpage>&#x02013;<lpage>977</lpage>. <pub-id pub-id-type="doi">10.1093/gbe/evt050</pub-id><pub-id pub-id-type="pmid">23563968</pub-id></citation></ref>
<ref id="B74">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salzer</surname> <given-names>R.</given-names></name> <name><surname>Kern</surname> <given-names>T.</given-names></name> <name><surname>Joos</surname> <given-names>F.</given-names></name> <name><surname>Averhoff</surname> <given-names>B.</given-names></name></person-group> (<year>2016</year>). <article-title>The <italic>Thermus thermophilus comEA/comEC</italic> operon is associated with DNA binding and regulation of the DNA translocator and type IV pili</article-title>. <source>Environ. Microbiol.</source> <volume>18</volume>, <fpage>65</fpage>&#x02013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1111/1462-2920.12820</pub-id><pub-id pub-id-type="pmid">25727469</pub-id></citation></ref>
<ref id="B75">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sangwan</surname> <given-names>N.</given-names></name> <name><surname>Lambert</surname> <given-names>C.</given-names></name> <name><surname>Sharma</surname> <given-names>A.</given-names></name> <name><surname>Gupta</surname> <given-names>V.</given-names></name> <name><surname>Khurana</surname> <given-names>P.</given-names></name> <name><surname>Khurana</surname> <given-names>J. P.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Arsenic rich Himalayan hot spring metagenomics reveal genetically novel predator-prey genotypes</article-title>. <source>Environ. Microbiol. Rep.</source> <volume>7</volume>, <fpage>812</fpage>&#x02013;<lpage>823</lpage>. <pub-id pub-id-type="doi">10.1111/1758-2229.12297</pub-id><pub-id pub-id-type="pmid">25953741</pub-id></citation></ref>
<ref id="B76">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwarzenlander</surname> <given-names>C.</given-names></name> <name><surname>Haase</surname> <given-names>W.</given-names></name> <name><surname>Averhoff</surname> <given-names>B.</given-names></name></person-group> (<year>2009</year>). <article-title>The role of single subunits of the DNA transport machinery of <italic>Thermus thermophilus</italic> HB27 in DNA binding and transport</article-title>. <source>Environ. Microbiol.</source> <volume>11</volume>, <fpage>801</fpage>&#x02013;<lpage>808</lpage>. <pub-id pub-id-type="doi">10.1111/j.1462-2920.2008.01801.x</pub-id><pub-id pub-id-type="pmid">19396940</pub-id></citation></ref>
<ref id="B77">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Segata</surname> <given-names>N.</given-names></name> <name><surname>B&#x000F6;rnigen</surname> <given-names>D.</given-names></name> <name><surname>Morgan</surname> <given-names>X. C.</given-names></name> <name><surname>Huttenhower</surname> <given-names>C.</given-names></name></person-group> (<year>2013</year>). <article-title>PhyloPhlAn is a new method for improved phylogenetic and taxonomic placement of microbes</article-title>. <source>Nat. Commun.</source> <volume>4</volume>:<fpage>2304</fpage>. <pub-id pub-id-type="doi">10.1038/ncomms3304</pub-id><pub-id pub-id-type="pmid">23942190</pub-id></citation></ref>
<ref id="B78">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seitz</surname> <given-names>P.</given-names></name> <name><surname>Blokesch</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>Cues and regulatory pathways involved in natural competence and transformation in pathogenic and environmental Gram-negative bacteria</article-title>. <source>FEMS Microbiol. Rev.</source> <volume>37</volume>, <fpage>336</fpage>&#x02013;<lpage>363</lpage>. <pub-id pub-id-type="doi">10.1111/j.1574-6976.2012.00353.x</pub-id><pub-id pub-id-type="pmid">22928673</pub-id></citation></ref>
<ref id="B79">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shaw</surname> <given-names>J. F.</given-names></name> <name><surname>Lin</surname> <given-names>F. P.</given-names></name> <name><surname>Chen</surname> <given-names>S. C.</given-names></name> <name><surname>Chen</surname> <given-names>H. C.</given-names></name></person-group> (<year>1995</year>). <article-title>Purification and properties of an extracellular &#x003B1;-amylase from <italic>Thermus</italic> sp</article-title>. <source>Bot. Bull. Acad. Sin.</source> <volume>36</volume>, <fpage>195</fpage>&#x02013;<lpage>200</lpage>.</citation></ref>
<ref id="B80">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steffen</surname> <given-names>M. M.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name> <name><surname>Effler</surname> <given-names>T. C.</given-names></name> <name><surname>Hauser</surname> <given-names>L. J.</given-names></name> <name><surname>Boyer</surname> <given-names>G. L.</given-names></name> <name><surname>Wilhelm</surname> <given-names>S. W.</given-names></name></person-group> (<year>2012</year>). <article-title>Comparative metagenomics of toxic freshwater cyanobacteria bloom communities on two continents</article-title>. <source>PLoS ONE</source> <volume>7</volume>:<fpage>e44002</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0044002</pub-id><pub-id pub-id-type="pmid">22952848</pub-id></citation></ref>
<ref id="B81">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tamakoshi</surname> <given-names>M.</given-names></name> <name><surname>Murakami</surname> <given-names>A.</given-names></name> <name><surname>Sugisawa</surname> <given-names>M.</given-names></name> <name><surname>Tsuneizumi</surname> <given-names>K.</given-names></name> <name><surname>Takeda</surname> <given-names>S.</given-names></name> <name><surname>Saheki</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Genomic and proteomic characterization of the large Myoviridae bacteriophage &#x003A6;TMA of the extreme thermophile <italic>Thermus thermophilus</italic></article-title>. <source>Bacteriophage</source> <volume>1</volume>, <fpage>152</fpage>&#x02013;<lpage>164</lpage>. <pub-id pub-id-type="doi">10.4161/bact.1.3.16712</pub-id><pub-id pub-id-type="pmid">22164349</pub-id></citation></ref>
<ref id="B82">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tamura</surname> <given-names>K.</given-names></name> <name><surname>Stecher</surname> <given-names>G.</given-names></name> <name><surname>Peterson</surname> <given-names>D.</given-names></name> <name><surname>Filipski</surname> <given-names>A.</given-names></name> <name><surname>Kumar</surname> <given-names>S.</given-names></name></person-group> (<year>2013</year>). <article-title>MEGA6: molecular evolutionary genetics analysis version 6.0</article-title>. <source>Mol. Biol. Evol.</source> <volume>30</volume>, <fpage>2725</fpage>&#x02013;<lpage>2729</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/mst197</pub-id><pub-id pub-id-type="pmid">24132122</pub-id></citation></ref>
<ref id="B83">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Teh</surname> <given-names>B. S.</given-names></name> <name><surname>Rahman</surname> <given-names>A. Y. A.</given-names></name> <name><surname>Saito</surname> <given-names>J. A.</given-names></name> <name><surname>Hou</surname> <given-names>S.</given-names></name> <name><surname>Alam</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Complete genome sequence of the thermophilic bacterium <italic>Thermus</italic> sp. strain CCB_US3_UF1</article-title>. <source>J. Bacteriol.</source> <volume>194</volume>, <fpage>1240</fpage>. <pub-id pub-id-type="doi">10.1128/JB.06589-11</pub-id><pub-id pub-id-type="pmid">22328745</pub-id></citation></ref>
<ref id="B84">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tettelin</surname> <given-names>H.</given-names></name> <name><surname>Masignani</surname> <given-names>V.</given-names></name> <name><surname>Cieslewicz</surname> <given-names>M. J.</given-names></name> <name><surname>Donati</surname> <given-names>C.</given-names></name> <name><surname>Medini</surname> <given-names>D.</given-names></name> <name><surname>Ward</surname> <given-names>N. L.</given-names></name> <etal/></person-group>. (<year>2005</year>). <article-title>Genome analysis of multiple pathogenic isolates of <italic>Streptococcus agalactiae</italic>: implications for the microbial &#x0201C;pan-genome.&#x0201D;</article-title> <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>102</volume>, <fpage>13950</fpage>&#x02013;<lpage>13955</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0506758102</pub-id><pub-id pub-id-type="pmid">16172379</pub-id></citation></ref>
<ref id="B85">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Touchon</surname> <given-names>M.</given-names></name> <name><surname>de Sousa</surname> <given-names>J. A. M.</given-names></name> <name><surname>Rocha</surname> <given-names>E. P. C.</given-names></name></person-group> (<year>2017</year>). <article-title>Embracing the enemy: the diversification of microbial gene repertoires by phage-mediated horizontal gene transfer</article-title>. <source>Curr. Opin. Microbiol.</source> <volume>38</volume>, <fpage>66</fpage>&#x02013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1016/j.mib.2017.04.010</pub-id><pub-id pub-id-type="pmid">28527384</pub-id></citation></ref>
<ref id="B86">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vernikos</surname> <given-names>G. S.</given-names></name> <name><surname>Parkhill</surname> <given-names>J.</given-names></name></person-group> (<year>2006</year>). <article-title>Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the <italic>Salmonella</italic> pathogenicity islands</article-title>. <source>Bioinformatics</source> <volume>22</volume>, <fpage>2196</fpage>&#x02013;<lpage>2203</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btl369</pub-id><pub-id pub-id-type="pmid">16837528</pub-id></citation></ref>
<ref id="B87">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weinberger</surname> <given-names>A. D.</given-names></name> <name><surname>Sun</surname> <given-names>C. L.</given-names></name> <name><surname>Plucinski</surname> <given-names>M. M.</given-names></name> <name><surname>Denef</surname> <given-names>V. J.</given-names></name> <name><surname>Thomas</surname> <given-names>B. C.</given-names></name> <name><surname>Horvath</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2012a</year>). <article-title>Persisting viral sequences shape microbial CRISPR-based immunity</article-title>. <source>PLoS Comput. Biol.</source> <volume>8</volume>:<fpage>e1002475</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002475</pub-id><pub-id pub-id-type="pmid">22532794</pub-id></citation></ref>
<ref id="B88">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weinberger</surname> <given-names>A. D.</given-names></name> <name><surname>Wolf</surname> <given-names>Y. I.</given-names></name> <name><surname>Lobkovsky</surname> <given-names>A. E.</given-names></name> <name><surname>Gilmore</surname> <given-names>M. S.</given-names></name> <name><surname>Koonin</surname> <given-names>E. V.</given-names></name></person-group> (<year>2012b</year>). <article-title>Viral diversity threshold for adaptive immunity in prokaryotes</article-title>. <source>mBio</source> <volume>3</volume>:<fpage>e00456</fpage>-<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1128/mBio.00456-12</pub-id><pub-id pub-id-type="pmid">23221803</pub-id></citation></ref>
<ref id="B89">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weissbach</surname> <given-names>H.</given-names></name> <name><surname>Etienne</surname> <given-names>F.</given-names></name> <name><surname>Hoshi</surname> <given-names>T.</given-names></name> <name><surname>Heinemann</surname> <given-names>S. H.</given-names></name> <name><surname>Lowther</surname> <given-names>W. T.</given-names></name> <name><surname>Matthews</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2002</year>). <article-title>Peptide methionine sulfoxide reductase: structure, mechanism of action, and biological function</article-title>. <source>Arch. Biochem. Biophys.</source> <volume>397</volume>, <fpage>172</fpage>&#x02013;<lpage>178</lpage>. <pub-id pub-id-type="doi">10.1006/abbi.2001.2664</pub-id><pub-id pub-id-type="pmid">11795868</pub-id></citation></ref>
<ref id="B90">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wheeler</surname> <given-names>D.</given-names></name> <name><surname>Bhagwat</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <article-title>BLAST QuickStart</article-title>, in <source>Comparative Genomics</source>, <volume>Vol. 1, 2</volume> Edn., ed <person-group person-group-type="editor"><name><surname>Bergman</surname> <given-names>N. H.</given-names></name></person-group> (<publisher-loc>Totowa, NJ</publisher-loc>: <publisher-name>Humana Press</publisher-name>).</citation></ref>
<ref id="B91">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>S.</given-names></name> <name><surname>Zhu</surname> <given-names>Z.</given-names></name> <name><surname>Fu</surname> <given-names>L.</given-names></name> <name><surname>Niu</surname> <given-names>B.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name></person-group> (<year>2011</year>). <article-title>WebMGA: a customizable web server for fast metagenomic sequence analysis</article-title>. <source>BMC Genomics</source> <volume>12</volume>:<fpage>444</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-12-444</pub-id><pub-id pub-id-type="pmid">21899761</pub-id></citation></ref>
<ref id="B92">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>H. C.</given-names></name> <name><surname>Rosen</surname> <given-names>B. P.</given-names></name></person-group> (<year>2016</year>). <article-title>New mechanisms of bacterial arsenic resistance</article-title>. <source>Biomed. J.</source> <volume>39</volume>, <fpage>5</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1016/j.bj.2015.08.003</pub-id><pub-id pub-id-type="pmid">27105594</pub-id></citation></ref>
<ref id="B93">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>M. X.</given-names></name> <name><surname>Slater</surname> <given-names>M. R.</given-names></name> <name><surname>Ackermann</surname> <given-names>H. W.</given-names></name></person-group> (<year>2006</year>). <article-title>Isolation and characterization of <italic>Thermus</italic> bacteriophages</article-title>. <source>Arch. Virol.</source> <volume>151</volume>, <fpage>663</fpage>. <pub-id pub-id-type="doi">10.1007/s00705-005-0667-x</pub-id><pub-id pub-id-type="pmid">16308675</pub-id></citation></ref>
<ref id="B94">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>T. T.</given-names></name> <name><surname>Ming</surname> <given-names>H.</given-names></name> <name><surname>Yao</surname> <given-names>J. C.</given-names></name> <name><surname>Zhou</surname> <given-names>E. M.</given-names></name> <name><surname>Park</surname> <given-names>D. J.</given-names></name> <name><surname>Hozzein</surname> <given-names>W. N.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title><italic>Thermus amyloliquefaciens</italic> sp. nov., isolated from a hot spring sediment sample</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>65</volume>, <fpage>2491</fpage>&#x02013;<lpage>2495</lpage>. <pub-id pub-id-type="doi">10.1099/ijs.0.000289</pub-id><pub-id pub-id-type="pmid">25920724</pub-id></citation></ref>
<ref id="B95">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Q.</given-names></name> <name><surname>Ye</surname> <given-names>Y.</given-names></name></person-group> (<year>2017</year>). <article-title>Not all predicted CRISPR-Cas systems are equal: isolated <italic>cas</italic> genes and classes of CRISPR like elements</article-title>. <source>BMC Bioinformatics</source> <volume>18</volume>:<fpage>92</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-017-1512-4</pub-id><pub-id pub-id-type="pmid">28166719</pub-id></citation></ref>
<ref id="B96">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>A.</given-names></name> <name><surname>Chen</surname> <given-names>Y. I.</given-names></name> <name><surname>Zane</surname> <given-names>G. M.</given-names></name> <name><surname>He</surname> <given-names>Z.</given-names></name> <name><surname>Hemme</surname> <given-names>C. L.</given-names></name> <name><surname>Joachimiak</surname> <given-names>M. P.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Functional characterization of Crp/Fnr-type global transcriptional regulators in <italic>Desulfovibrio vulgaris</italic> Hildenborough</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>78</volume>, <fpage>1168</fpage>&#x02013;<lpage>1177</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.05666-11</pub-id><pub-id pub-id-type="pmid">22156435</pub-id></citation></ref>
<ref id="B97">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>E. M.</given-names></name> <name><surname>Murugapiran</surname> <given-names>S. K.</given-names></name> <name><surname>Mefferd</surname> <given-names>C. C.</given-names></name> <name><surname>Liu</surname> <given-names>L.</given-names></name> <name><surname>Xian</surname> <given-names>W. D.</given-names></name> <name><surname>Yin</surname> <given-names>Y. R.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>High-quality draft genome sequence of <italic>Thermus amyloliquefaciens</italic> type strain YIM 77409<sup>T</sup> with an incomplete denitrification pathway</article-title>. <source>Stand. Genomic Sci.</source> <volume>11</volume>:<fpage>20</fpage>. <pub-id pub-id-type="doi">10.1186/s40793-016-0140-3</pub-id><pub-id pub-id-type="pmid">26925197</pub-id></citation></ref>
<ref id="B98">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Liang</surname> <given-names>Y.</given-names></name> <name><surname>Lynch</surname> <given-names>K.</given-names></name> <name><surname>Dennis</surname> <given-names>J. J.</given-names></name> <name><surname>Wishart</surname> <given-names>D. S.</given-names></name></person-group> (<year>2011</year>). <article-title>PHAST: a fast phage search tool</article-title>. <source>Nucleic Acids Res.</source> <volume>39</volume>, <fpage>W347</fpage>&#x02013;<lpage>W352</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkr485</pub-id><pub-id pub-id-type="pmid">21672955</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was supported by grants from the Department of Biotechnology, Government of India (Grant no. BT/PR15118/BCE/8/1141/2015), and the National Bureau of Agriculturally Important Microorganisms (Grant no. NBAIM/AMAAS/2014-17/PF/9).</p>
</fn>
</fn-group>
</back>
</article>