<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2023.1104456</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Advantages of long- and short-reads sequencing for the hybrid investigation of the <italic>Mycobacterium tuberculosis</italic> genome</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name>
<surname>Di Marco</surname>
<given-names>Federico</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2107955/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Spitaleri</surname>
<given-names>Andrea</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff3" ref-type="aff"><sup>3</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/285897/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Battaglia</surname>
<given-names>Simone</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author"><name>
<surname>Batignani</surname>
<given-names>Virginia</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2154560/overview"/>
</contrib>
<contrib contrib-type="author"><name>
<surname>Cabibbe</surname>
<given-names>Andrea Maurizio</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/489173/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes"><name>
<surname>Cirillo</surname>
<given-names>Daniela Maria</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1595231/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Emerging Bacterial Pathogens Unit, IRCCS San Raffaele Scientific Institute</institution>, <addr-line>Milan</addr-line>, <country>Italy</country></aff>
<aff id="aff2"><sup>2</sup><institution>Fondazione Centro San Raffaele</institution>, <addr-line>Milan</addr-line>, <country>Italy</country></aff>
<aff id="aff3"><sup>3</sup><institution>Universit&#x00E0; Vita Salute San Raffaele</institution>, <addr-line>Milan</addr-line>, <country>Italy</country></aff>
<author-notes>
<fn id="fn0001" fn-type="edited-by">
<p>Edited by: Davood Darban-Sarokhalil, Iran University of Medical Sciences, Iran</p>
</fn>
<fn id="fn0002" fn-type="edited-by">
<p>Reviewed by: Alessandro Atzeni, Rovira i Virgili University, Spain; Tianfeng He, Ningbo Municipal Center for Disease Control and Prevention, China; Yanlin Zhao, Chinese Center For Disease Control and Prevention, China</p>
</fn>
<corresp id="c001">&#x002A;Correspondence: Daniela Maria Cirillo, <email>cirillo.daniela@hsr.it</email></corresp>
<fn id="fn0003" fn-type="other">
<p>This article was submitted to Infectious Agents and Disease, a section of the journal Frontiers in Microbiology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>02</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1104456</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>11</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>01</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2023 Di Marco, Spitaleri, Battaglia, Batignani, Cabibbe and Cirillo.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Di Marco, Spitaleri, Battaglia, Batignani, Cabibbe and Cirillo</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>In the fight to limit the global spread of antibiotic resistance, computational challenges associated with sequencing technology can impact the accuracy of downstream analysis, including drug resistance identification, transmission, and genome resolution. About 10% of the <italic>Mycobacterium tuberculosis</italic> (MTB) genome consists of the PE/PPE family, a GC-rich repetitive genome region. Although short-read sequencing is widely used, its limitations in the PE/PPE regions are well recognized, because reads from these regions cannot be mapped unambiguously onto the reference genome. The aim of this study was to compare the performance of short-reads (SRS), long-reads (LRS) and hybrid-reads (HYBR) based analyses across common investigative tasks: genome coverage estimation, variant calling and cluster analysis, drug resistance detection and de novo assembly.</p>
</sec>
<sec>
<title>Methods</title>
<p>For the study, 13 model MTB clinical isolates were sequenced with both SRS and LRS. HYBR were produced by correcting the long reads with the short reads. The fastq files from the three approaches were then processed using a customized version of MTBseq for genome coverage estimation and variant calling, and using two different assemblers for de novo assembly evaluation.</p>
</sec>
<sec>
<title>Results</title>
<p>Genome coverage estimation showed lower 8X breadth coverage for SRS than for LRS and HYBR: considering the PE/PPE genes, SRS performed poorly for the PE_PGRS family while achieving acceptable coverage in PE and PPE genes; LRS and HYBR reached optimal coverage in PE/PPE genes. For variant calling, HYBR showed the highest resolution, detecting the highest percentage of uniquely identified mutations compared to LRS and SRS. All three approaches agreed on the identification of two major clusters, with HYBR identifying a higher number of SNPs between the two clusters. Comparing the quality of the assemblies, HYBR and LRS obtained better results than SRS.</p>
</sec>
<sec>
<title>Discussion</title>
<p>In conclusion, SRS and LRS present complementary advantages and limitations depending on the aim of the investigation. For a full resolution of MTB genomes, where all the mentioned analyses and both technologies are needed, the HYBR approach represents a valid and well-rounded strategy.</p>
</sec>
</abstract>
<kwd-group>
<kwd>next-generation sequencing</kwd>
<kwd>hybrid approach</kwd>
<kwd>long reads</kwd>
<kwd>drug resistance</kwd>
<kwd><italic>Mycobacterium tuberculosis</italic></kwd>
<kwd>transmission analysis</kwd>
<kwd>repetitive regions</kwd>
</kwd-group>
<counts>
<fig-count count="6"/>
<table-count count="1"/>
<equation-count count="0"/>
<ref-count count="44"/>
<page-count count="9"/>
<word-count count="6052"/>
</counts>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<title>1. Introduction</title>
<p>Next-generation sequencing (NGS) technologies play a fundamental role in studying microbial genomes (<xref ref-type="bibr" rid="ref19">K&#x00F6;ser et al., 2014</xref>). Nowadays, whole-genome sequencing (WGS) of pathogens and viruses is routinely exploited in epidemiological outbreak analysis (<xref ref-type="bibr" rid="ref9">Ferdinand et al., 2021</xref>), to identify and characterize bacterial pathogens and transmission chains. Recently, WGS has emerged as a powerful tool that could help in the battle against the spread of antibiotic resistance in different species (<xref ref-type="bibr" rid="ref10">Gladstone et al., 2021</xref>). Tuberculosis still constitutes one of the most serious threats to human health, killing nearly 1.5 million people per year (<xref ref-type="bibr" rid="ref42">World Health Organization, 2021a</xref>). The higher accuracy of short-reads technology (SRS), such as Illumina, together with the use of a catalog of <italic>Mycobacterium tuberculosis</italic> (MTB) mutations to interpret drug resistance determinants, has significantly improved the interpretation of clinical genomes (<xref ref-type="bibr" rid="ref8">Ektefaie et al., 2021</xref>; <xref ref-type="bibr" rid="ref38">Walker et al., 2022</xref>). The same technology has been used to investigate tuberculosis outbreaks and transmission dynamics by adopting whole-genome SNP (wgSNP) or core genome Multi-Locus Sequence Typing (cgMLST) schemes to assess the genetic relatedness of MTB genomes (<xref ref-type="bibr" rid="ref16">Kohl et al., 2014</xref>, <xref ref-type="bibr" rid="ref17">2018</xref>). 
However, short-read technologies cannot fully resolve hard-to-sequence regions, because they have a suboptimal capacity to reliably resolve large structural variations, gene duplications, or variations in repetitive regions (<xref ref-type="bibr" rid="ref26">Modlin et al., 2021</xref>), thereby reducing coverage depth and leading to incomplete characterization in terms of drug resistance, virulence, and transmission analysis (<xref ref-type="bibr" rid="ref25">Medha et al., 2021</xref>; <xref ref-type="bibr" rid="ref23">Marin et al., 2022</xref>). Accurately resolving such regions becomes critical to close bacterial genomes and to obtain more information about virulence, evolutionary mechanisms of drug resistance, and strain relatedness. The availability of long reads (LRS) from third-generation sequencing technologies, e.g., Oxford Nanopore Technologies (ONT) or PacBio, can improve the resolution of bacterial genomes at the level of gene rearrangements, repetitive regions (proline-glutamate/proline-proline-glutamate, PE/PPE), and long insertions/deletions (InDels), usually neglected by short-read sequencing (SRS) due to their low-complexity nature. Notably, ONT sequencers are portable, robust, and low in capital cost, and could conceivably be used to conduct WGS analysis rapidly. Recently, different bioinformatic pipelines have been developed to combine the advantages of SRS and LRS in a single approach (<xref ref-type="bibr" rid="ref37">Walker et al., 2014</xref>; <xref ref-type="bibr" rid="ref41">Wick et al., 2017</xref>, <xref ref-type="bibr" rid="ref40">2021b</xref>). The procedure usually involves first using SRS for <italic>de novo</italic> assembly and then LRS to bridge the ambiguous regions, relying mostly on the SRS steps. In this work, we aim to compare the performance of the SRS, LRS, and hybrid approaches on clustered MTB clinical isolates that are resistant to first- and second-line drugs. 
For this purpose, we implemented the use of &#x201C;hybrid reads&#x201D; (HYBR), in which we first corrected the long reads with highly accurate short reads and then used them as input for the downstream analysis, including identification of mutations, drug resistance prediction, transmission analysis, <italic>de novo</italic> genome assembly, and overall genome coverage. Our reverse HYBR approach outperforms the standard hybrid pipeline. Moreover, we aimed to characterize the repetitive regions of the genome, including PPE and PE genes, which are normally neglected in SRS analysis. The outcome of this analysis indicates that PE and PPE genes, except PE_PGRS, can be included in the SRS analysis at the cost of increasing the sequencing depth. The study was performed using a subset of <italic>M. tuberculosis</italic> strains previously characterized in our laboratory (<xref ref-type="bibr" rid="ref27">Mustazzolu et al., 2018</xref>; <xref ref-type="bibr" rid="ref1">Abascal et al., 2020</xref>; <xref ref-type="bibr" rid="ref36">Villa et al., 2021</xref>).</p>
</sec>
<sec id="sec2" sec-type="materials|methods">
<title>2. Materials and methods</title>
<sec id="sec3">
<title>2.1. Strain selection</title>
<p>We sequenced 13 &#x201C;model&#x201D; MTB clinical isolates with the two platforms (Illumina and ONT) and performed the bioinformatic analysis with the three pipelines (SRS, LRS, and HYBR). The isolates were selected for being resistant to several drugs and for belonging to clusters (<xref ref-type="bibr" rid="ref27">Mustazzolu et al., 2018</xref>; <xref ref-type="bibr" rid="ref1">Abascal et al., 2020</xref>; <xref ref-type="bibr" rid="ref36">Villa et al., 2021</xref>). The characteristics of the isolates are reported in <xref rid="tab1" ref-type="table">Table 1</xref>. Our choice was aimed at assessing whether LRS was accurate enough to perform standard analyses, including variant calling and cluster characterization, on epidemiologically linked strains carrying multiple resistance-conferring mutations. The first cluster comprises pre-XDR strains and the second MDR strains.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption>
<p>Isolate characteristics.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="middle">Isolate</th>
<th align="center" valign="middle">Cluster</th>
<th align="center" valign="middle">Year of collection</th>
<th align="center" valign="middle">Lineage</th>
<th align="left" valign="middle">Resistance profile</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">IT1708</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2019</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT1365</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2018</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT645</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2017</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT1748</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2020</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT696</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2018</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT1508</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2019</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT1745</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2020</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT1313</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2018</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT1428</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2018</td>
<td align="char" valign="top" char=".">4.8</td>
<td align="left" valign="top">Pre-XDR</td>
</tr>
<tr>
<td align="left" valign="top">IT491</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">2009</td>
<td align="char" valign="top" char=".">4.3.3</td>
<td align="left" valign="top">MDR</td>
</tr>
<tr>
<td align="left" valign="top">MGIT84</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">2016</td>
<td align="char" valign="top" char=".">4.3.3</td>
<td align="left" valign="top">MDR</td>
</tr>
<tr>
<td align="left" valign="top">IT318</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">2010</td>
<td align="char" valign="top" char=".">4.3.3</td>
<td align="left" valign="top">MDR</td>
</tr>
<tr>
<td align="left" valign="top">IT650</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">2017</td>
<td align="char" valign="top" char=".">4.3.3</td>
<td align="left" valign="top">MDR</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Lineage called using MTBseq pipeline (<xref ref-type="bibr" rid="ref17">Kohl et al., 2018</xref>). Resistance profile according to WHO classification.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="sec4">
<title>2.2. DNA extraction</title>
<p>All the strains were cultured in Middlebrook 7H9 broth. DNA was extracted using the Maxwell 16 Cell DNA Purification kit (Promega) for Illumina sequencing and the Zymo Genomic DNA Clean &#x0026; Concentrator&#x2122; (D4010, D4011) kit for ONT sequencing.</p>
</sec>
<sec id="sec5">
<title>2.3. Oxford Nanopore Technologies and Illumina library preparation and sequencing</title>
<p>Long-reads sequencing was performed with the MinION Mk1B platform (Oxford Nanopore Technologies, Oxford, United Kingdom) with a FLO-MIN106 R9.4.1 flow cell, using the Rapid Barcoding Kit (SQK-RBK004) for library preparation. Short-reads sequencing was performed on NextSeq 500 and MiniSeq platforms (Illumina Inc., San Diego, CA, United States) with paired-end Nextera XT library preparation following the manufacturer&#x2019;s instructions.</p>
</sec>
<sec id="sec6">
<title>2.4. Short-reads, long-reads, and hybrid-reads data analysis</title>
<p>A graphical description of the analysis workflow is presented in <xref rid="fig1" ref-type="fig">Figure 1</xref>: our HYBR approach is presented in red, while the SRS and the LRS are in yellow and blue, respectively. Raw fast5 files were basecalled using Guppy v5 to obtain LRS fastq files. The quality of the sequencing was assessed using NanoPlot v1.34.0 (<xref ref-type="bibr" rid="ref5">de Coster et al., 2018</xref>). The HYBR approach consisted first in the correction of the long reads with short reads using Ratatosk v0.4 (<xref ref-type="bibr" rid="ref13">Holley et al., 2021</xref>) to obtain the corrected hybrid reads. Mapping on the H37Rv genome (NCBI accession number: NC_000962.3) was performed using the BWA mem algorithm v0.7.17 (<xref ref-type="bibr" rid="ref24">Md et al., 2019</xref>) for SRS and the minimap2 algorithm v2.24 (<xref ref-type="bibr" rid="ref21">Li, 2018</xref>) for LRS and hybrid reads. The .bam files obtained by the mapping were then processed using the MTBseq v1.0.3 (<xref ref-type="bibr" rid="ref17">Kohl et al., 2018</xref>) pipeline starting from the TBlist step, using default parameters except for the <italic>minphred20</italic> and <italic>minbqual</italic> options, set to 0 and 4, respectively, and including the repetitive regions in the analysis. Distance matrices built on the unambiguously called positions in the MTBseq pipeline were used to generate transmission trees using the software GrapeTree v2.2 (<xref ref-type="bibr" rid="ref44">Zhou et al., 2018</xref>), and samples with a distance lower than 5 SNPs were classified as closely genetically related.</p>
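<p>The relatedness rule described above can be illustrated with a minimal Python sketch. It is not the MTBseq or GrapeTree implementation: the function name cluster_by_snp_distance, the single-linkage grouping, and the toy isolate labels and distances are assumptions introduced here for illustration only.</p>
<preformat><![CDATA[
```python
def cluster_by_snp_distance(distances, threshold=5):
    """Single-linkage grouping: two samples end up in the same group when
    they are connected by a chain of pairwise distances below `threshold`.

    `distances` maps (sample_a, sample_b) tuples to SNP distances.
    """
    samples = sorted({s for pair in distances for s in pair})
    parent = {s: s for s in samples}

    def find(s):
        # Follow parent links to the group representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), d in distances.items():
        if d < threshold:  # "distance lower than 5 SNPs" in the text
            parent[find(a)] = find(b)

    groups = {}
    for s in samples:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Hypothetical pairwise SNP distances between four toy isolates.
toy_distances = {
    ("A", "B"): 2, ("B", "C"): 4, ("A", "C"): 6,
    ("A", "D"): 30, ("B", "D"): 31, ("C", "D"): 29,
}
print(cluster_by_snp_distance(toy_distances))
```
]]></preformat>
<p>With these toy distances, isolates A, B, and C are linked through pairwise distances below the threshold, while D remains on its own. The threshold is left as a parameter because the cut-off is a convention of the analysis rather than a property of the data.</p>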
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption>
<p>Bioinformatic analysis scheme. SRS pipeline workflow in yellow, LRS in blue, and HYBR in red. After acquisition of the short- and long-read sequencing data, hybrid long reads are produced. The analysis tasks were then performed in parallel starting from each of the three types of reads.</p>
</caption>
<graphic xlink:href="fmicb-14-1104456-g001.tif"/>
</fig>
<p>The H37Rv reference genome was divided into consecutive regions of 1,000&#x2009;bp, and breadth coverage (defined as the percentage of genome bases sequenced at a given sequencing depth) at 8x depth was evaluated using mosdepth v0.3.1 (<xref ref-type="bibr" rid="ref28">Pedersen and Quinlan, 2018</xref>). Median breadth coverage was plotted using Circos v0.69.8 (<xref ref-type="bibr" rid="ref20">Krzywinski et al., 2009</xref>). One hundred and sixty-nine PE/PPE regions were also investigated. Coordinates for the repetitive regions were retrieved from Mycobrowser (<xref ref-type="bibr" rid="ref14">Kapopoulou et al., 2011</xref>). Breadth coverage in the PE/PPE regions was evaluated at depths from 1x to 40x. Coverages were compared between techniques by ANOVA followed by a <italic>post-hoc</italic> test with Holm correction.</p>
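<p>As an illustration of the breadth coverage definition given above, the following Python sketch computes the percentage of positions reaching a minimum depth, per window and overall. It is a simplified stand-in for the mosdepth-based calculation, not a reproduction of it: the function names, the per-base depth list, and the convention that a position counts as covered when its depth is at least the threshold are assumptions.</p>
<preformat><![CDATA[
```python
def breadth_coverage(depths, min_depth=8):
    """Percentage of positions whose depth is at least `min_depth`."""
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def windowed_breadth(depths, window=1000, min_depth=8):
    """Breadth coverage for consecutive, non-overlapping windows."""
    return [
        breadth_coverage(depths[start:start + window], min_depth)
        for start in range(0, len(depths), window)
    ]


# Hypothetical per-base depths for a 2,000 bp toy region:
# the first 1,500 positions are well covered, the last 500 are not.
toy_depths = [10] * 1500 + [3] * 500
print(windowed_breadth(toy_depths))
print(breadth_coverage(toy_depths))
```
]]></preformat>
<p>Changing min_depth from 1 to 40 on the same depth profile gives the kind of depth-dependent breadth curve evaluated for the PE/PPE regions.</p>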
<p>Variants detected by the MTBseq pipeline with a frequency higher than 10% and supported by at least 4 reads with a quality score higher than 20 were used for drug resistance detection, using the WHO catalog as reference (<xref ref-type="bibr" rid="ref43">World Health Organization, 2021b</xref>): both resistance-associated and <italic>ad-interim</italic> resistance-associated mutations were considered for this comparison.</p>
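<p>The variant selection criteria stated above can be expressed as a simple filter. The Python sketch below is only an illustration under stated assumptions: the VariantCall record, its field names, and the example read counts are hypothetical, and the frequency is assumed to be the fraction of all reads at the position that support the variant, which may differ from how MTBseq computes it internally.</p>
<preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    position: int        # hypothetical genome coordinate
    alt_reads: int       # reads supporting the variant allele
    alt_reads_q20: int   # supporting reads with quality score above 20
    total_reads: int     # all reads covering the position


def passes_filter(call, min_freq=0.10, min_reads_q20=4):
    """Keep a variant when its frequency is above `min_freq` and at least
    `min_reads_q20` supporting reads have a quality score above 20."""
    if call.total_reads == 0:
        return False
    frequency = call.alt_reads / call.total_reads
    return frequency > min_freq and call.alt_reads_q20 >= min_reads_q20


# Toy calls: one well supported, one below the frequency cut-off,
# and one frequent but supported by too few high-quality reads.
toy_calls = [
    VariantCall(position=1, alt_reads=12, alt_reads_q20=10, total_reads=40),
    VariantCall(position=2, alt_reads=3, alt_reads_q20=3, total_reads=40),
    VariantCall(position=3, alt_reads=10, alt_reads_q20=2, total_reads=20),
]
kept = [c.position for c in toy_calls if passes_filter(c)]
print(kept)
```
]]></preformat>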
<p>We investigated the assembly performance of the different approaches, including a fourth, namely Unicycler v0.4.4 (<xref ref-type="bibr" rid="ref41">Wick et al., 2017</xref>), a widely used algorithm based on the short-long reads hybrid approach. The latter exploits short and long reads simultaneously during the assembly, whereas our approach first uses the short reads to correct the long reads and then performs the assembly with Flye v2.9 (<xref ref-type="bibr" rid="ref18">Kolmogorov et al., 2019</xref>) using the corrected long reads. The comparison between <italic>de novo</italic> assembly algorithms for LRS (Flye), SRS only (Unicycler), HYBR (Flye), and simultaneous short-long reads hybrid assembly (HYBA) (Unicycler) was based on assembly metrics calculated by Quast v5.0.2 (<xref ref-type="bibr" rid="ref12">Gurevich et al., 2013</xref>) using H37Rv as the reference genome. The metrics considered were the number of contigs, number of misassembled contigs, number of gaps, the fraction of retrieved genes, the fraction of genome, largest alignment of the assembly, NA50 (the length of the shortest aligned block at 50% of the total assembly length), NG50 (the length of the shortest contig at 50% of the reference genome length), and number of partial genes. The results were compared between techniques by ANOVA followed by a <italic>post-hoc</italic> test with Holm correction. Statistical analyses were performed using R v4.0.5 (<xref ref-type="bibr" rid="ref32">R Core Team, 2019</xref>) and Rstudio Server 2022.02.2 (<xref ref-type="bibr" rid="ref34">RStudio Team, 2019</xref>).</p>
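<p>The contiguity metrics listed above share one calculation, which the following Python sketch illustrates. It is not the Quast implementation: the function name nx50 and the toy lengths are assumptions. The statistic is taken relative to the total assembly length by default, or relative to a reference genome length when one is supplied (the NG50 case); passing aligned block lengths instead of contig lengths would give the corresponding aligned-block variant.</p>
<preformat><![CDATA[
```python
def nx50(lengths, total_length=None):
    """Length of the shortest sequence in the smallest set of longest
    sequences that together span at least 50% of `total_length`.

    With total_length=None the sum of `lengths` is used (N50-style);
    with a reference genome length it becomes an NG50-style value.
    Returns None when the sequences never reach 50% of `total_length`.
    """
    if total_length is None:
        total_length = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total_length:
            return length
    return None


# Hypothetical contig lengths (arbitrary units) for a toy assembly.
toy_contigs = [50, 30, 20, 10, 5]
print(nx50(toy_contigs))                    # relative to the assembly length
print(nx50(toy_contigs, total_length=200))  # relative to a toy reference length
```
]]></preformat>
<p>Returning None when the assembly covers less than half of the reference makes an incomplete assembly explicit instead of reporting a misleading value.</p>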
</sec>
</sec>
<sec id="sec7" sec-type="results">
<title>3. Results</title>
<sec id="sec8">
<title>3.1. Genome coverage</title>
<p>In the MTBseq framework, breadth coverage at 8X depth is taken as the minimum threshold for covering the whole reference genome. <xref rid="fig2" ref-type="fig">Figure 2A</xref> shows the breadth coverage of the genome at 8X for the three approaches, which differed significantly (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001): the <italic>post-hoc</italic> test showed that the SRS approach led to a breadth coverage (98.9&#x2009;&#x00B1;&#x2009;0.1%) lower than that of LRS (99.6&#x2009;&#x00B1;&#x2009;0.1%, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001) and HYBR (99.7&#x2009;&#x00B1;&#x2009;0.1%, <italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), while LRS and HYBR performed similarly (<italic>p</italic>&#x2009;=&#x2009;0.9).</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption>
<p><bold>(A)</bold> MTB genome breadth coverage at 8x. SRS&#x2014;Yellow, HYBR&#x2014;Red, LRS&#x2014;Blue; <bold>(B)</bold> Circos plot of genome/gene breadth coverage at 8x for each approach. Outer to inner: Blue&#x2014;LRS; Red&#x2014;HYBR; Yellow&#x2014;SRS. Genes with a breadth coverage lower than 90% were annotated (in red PE/PPE genes). The black line represents the 8x breadth coverage percentage at that position (0% inner&#x2013;100% outer); <bold>(C)</bold> Number of low-covered PE/PPE genes at different levels of depth. Blue&#x2014;LRS; Red&#x2014;HYBR; Yellow&#x2014;SRS.</p>
</caption>
<graphic xlink:href="fmicb-14-1104456-g002.tif"/>
</fig>
<p><xref rid="fig2" ref-type="fig">Figure 2B</xref> shows the Circos plot of the breadth coverage at 8X along genome coordinates, where the black line spikes represent low-covered regions. The SRS, LRS, and HYBR approaches showed a low breadth coverage (&#x003C;90%) in 75, 13, and 13 genes of the whole genome, respectively. In particular, in the repetitive regions, SRS, LRS, and HYBR showed a low breadth coverage in 41, 5, and 4 genes, respectively, out of 168 PE/PPE genes in total (<xref rid="fig2" ref-type="fig">Figure 2B</xref>). Among the 41 PE/PPE genes with poor breadth coverage in SRS, 37 belong to the PE_PGRS family. Interestingly, HYBR showed low breadth coverage in only 1 of those genes, PE_PGRS4, whereas LRS showed low breadth coverage in 2 genes (PE_PGRS3 and PE_PGRS4). We studied the percentage of low-covered PE/PPE genes as a function of the depth threshold (<xref rid="fig2" ref-type="fig">Figure 2C</xref>). SRS shows an almost exponential trend, indicating that the number of low-covered regions increases with depth, as expected. LRS and HYBR maintain a flat trend up to 12X; afterward, the number of low-covered genes starts to increase for both approaches. All approaches present comparable low-resolution values after 40X.</p>
<p>To better investigate the drops in coverage, we constructed a neighbor-joining tree based only on PE/PPE reference sequences from MycoBrowser (<xref ref-type="bibr" rid="ref14">Kapopoulou et al., 2011</xref>) to evaluate their similarities. The tree shows three different gene clades, namely PE, PPE, and PE_PGRS, shown as orange, yellow, and red leaves, respectively (<xref rid="fig3" ref-type="fig">Figure 3</xref>). We then annotated the tree with the breadth coverage at 8X from our data for each approach (outer rings). Among the repetitive regions, the PE_PGRS gene family shows the lowest breadth coverage in our data when using SRS, whereas these genes are well covered using the LRS and HYBR approaches.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption>
<p>Neighbor-joining tree of PE/PPE genes based on the multiple sequence alignment. Orange tips: PE genes; Yellow tips: PE_PGRS genes; Red tips: PPE genes. Yellow layer: SRS 8X breadth coverage; Red layer: HYBR 8X breadth coverage; Blue layer: LRS 8X breadth coverage.</p>
</caption>
<graphic xlink:href="fmicb-14-1104456-g003.tif"/>
</fig>
</sec>
<sec id="sec9">
<title>3.2. Variant calling and cluster analysis</title>
<p>We compared the variant calls between SRS, LRS, and HYBR, using the MTBseq pipeline framework as described in the methods section. We focused our analysis on identifying the single-nucleotide polymorphisms (SNPs) present uniquely in each pipeline. The approaches showed different results (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), with the <italic>post-hoc</italic> test showing significant differences between all the pairwise comparisons: LRS showed the lowest mean percentage of uniquely identified mutations (0.3&#x2009;&#x00B1;&#x2009;0.1%), followed by SRS (1.3&#x2009;&#x00B1;&#x2009;0.2%) and by HYBR (5.1&#x2009;&#x00B1;&#x2009;0.4%). Considering the uniquely identified mutations not detected by the other approaches, HYBR misses 37% (36) for low coverage and 63% (62) for low frequency, SRS 68% (123) for coverage and 32% (58) for frequency, and LRS 58% (903) for coverage and 42% (651) for frequency. Among the 663 different mutations that were uniquely identified by the HYBR approach, 63% were located in the PE/PPE genes, 33% in other genes, and 4% in intergenic regions. LRS identified 46 SNPs uniquely, of which 37% were located in PE/PPE genes. Finally, of the 65 SNPs uniquely identified by SRS, only 23% belonged to PE/PPE genes.</p>
<p>We calculated the minimum spanning tree within the MTBseq framework; it was constructed on 499, 680, and 712 SNP positions for the LRS, SRS, and HYBR pipelines, respectively. All three approaches agreed on the identification of two major clusters, cluster 1 and cluster 2, shown in blue and red, respectively (<xref rid="fig4" ref-type="fig">Figure 4</xref>). Cluster 2 has the same number of nodes and the same SNP distances between strains (numbers on the edges) when analyzed with all three pipelines. Cluster 1, instead, shows a different intra-cluster compactness, namely the cluster dispersion, across the three approaches. We found that SRS identified 5 nodes (8 SNPs in total), HYBR 4 nodes (5 SNPs), and LRS 3 nodes (4 SNPs). Although this discrepancy could reflect a different intra-cluster resolution, the strains are linked to each other within the standard 5-SNP threshold, representing a single chain of transmission in all approaches. Finally, considering the distance between the two clusters, the HYBR approach identified a higher number of SNPs compared to SRS and LRS, due to an improved coverage of the repetitive regions (20 SNPs) and in agreement with the higher overall number of SNPs found.</p>
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption>
<p>SNP-based minimum spanning tree and cluster identification based on a distance &#x003C;6; Cluster 1: Blue, Cluster 2: Red. <bold>(A)</bold>&#x2014;SRS, <bold>(B)</bold>&#x2014;HYBR, <bold>(C)</bold>&#x2014;LRS.</p>
</caption>
<graphic xlink:href="fmicb-14-1104456-g004.tif"/>
</fig>
</sec>
<sec id="sec10">
<title>3.3. Drug resistance</title>
<p>Regarding the presence of confidence-graded mutations associated with resistance to the main drugs as defined in the WHO catalog, we observed an almost perfect agreement between the pipelines in classifying the strains. SRS and HYBR detected identical resistance patterns, whereas LRS failed to detect resistance to ethionamide in one strain, due to a low number of reads with quality higher than 20 (4/18) for the mutation &#x201C;fabG1_c-15t&#x201D; associated with ethionamide and isoniazid resistance. All the approaches detected 93 SNPs associated with drug resistance with high/medium confidence and 10 classified as associated with drug resistance &#x201C;ad interim&#x201D; (<xref ref-type="bibr" rid="ref38">Walker et al., 2022</xref>).</p>
</sec>
<sec id="sec11">
<title>3.4. <italic>De novo</italic> assembly</title>
<p>We performed an assembly comparison to evaluate the importance of long-read technology. The LRS and HYBR approaches outperformed SRS and the widely used HYBA approach with Unicycler in terms of number of contigs (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), number of misassembled contigs (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), number of gaps (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), fraction of covered genome (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), fraction of retrieved genes (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), number of partially covered genes (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), largest alignment length (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), NA50 (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), and NG50 (<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001), with the SRS approach being the least effective for this task, as expected (<xref rid="fig5" ref-type="fig">Figure 5</xref>). HYBR and LRS obtained comparable results in the metrics considered. SRS obtained poor results in all the tasks, showing significant differences from the other three approaches.</p>
<fig position="float" id="fig5">
<label>Figure 5</label>
<caption>
<p>Comparison of assembly statistics. SRS in yellow, hybrid assembly (HYBA) in green, HYBR in red, LRS in blue. &#x002A;<italic>p</italic>&#x2009;&#x003C;&#x2009;0.05; &#x002A;&#x002A;<italic>p</italic>&#x2009;&#x003C;&#x2009;0.01; &#x002A;&#x002A;&#x002A;<italic>p</italic>&#x2009;&#x003C;&#x2009;0.001; &#x002A;&#x002A;&#x002A;&#x002A;<italic>p</italic>&#x2009;&#x003C;&#x2009;0.0001.</p>
</caption>
<graphic xlink:href="fmicb-14-1104456-g005.tif"/>
</fig>
</sec>
</sec>
<sec id="sec12" sec-type="discussions">
<title>4. Discussion</title>
<p>The characterization of MTB strains poses different challenges depending on the aims of the genomic analysis. Several solutions have been proposed over the years to optimize the analysis, with a major focus on the use of SRS (<xref ref-type="bibr" rid="ref17">Kohl et al., 2018</xref>; <xref ref-type="bibr" rid="ref31">Phelan et al., 2019</xref>). Unlike other prokaryotic pathogens, MTB shows genomic features such as the lack of mobile genetic elements (e.g., plasmids), a high GC-content, and a substantial presence of highly variable repetitive regions. All these features increase the risk of bias in the genomic analysis (<xref ref-type="bibr" rid="ref22">Li and Wren, 2014</xref>; <xref ref-type="bibr" rid="ref26">Modlin et al., 2021</xref>). Common bioinformatic pipelines usually exclude the so-called &#x2018;biased regions&#x2019;, on the assumption that large repeats cannot be mapped unambiguously onto the reference genome, since mappability does not depend on coverage, and that the resulting calls could decrease the accuracy of the transmission analysis (<xref ref-type="bibr" rid="ref17">Kohl et al., 2018</xref>; <xref ref-type="bibr" rid="ref35">South et al., 2022</xref>). This exclusion compromises the ability to identify mutations relevant for virulence and for resistance to the main drugs, and to provide a comprehensive analysis of entire genomes.</p>
<p>The introduction of LRS technology represents a valid alternative to the SRS approaches, because it allows a better characterization of the MTB genome, e.g., of InDels and repetitive regions. Comparing the results of SRS and LRS, different studies have highlighted genome regions where SRS lacks accuracy due to the limitations of the technology (<xref ref-type="bibr" rid="ref26">Modlin et al., 2021</xref>; <xref ref-type="bibr" rid="ref29">Peker et al., 2021</xref>; <xref ref-type="bibr" rid="ref11">G&#x00F3;mez-Gonz&#x00E1;lez et al., 2022</xref>; <xref ref-type="bibr" rid="ref23">Marin et al., 2022</xref>). Although LRS approaches still present a high error rate (~5&#x2013;15%), the random nature of these errors allows accuracy to improve with higher coverage (<xref ref-type="bibr" rid="ref33">Rhoads and Au, 2015</xref>; <xref ref-type="bibr" rid="ref3">Athanasopoulou et al., 2021</xref>; <xref ref-type="bibr" rid="ref2">Amoutzias et al., 2022</xref>).</p>
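<p>As an illustration of why random errors can be averaged out by depth, the following minimal sketch computes the probability that a simple majority vote at a single position is wrong. It assumes independent errors at a fixed per-read rate (10%, within the range cited above) and, pessimistically, that all erroneous reads agree on the same wrong base. The function name majority_error is hypothetical, and this toy model is not the variant caller used in this study.</p>
<preformat>
```python
from math import comb


def majority_error(depth: int, per_read_error: float) -> float:
    """Probability that a strict majority of `depth` reads is erroneous.

    Assumes independent errors with a fixed per-read rate (binomial model)
    and that all erroneous reads support the same wrong base.
    """
    threshold = depth // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(threshold, depth + 1)
    )


# Illustrative odd depths only, so that ties cannot occur.
for depth in (1, 5, 15, 31):
    print(depth, f"{majority_error(depth, 0.10):.2e}")
```
</preformat>
<p>Under these assumptions the error probability falls steeply as the depth grows, whereas a systematic error, which recurs in most reads covering a position, would not be reduced by additional coverage.</p>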
<p>In this study, we compared the performance of LRS, SRS, and our modified version of the hybrid long-short reads approach, HYBR, on 13 previously described MTB strains showing MDR and preXDR patterns. We analyzed the coverage and the variant calling along the whole genome. We carried out a comparative analysis of the three approaches by performing genome coverage estimation, cluster analysis, drug resistance detection, and <italic>de novo</italic> assembly. The results showed that the HYBR approach, which combines the features of long and short reads, provides a better description of the study strains in terms of genome breadth of coverage and assembly compared to SRS, and in terms of variant calling and related downstream analysis compared to LRS. In fact, our hybrid approach relies on long reads that are first corrected with the short reads and then used in the downstream analysis: this makes the hybrid-corrected reads available for all the tasks of the investigation, whereas hybrid approaches usually involve both LRS and SRS only in the assembly step. SRS showed several limitations in terms of coverage along the whole genome compared to LRS and HYBR. The PE_PGRS gene regions were the most problematic for SRS, although these families of repetitive genes play an important role in MTB pathogenesis, and their low coverage could correspond to a non-trivial loss of information in the characterization of the pathogen (<xref ref-type="bibr" rid="ref6">de Maio et al., 2020</xref>). In particular, we found that the PE_PGRS3 and PE_PGRS4 genes present very low coverage in all approaches. 
Recently, a few studies characterized these specific regions of the genome, showing that the two genes are close to each other and share a homologous sequence (percent identity of 81%) due to gene duplication, which indicates that they could be problematic for every technology (<xref ref-type="bibr" rid="ref15">Karboul et al., 2008</xref>; <xref ref-type="bibr" rid="ref30">Phelan et al., 2016</xref>; <xref ref-type="bibr" rid="ref6">de Maio et al., 2020</xref>). Interestingly, the remaining PE and PPE regions showed an overall acceptable coverage for SRS and, as already described in other studies, the common practice of excluding those genes from the analysis because of their high GC-content and repetitive sequences could be replaced by removing only the PE_PGRS genes (<xref ref-type="bibr" rid="ref26">Modlin et al., 2021</xref>; <xref ref-type="bibr" rid="ref23">Marin et al., 2022</xref>).</p>
<p>The variant calling showed how, at low depths, the high error rate of the LRS technology masks true variants with the random noise produced by the basecalling step. Nevertheless, this issue could be addressed by increasing the sequencing depth, unlike in SRS technologies, where the error is due to systematic biases (<xref ref-type="bibr" rid="ref4">Cabibbe et al., 2020</xref>). The HYBR approach outperformed both SRS and LRS, the latter missing a few mutations due to coverage issues in those regions. The hybrid reads approach requires a good sequencing depth from LRS; otherwise, it inherits the same signal-to-noise issues as the parental LRS, especially in those regions where SRS correction does not perform optimally. In fact, most of the undetected mutations were missed because of the frequency threshold (75%), especially in regions not well covered by SRS. Nevertheless, considering the repetitive regions, this result indicates that the HYBR approach can reveal a greater number of mutations than SRS and LRS, due to better coverage.</p>
<p>In the <italic>de novo</italic> assembly evaluation, the three approaches were compared with each other and with the widely used hybrid assembler Unicycler (<xref ref-type="bibr" rid="ref7">de Maio et al., 2019</xref>; <xref ref-type="bibr" rid="ref39">Wick et al., 2021a</xref>). As stated by the developer, the hybrid assembly executed by Unicycler corresponds to a &#x201C;short-read-first&#x201D; approach in which the short-read assembly graph is scaffolded to completion by the long reads (<xref ref-type="bibr" rid="ref41">Wick et al., 2017</xref>). This approach was proposed on the assumption that LRS presents low depth and accuracy. Improvements in ONT technology, which is claimed to lower the error rate to 1% with the new 10.x flow cell chemistry, have made it possible to rely on the opposite &#x201C;long-reads-first&#x201D; approaches, such as Trycycler (<xref ref-type="bibr" rid="ref40">Wick et al., 2021b</xref>). In the current comparison, the LRS still relies on the previous technology, with its low depth and accuracy. Nevertheless, HYBR and LRS showed the best results, confirming Flye as one of the best-performing assemblers for long reads (<xref ref-type="bibr" rid="ref39">Wick et al., 2021a</xref>). Interestingly, in our dataset, the hybrid assembler Unicycler performed worse than Flye, especially considering the NG50 metric (the length of the shortest contig among the longest contigs that together cover 50% of the total genome length), with a mean of 1.6&#x2009;&#x00B1;&#x2009;0.3&#x2009;Mb, lower than that of HYBR with 4.3&#x2009;&#x00B1;&#x2009;0.1&#x2009;Mb (<italic>p</italic>&#x2009;=&#x2009;0.02), indicating that our HYBR approach can better assemble the genomes. As expected for this task, SRS performed very poorly, emphasizing its inadequacy for <italic>de novo</italic> assembly.</p>
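<p>To make the NG50 metric concrete, the following minimal sketch computes it from a list of contig lengths and a reference genome length. The function name ng50 and the contig sets are hypothetical, illustrative values for a genome of about 4.4&#x2009;Mb; they are not data from this study, and the assembly statistics reported here were not produced with this code.</p>
<preformat>
```python
def ng50(contig_lengths, genome_length):
    """Return the NG50 of an assembly, or None if it is undefined.

    Contigs are sorted from longest to shortest; NG50 is the length of the
    contig at which the running total first reaches half of the reference
    genome length (not half of the assembly length, which would give N50).
    """
    half_genome = genome_length / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half_genome:
            return length
    return None  # the assembly covers less than half of the genome


# Hypothetical assemblies of a 4.4 Mb genome (illustrative values only).
GENOME_LENGTH = 4_400_000
fragmented = [900_000, 800_000, 700_000, 600_000, 500_000, 400_000, 300_000]
near_complete = [4_300_000, 100_000]

print(ng50(fragmented, GENOME_LENGTH))     # 700000
print(ng50(near_complete, GENOME_LENGTH))  # 4300000
```
</preformat>
<p>In this toy example the fragmented assembly needs three contigs to reach half of the genome, so its NG50 is the length of the third contig, while a single near-complete contig yields an NG50 close to the genome length.</p>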
<p>This study presents some limitations: the small number of samples analyzed, despite the in-depth investigation conducted on each genome, and the use of the 9.x flow cell technology for LRS, which bears a higher error rate than the new 10.x technology, as the latter was not available at the time of the study.</p>
<p>This study outlines the strengths and the weaknesses of the three approaches. The repetitive regions of the PE_PGRS genes represent a source of blind spots for SRS, while the remaining PE/PPE regions, usually neglected as well, showed good coverage and could be safely included in the analysis. LRS shows issues in terms of signal-to-noise ratio but can still correctly identify genetically close strains and drug resistance-associated mutations, and increasing the sequencing depth usually fixes the issue. The HYBR approach overcomes the limitations of both SRS and LRS, showing the best results in all the considered tasks. Although the hybrid reads approach has a higher cost than a single SRS or LRS sequencing run, it offers the advantage of a better evaluation of problematic regions in variant calling, where LRS presents critical issues, and in <italic>de novo</italic> assembly, where SRS cannot compete with LRS.</p>
<p>In conclusion, depending on the aim of the investigation, SRS and LRS present complementary advantages and limitations. For a full resolution of MTB genomes, where all the mentioned analyses and both technologies are needed, the hybrid reads approach therefore represents a valid option and a well-rounded strategy (<xref rid="fig6" ref-type="fig">Figure 6</xref>).</p>
<fig position="float" id="fig6">
<label>Figure 6</label>
<caption>
<p>Comparison of task performance among the three approaches. Created with <ext-link xlink:href="http://BioRender.com" ext-link-type="uri">BioRender.com</ext-link>.</p>
</caption>
<graphic xlink:href="fmicb-14-1104456-g006.tif"/>
</fig>
</sec>
<sec id="sec13" sec-type="data-availability">
<title>Data availability statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: <ext-link xlink:href="https://www.ncbi.nlm.nih.gov/" ext-link-type="uri">https://www.ncbi.nlm.nih.gov/</ext-link>, PRJNA903660. The code used for the analysis presented in the study is deposited in the GitHub repository, accessible at: <ext-link xlink:href="https://github.com/Allen13x/MTB_LRSvsSRS" ext-link-type="uri">https://github.com/Allen13x/MTB_LRSvsSRS</ext-link>.</p>
</sec>
<sec id="sec14">
<title>Author contributions</title>
<p>FDM, AS, SB, AMC, and DMC conceived and supervised the study. SB and VB performed the sequencing. FDM and AS performed the bioinformatics analysis. FDM and AS wrote the draft manuscript. FDM, AS, AMC, and DMC revised the draft manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="sec15" sec-type="funding-information">
<title>Funding</title>
<p>This study was partially supported by the 2nd ERANet-LAC Transnational Joint Call on Research and Innovation (grant: TRANS-TB-TRANS PER-2012-ELAC2015/T08-0664).</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec100" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abascal</surname> <given-names>E.</given-names></name> <name><surname>Herranz</surname> <given-names>M.</given-names></name> <name><surname>Acosta</surname> <given-names>F.</given-names></name> <name><surname>Agapito</surname> <given-names>J.</given-names></name> <name><surname>Cabibbe</surname> <given-names>A. M.</given-names></name> <name><surname>Monteserin</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Screening of inmates transferred to Spain reveals a Peruvian prison as a reservoir of persistent <italic>Mycobacterium tuberculosis</italic> MDR strains and mixed infections</article-title>. <source>Sci. Rep.</source> <volume>10</volume>:<fpage>2704</fpage>. doi: <pub-id pub-id-type="doi">10.1038/S41598-020-59373-W</pub-id>, PMID: <pub-id pub-id-type="pmid">32066749</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Amoutzias</surname> <given-names>G. D.</given-names></name> <name><surname>Nikolaidis</surname> <given-names>M.</given-names></name> <name><surname>Hesketh</surname> <given-names>A.</given-names></name></person-group> (<year>2022</year>). <article-title>The notable achievements and the prospects of bacterial pathogen genomics</article-title>. <source>Microorganisms</source> <volume>10</volume>:<fpage>1040</fpage>. doi: <pub-id pub-id-type="doi">10.3390/MICROORGANISMS10051040</pub-id></citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Athanasopoulou</surname> <given-names>K.</given-names></name> <name><surname>Boti</surname> <given-names>M. A.</given-names></name> <name><surname>Adamopoulos</surname> <given-names>P. G.</given-names></name> <name><surname>Skourou</surname> <given-names>P. C.</given-names></name> <name><surname>Scorilas</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Third-generation sequencing: the spearhead towards the radical transformation of modern genomics</article-title>. <source>Life</source> <volume>12</volume>:<fpage>30</fpage>. doi: <pub-id pub-id-type="doi">10.3390/LIFE12010030</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cabibbe</surname> <given-names>A. M.</given-names></name> <name><surname>Spitaleri</surname> <given-names>A.</given-names></name> <name><surname>Battaglia</surname> <given-names>S.</given-names></name> <name><surname>Colman</surname> <given-names>R. E.</given-names></name> <name><surname>Suresh</surname> <given-names>A.</given-names></name> <name><surname>Uplekar</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Application of targeted next-generation sequencing assay on a portable sequencing platform for culture-free detection of drug-resistant tuberculosis from clinical samples</article-title>. <source>J. Clin. Microbiol.</source> <volume>58</volume>:<fpage>e00632-20</fpage>. doi: <pub-id pub-id-type="doi">10.1128/JCM.00632-20</pub-id>, PMID: <pub-id pub-id-type="pmid">32727827</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Coster</surname> <given-names>W.</given-names></name> <name><surname>D&#x2019;Hert</surname> <given-names>S.</given-names></name> <name><surname>Schultz</surname> <given-names>D. T.</given-names></name> <name><surname>Cruts</surname> <given-names>M.</given-names></name> <name><surname>van Broeckhoven</surname> <given-names>C.</given-names></name></person-group> (<year>2018</year>). <article-title>NanoPack: visualizing and processing long-read sequencing data</article-title>. <source>Bioinformatics</source> <volume>34</volume>, <fpage>2666</fpage>&#x2013;<lpage>2669</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIOINFORMATICS/BTY149</pub-id>, PMID: <pub-id pub-id-type="pmid">29547981</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Maio</surname> <given-names>F.</given-names></name> <name><surname>Berisio</surname> <given-names>R.</given-names></name> <name><surname>Manganelli</surname> <given-names>R.</given-names></name> <name><surname>Delogu</surname> <given-names>G.</given-names></name></person-group> (<year>2020</year>). <article-title>PE_PGRS proteins of <italic>Mycobacterium tuberculosis</italic>: a specialized molecular task force at the forefront of host&#x2013;pathogen interaction</article-title>. <source>Virulence</source> <volume>11</volume>, <fpage>898</fpage>&#x2013;<lpage>915</lpage>. doi: <pub-id pub-id-type="doi">10.1080/21505594.2020.1785815</pub-id>, PMID: <pub-id pub-id-type="pmid">32713249</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Maio</surname> <given-names>N.</given-names></name> <name><surname>Shaw</surname> <given-names>L. P.</given-names></name> <name><surname>Hubbard</surname> <given-names>A.</given-names></name> <name><surname>George</surname> <given-names>S.</given-names></name> <name><surname>Sanderson</surname> <given-names>N. D.</given-names></name> <name><surname>Swann</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Comparison of long-read sequencing technologies in the hybrid assembly of complex bacterial genomes</article-title>. <source>Microb. Genom.</source> <volume>5</volume>:<fpage>e000294</fpage>. doi: <pub-id pub-id-type="doi">10.1099/MGEN.0.000294</pub-id></citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ektefaie</surname> <given-names>Y.</given-names></name> <name><surname>Dixit</surname> <given-names>A.</given-names></name> <name><surname>Freschi</surname> <given-names>L.</given-names></name> <name><surname>Farhat</surname> <given-names>M. R.</given-names></name></person-group> (<year>2021</year>). <article-title>Globally diverse <italic>Mycobacterium tuberculosis</italic> resistance acquisition: a retrospective geographical and temporal analysis of whole genome sequences</article-title>. <source>Lancet Microbe</source> <volume>2</volume>, <fpage>e96</fpage>&#x2013;<lpage>e104</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S2666-5247(20)30195-6</pub-id>, PMID: <pub-id pub-id-type="pmid">33912853</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferdinand</surname> <given-names>A. S.</given-names></name> <name><surname>Kelaher</surname> <given-names>M.</given-names></name> <name><surname>Lane</surname> <given-names>C. R.</given-names></name> <name><surname>da Silva</surname> <given-names>A. G.</given-names></name> <name><surname>Sherry</surname> <given-names>N. L.</given-names></name> <name><surname>Ballard</surname> <given-names>S. A.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>An implementation science approach to evaluating pathogen whole genome sequencing in public health</article-title>. <source>Genome Med.</source> <volume>13</volume>, <fpage>1</fpage>&#x2013;<lpage>11</lpage>. doi: <pub-id pub-id-type="doi">10.1186/S13073-021-00934-7</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gladstone</surname> <given-names>R. A.</given-names></name> <name><surname>McNally</surname> <given-names>A.</given-names></name> <name><surname>P&#x00F6;ntinen</surname> <given-names>A. K.</given-names></name> <name><surname>Tonkin-Hill</surname> <given-names>G.</given-names></name> <name><surname>Lees</surname> <given-names>J. A.</given-names></name> <name><surname>Skyt&#x00E9;n</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections in Norway in 2002&#x2013;17: a nationwide, longitudinal, microbial population genomic study</article-title>. <source>Lancet Microbe</source> <volume>2</volume>, <fpage>e331</fpage>&#x2013;<lpage>e341</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S2666-5247(21)00031-8</pub-id>, PMID: <pub-id pub-id-type="pmid">35544167</pub-id></citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x00F3;mez-Gonz&#x00E1;lez</surname> <given-names>P. J.</given-names></name> <name><surname>Campino</surname> <given-names>S.</given-names></name> <name><surname>Phelan</surname> <given-names>J. E.</given-names></name> <name><surname>Clark</surname> <given-names>T. G.</given-names></name></person-group> (<year>2022</year>). <article-title>Portable sequencing of <italic>Mycobacterium tuberculosis</italic> for clinical and epidemiological applications</article-title>. <source>Brief. Bioinform.</source> <volume>23</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIB/BBAC256</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gurevich</surname> <given-names>A.</given-names></name> <name><surname>Saveliev</surname> <given-names>V.</given-names></name> <name><surname>Vyahhi</surname> <given-names>N.</given-names></name> <name><surname>Tesler</surname> <given-names>G.</given-names></name></person-group> (<year>2013</year>). <article-title>QUAST: quality assessment tool for genome assemblies</article-title>. <source>Bioinformatics</source> <volume>29</volume>, <fpage>1072</fpage>&#x2013;<lpage>1075</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIOINFORMATICS/BTT086</pub-id>, PMID: <pub-id pub-id-type="pmid">23422339</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holley</surname> <given-names>G.</given-names></name> <name><surname>Beyter</surname> <given-names>D.</given-names></name> <name><surname>Ingimundardottir</surname> <given-names>H.</given-names></name> <name><surname>M&#x00F8;ller</surname> <given-names>P. L.</given-names></name> <name><surname>Kristmundsdottir</surname> <given-names>S.</given-names></name> <name><surname>Eggertsson</surname> <given-names>H. P.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly</article-title>. <source>Genome Biol.</source> <volume>22</volume>, <fpage>1</fpage>&#x2013;<lpage>22</lpage>. doi: <pub-id pub-id-type="doi">10.1186/S13059-020-02244-4</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kapopoulou</surname> <given-names>A.</given-names></name> <name><surname>Lew</surname> <given-names>J. M.</given-names></name> <name><surname>Cole</surname> <given-names>S. T.</given-names></name></person-group> (<year>2011</year>). <article-title>The MycoBrowser portal: a comprehensive and manually annotated resource for mycobacterial genomes</article-title>. <source>Tuberculosis (Edinb.)</source> <volume>91</volume>, <fpage>8</fpage>&#x2013;<lpage>13</lpage>. doi: <pub-id pub-id-type="doi">10.1016/J.TUBE.2010.09.006</pub-id>, PMID: <pub-id pub-id-type="pmid">20980200</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Karboul</surname> <given-names>A.</given-names></name> <name><surname>Mazza</surname> <given-names>A.</given-names></name> <name><surname>Gey Van Pittius</surname> <given-names>N. C.</given-names></name> <name><surname>Ho</surname> <given-names>J. L.</given-names></name> <name><surname>Brousseau</surname> <given-names>R.</given-names></name> <name><surname>Mardassi</surname> <given-names>H.</given-names></name></person-group> (<year>2008</year>). <article-title>Frequent homologous recombination events in <italic>Mycobacterium tuberculosis</italic> PE/PPE multigene families: potential role in antigenic variability</article-title>. <source>J. Bacteriol.</source> <volume>190</volume>, <fpage>7838</fpage>&#x2013;<lpage>7846</lpage>. doi: <pub-id pub-id-type="doi">10.1128/JB.00827-08</pub-id>, PMID: <pub-id pub-id-type="pmid">18820012</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kohl</surname> <given-names>T. A.</given-names></name> <name><surname>Diel</surname> <given-names>R.</given-names></name> <name><surname>Harmsen</surname> <given-names>D.</given-names></name> <name><surname>Rothg&#x00E4;nger</surname> <given-names>J.</given-names></name> <name><surname>Meywald Walter</surname> <given-names>K.</given-names></name> <name><surname>Merker</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Whole-genome-based <italic>Mycobacterium tuberculosis</italic> surveillance: a standardized, portable, and expandable approach</article-title>. <source>J. Clin. Microbiol.</source> <volume>52</volume>, <fpage>2479</fpage>&#x2013;<lpage>2486</lpage>. doi: <pub-id pub-id-type="doi">10.1128/JCM.00567-14</pub-id>, PMID: <pub-id pub-id-type="pmid">24789177</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kohl</surname> <given-names>T. A.</given-names></name> <name><surname>Utpatel</surname> <given-names>C.</given-names></name> <name><surname>Schleusener</surname> <given-names>V.</given-names></name> <name><surname>de Filippo</surname> <given-names>M. R.</given-names></name> <name><surname>Beckert</surname> <given-names>P.</given-names></name> <name><surname>Cirillo</surname> <given-names>D. M.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>MTBseq: a comprehensive pipeline for whole genome sequence analysis of <italic>Mycobacterium tuberculosis</italic> complex isolates</article-title>. <source>PeerJ</source> <volume>2018</volume>:<fpage>e5895</fpage>. doi: <pub-id pub-id-type="doi">10.7717/PEERJ.5895</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kolmogorov</surname> <given-names>M.</given-names></name> <name><surname>Yuan</surname> <given-names>J.</given-names></name> <name><surname>Lin</surname> <given-names>Y.</given-names></name> <name><surname>Pevzner</surname> <given-names>P. A.</given-names></name></person-group> (<year>2019</year>). <article-title>Assembly of long, error-prone reads using repeat graphs</article-title>. <source>Nat. Biotechnol.</source> <volume>37</volume>, <fpage>540</fpage>&#x2013;<lpage>546</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41587-019-0072-8</pub-id>, PMID: <pub-id pub-id-type="pmid">30936562</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>K&#x00F6;ser</surname> <given-names>C. U.</given-names></name> <name><surname>Ellington</surname> <given-names>M. J.</given-names></name> <name><surname>Peacock</surname> <given-names>S. J.</given-names></name></person-group> (<year>2014</year>). <article-title>Whole-genome sequencing to control antimicrobial resistance</article-title>. <source>Trends Genet.</source> <volume>30</volume>:<fpage>401</fpage>. doi: <pub-id pub-id-type="doi">10.1016/J.TIG.2014.07.003</pub-id>, PMID: <pub-id pub-id-type="pmid">25096945</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krzywinski</surname> <given-names>M.</given-names></name> <name><surname>Schein</surname> <given-names>J.</given-names></name> <name><surname>Birol</surname> <given-names>I.</given-names></name> <name><surname>Connors</surname> <given-names>J.</given-names></name> <name><surname>Gascoyne</surname> <given-names>R.</given-names></name> <name><surname>Horsman</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Circos: an information aesthetic for comparative genomics</article-title>. <source>Genome Res.</source> <volume>19</volume>, <fpage>1639</fpage>&#x2013;<lpage>1645</lpage>. doi: <pub-id pub-id-type="doi">10.1101/GR.092759.109</pub-id>, PMID: <pub-id pub-id-type="pmid">19541911</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name></person-group> (<year>2018</year>). <article-title>Minimap2: pairwise alignment for nucleotide sequences</article-title>. <source>Bioinformatics</source> <volume>34</volume>, <fpage>3094</fpage>&#x2013;<lpage>3100</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIOINFORMATICS/BTY191</pub-id>, PMID: <pub-id pub-id-type="pmid">29750242</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Wren</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>Toward better understanding of artifacts in variant calling from high-coverage samples</article-title>. <source>Bioinformatics</source> <volume>30</volume>, <fpage>2843</fpage>&#x2013;<lpage>2851</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIOINFORMATICS/BTU356</pub-id>, PMID: <pub-id pub-id-type="pmid">24974202</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marin</surname> <given-names>M.</given-names></name> <name><surname>Vargas</surname> <given-names>R.</given-names></name> <name><surname>Harris</surname> <given-names>M.</given-names></name> <name><surname>Jeffrey</surname> <given-names>B.</given-names></name> <name><surname>Epperson</surname> <given-names>L. E.</given-names></name> <name><surname>Durbin</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Benchmarking the empirical accuracy of short-read sequencing across the M. tuberculosis genome</article-title>. <source>Bioinformatics</source> <volume>38</volume>, <fpage>1781</fpage>&#x2013;<lpage>1787</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIOINFORMATICS/BTAC023</pub-id>, PMID: <pub-id pub-id-type="pmid">35020793</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Md</surname> <given-names>V.</given-names></name> <name><surname>Misra</surname> <given-names>S.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Aluru</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>Efficient architecture-aware acceleration of BWA-MEM for multicore systems</article-title>. <conf-name>Proceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium, IPDPS 2019</conf-name>, <fpage>314</fpage>&#x2013;<lpage>324</lpage>.</citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Medha</surname></name> <name><surname>Sharma</surname> <given-names>S.</given-names></name> <name><surname>Sharma</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Proline-glutamate/proline-proline-glutamate (PE/PPE) proteins of <italic>Mycobacterium tuberculosis</italic>: the multifaceted immune-modulators</article-title>. <source>Acta Trop.</source> <volume>222</volume>:<fpage>106035</fpage>. doi: <pub-id pub-id-type="doi">10.1016/J.ACTATROPICA.2021.106035</pub-id>, PMID: <pub-id pub-id-type="pmid">34224720</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Modlin</surname> <given-names>S. J.</given-names></name> <name><surname>Robinhold</surname> <given-names>C.</given-names></name> <name><surname>Morrissey</surname> <given-names>C.</given-names></name> <name><surname>Mitchell</surname> <given-names>S. N.</given-names></name> <name><surname>Ramirez-Busby</surname> <given-names>S. M.</given-names></name> <name><surname>Shmaya</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Exact mapping of Illumina blind spots in the <italic>Mycobacterium tuberculosis</italic> genome reveals platform-wide and workflow-specific biases</article-title>. <source>Microb. Genom.</source> <volume>7</volume>:<fpage>000465</fpage>. doi: <pub-id pub-id-type="doi">10.1099/MGEN.0.000465</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mustazzolu</surname> <given-names>A.</given-names></name> <name><surname>Borroni</surname> <given-names>E.</given-names></name> <name><surname>Cirillo</surname> <given-names>D. M.</given-names></name> <name><surname>Giannoni</surname> <given-names>F.</given-names></name> <name><surname>Iacobino</surname> <given-names>A.</given-names></name> <name><surname>Fattorini</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Trend in rifampicin-, multidrug- and extensively drug-resistant tuberculosis in Italy, 2009&#x2013;2016</article-title>. <source>Eur. Respir. J.</source> <volume>52</volume>:<fpage>1800070</fpage>. doi: <pub-id pub-id-type="doi">10.1183/13993003.00070-2018</pub-id>, PMID: <pub-id pub-id-type="pmid">29724919</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pedersen</surname> <given-names>B. S.</given-names></name> <name><surname>Quinlan</surname> <given-names>A. R.</given-names></name></person-group> (<year>2018</year>). <article-title>Mosdepth: quick coverage calculation for genomes and exomes</article-title>. <source>Bioinformatics</source> <volume>34</volume>, <fpage>867</fpage>&#x2013;<lpage>868</lpage>. doi: <pub-id pub-id-type="doi">10.1093/BIOINFORMATICS/BTX699</pub-id>, PMID: <pub-id pub-id-type="pmid">29096012</pub-id></citation></ref>
<ref id="ref29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peker</surname> <given-names>N.</given-names></name> <name><surname>Schuele</surname> <given-names>L.</given-names></name> <name><surname>Kok</surname> <given-names>N.</given-names></name> <name><surname>Terrazos</surname> <given-names>M.</given-names></name> <name><surname>Neuenschwander</surname> <given-names>S. M.</given-names></name> <name><surname>de Beer</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Evaluation of whole-genome sequence data analysis approaches for short- and long-read sequencing of <italic>Mycobacterium tuberculosis</italic></article-title>. <source>Microb. Genom.</source> <volume>7</volume>:<fpage>000695</fpage>. doi: <pub-id pub-id-type="doi">10.1099/MGEN.0.000695</pub-id></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phelan</surname> <given-names>J. E.</given-names></name> <name><surname>Coll</surname> <given-names>F.</given-names></name> <name><surname>Bergval</surname> <given-names>I.</given-names></name> <name><surname>Anthony</surname> <given-names>R. M.</given-names></name> <name><surname>Warren</surname> <given-names>R.</given-names></name> <name><surname>Sampson</surname> <given-names>S. L.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Recombination in pe/ppe genes contributes to genetic variation in <italic>Mycobacterium tuberculosis</italic> lineages</article-title>. <source>BMC Genomics</source> <volume>17</volume>:<fpage>41</fpage>. doi: <pub-id pub-id-type="doi">10.1186/S12864-016-2467-Y</pub-id>, PMID: <pub-id pub-id-type="pmid">26923687</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phelan</surname> <given-names>J. E.</given-names></name> <name><surname>O&#x2019;Sullivan</surname> <given-names>D. M.</given-names></name> <name><surname>Machado</surname> <given-names>D.</given-names></name> <name><surname>Ramos</surname> <given-names>J.</given-names></name> <name><surname>Oppong</surname> <given-names>Y. E. A.</given-names></name> <name><surname>Campino</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs</article-title>. <source>Genome Med.</source> <volume>11</volume>, <fpage>1</fpage>&#x2013;<lpage>7</lpage>. doi: <pub-id pub-id-type="doi">10.1186/S13073-019-0650-X</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll1">R Core Team</collab></person-group> (<year>2019</year>). R: A Language and Environment for Statistical Computing. Available at: <ext-link xlink:href="https://www.R-project.org/" ext-link-type="uri">https://www.R-project.org/</ext-link> (Accessed November 20, 2022).</citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rhoads</surname> <given-names>A.</given-names></name> <name><surname>Au</surname> <given-names>K. F.</given-names></name></person-group> (<year>2015</year>). <article-title>PacBio sequencing and its applications</article-title>. <source>Genom. Proteom. Bioinform.</source> <volume>13</volume>, <fpage>278</fpage>&#x2013;<lpage>289</lpage>. doi: <pub-id pub-id-type="doi">10.1016/J.GPB.2015.08.002</pub-id>, PMID: <pub-id pub-id-type="pmid">26542840</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll2">RStudio Team</collab></person-group> (<year>2019</year>). RStudio: Integrated Development Environment for R. Available at: <ext-link xlink:href="http://www.rstudio.com/" ext-link-type="uri">http://www.rstudio.com/</ext-link> (Accessed November 20, 2022).</citation></ref>
<ref id="ref35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>South</surname> <given-names>A.</given-names></name> <name><surname>Dippenaar</surname> <given-names>A.</given-names></name> <name><surname>Grobbelaar</surname> <given-names>M.</given-names></name> <name><surname>Phd</surname> <given-names>W.</given-names></name> <name><surname>Hall</surname> <given-names>M. B.</given-names></name> <name><surname>Rabodoarivelo</surname> <given-names>M. S.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Evaluation of Nanopore sequencing for <italic>Mycobacterium tuberculosis</italic> drug susceptibility testing and outbreak investigation: a genomic analysis</article-title>. <source>Lancet Microbe</source>. doi: <pub-id pub-id-type="doi">10.1016/S2666-5247(22)00301-9</pub-id> [Epub ahead of print], PMID: <pub-id pub-id-type="pmid">36549315</pub-id></citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Villa</surname> <given-names>S.</given-names></name> <name><surname>Tagliani</surname> <given-names>E.</given-names></name> <name><surname>Borroni</surname> <given-names>E.</given-names></name> <name><surname>Castellotti</surname> <given-names>P. F.</given-names></name> <name><surname>Ferrarese</surname> <given-names>M.</given-names></name> <name><surname>Ghodousi</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Outbreak of pre- and extensively drug-resistant tuberculosis in northern Italy: urgency of cross-border, multidimensional, surveillance systems</article-title>. <source>Eur. Respir. J.</source> <volume>58</volume>:<fpage>2100839</fpage>. doi: <pub-id pub-id-type="doi">10.1183/13993003.00839-2021</pub-id>, PMID: <pub-id pub-id-type="pmid">34049944</pub-id></citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walker</surname> <given-names>B. J.</given-names></name> <name><surname>Abeel</surname> <given-names>T.</given-names></name> <name><surname>Shea</surname> <given-names>T.</given-names></name> <name><surname>Priest</surname> <given-names>M.</given-names></name> <name><surname>Abouelliel</surname> <given-names>A.</given-names></name> <name><surname>Sakthikumar</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement</article-title>. <source>PLoS One</source> <volume>9</volume>:<fpage>e112963</fpage>. doi: <pub-id pub-id-type="doi">10.1371/JOURNAL.PONE.0112963</pub-id>, PMID: <pub-id pub-id-type="pmid">25409509</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walker</surname> <given-names>T. M.</given-names></name> <name><surname>Fowler</surname> <given-names>P. W.</given-names></name> <name><surname>Knaggs</surname> <given-names>J.</given-names></name> <name><surname>Hunt</surname> <given-names>M.</given-names></name> <name><surname>Peto</surname> <given-names>T. E.</given-names></name> <name><surname>Walker</surname> <given-names>A. S.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>The 2021 WHO catalogue of <italic>Mycobacterium tuberculosis</italic> complex mutations associated with drug resistance: a genotypic analysis</article-title>. <source>Lancet Microbe</source> <volume>3</volume>, <fpage>e265</fpage>&#x2013;<lpage>e273</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S2666-5247(21)00301-3</pub-id>, PMID: <pub-id pub-id-type="pmid">35373160</pub-id></citation></ref>
<ref id="ref39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wick</surname> <given-names>R. R.</given-names></name> <name><surname>Holt</surname> <given-names>K. E.</given-names></name> <name><surname>Zimin</surname> <given-names>A. V.</given-names></name> <name><surname>Salzberg</surname> <given-names>S. L.</given-names></name> <name><surname>Hopkins</surname> <given-names>J.</given-names></name> <name><surname>Vaser</surname> <given-names>R.</given-names></name></person-group> (<year>2021a</year>). <article-title>Benchmarking of long-read assemblers for prokaryote whole genome sequencing</article-title>. <source>F1000Res.</source> <volume>8</volume>:<fpage>2138</fpage>. doi: <pub-id pub-id-type="doi">10.12688/f1000research.21782.4</pub-id></citation></ref>
<ref id="ref40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wick</surname> <given-names>R. R.</given-names></name> <name><surname>Judd</surname> <given-names>L. M.</given-names></name> <name><surname>Cerdeira</surname> <given-names>L. T.</given-names></name> <name><surname>Hawkey</surname> <given-names>J.</given-names></name> <name><surname>M&#x00E9;ric</surname> <given-names>G.</given-names></name> <name><surname>Vezina</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2021b</year>). <article-title>Trycycler: consensus long-read assemblies for bacterial genomes</article-title>. <source>Genome Biol.</source> <volume>22</volume>, <fpage>1</fpage>&#x2013;<lpage>17</lpage>. doi: <pub-id pub-id-type="doi">10.1186/S13059-021-02483-Z</pub-id></citation></ref>
<ref id="ref41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wick</surname> <given-names>R. R.</given-names></name> <name><surname>Judd</surname> <given-names>L. M.</given-names></name> <name><surname>Gorrie</surname> <given-names>C. L.</given-names></name> <name><surname>Holt</surname> <given-names>K. E.</given-names></name></person-group> (<year>2017</year>). <article-title>Unicycler: resolving bacterial genome assemblies from short and long sequencing reads</article-title>. <source>PLoS Comput. Biol.</source> <volume>13</volume>:<fpage>e1005595</fpage>. doi: <pub-id pub-id-type="doi">10.1371/JOURNAL.PCBI.1005595</pub-id>, PMID: <pub-id pub-id-type="pmid">28594827</pub-id></citation></ref>
<ref id="ref42"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll3">World Health Organization</collab></person-group> (<year>2021a</year>). TB deaths and incidence. <italic>Global tuberculosis report</italic>, 13&#x2013;14.</citation></ref>
<ref id="ref43"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll4">World Health Organization</collab></person-group> (<year>2021b</year>). Catalogue of mutations in <italic>Mycobacterium tuberculosis</italic> complex and their association with drug resistance. Available at: <ext-link xlink:href="https://www.who.int/publications/i/item/9789240028173" ext-link-type="uri">https://www.who.int/publications/i/item/9789240028173</ext-link> (Accessed August 24, 2022).</citation></ref>
<ref id="ref44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>Z.</given-names></name> <name><surname>Alikhan</surname> <given-names>N. F.</given-names></name> <name><surname>Sergeant</surname> <given-names>M. J.</given-names></name> <name><surname>Luhmann</surname> <given-names>N.</given-names></name> <name><surname>Vaz</surname> <given-names>C.</given-names></name> <name><surname>Francisco</surname> <given-names>A. P.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>GrapeTree: visualization of core genomic relationships among 100,000 bacterial pathogens</article-title>. <source>Genome Res.</source> <volume>28</volume>, <fpage>1395</fpage>&#x2013;<lpage>1404</lpage>. doi: <pub-id pub-id-type="doi">10.1101/GR.232397.117</pub-id>, PMID: <pub-id pub-id-type="pmid">30049790</pub-id></citation></ref>
</ref-list>
</back>
</article>