<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2018.01294</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Whole-Cell MALDI-TOF MS Versus 16S rRNA Gene Analysis for Identification and Dereplication of Recurrent Bacterial Isolates</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Strejcek</surname> <given-names>Michal</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/199207/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Smrhova</surname> <given-names>Tereza</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Junkova</surname> <given-names>Petra</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/522579/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Uhlik</surname> <given-names>Ondrej</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/98869/overview"/>
</contrib>
</contrib-group>
<aff><institution>Department of Biochemistry and Microbiology, Faculty of Food and Biochemical Technology, University of Chemistry and Technology</institution>, <addr-line>Prague</addr-line>, <country>Czechia</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Parag Vaishampayan, Jet Propulsion Laboratory, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Isao Yumoto, National Institute of Advanced Industrial Science and Technology (AIST), Japan; Yonghui Zeng, Aarhus University, Denmark</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Michal Strejcek <email>michal.strejcek&#x00040;vscht.cz</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology</p></fn></author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>06</month>
<year>2018</year>
</pub-date>
<pub-date pub-type="collection">
<year>2018</year>
</pub-date>
<volume>9</volume>
<elocation-id>1294</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>09</month>
<year>2017</year>
</date>
<date date-type="accepted">
<day>28</day>
<month>05</month>
<year>2018</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2018 Strejcek, Smrhova, Junkova and Uhlik.</copyright-statement>
<copyright-year>2018</copyright-year>
<copyright-holder>Strejcek, Smrhova, Junkova and Uhlik</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Many ecological experiments are based on the extraction and downstream analyses of microorganisms from different environmental samples. Due to its high throughput, cost-effectiveness and rapid performance, Matrix Assisted Laser Desorption/Ionization Mass Spectrometry with Time-of-Flight detector (MALDI-TOF MS), which has been proposed as a promising tool for bacterial identification and classification, could be advantageously used for dereplication of recurrent bacterial isolates. In this study, we compared whole-cell MALDI-TOF MS-based analyses of 49 bacterial cultures to two well-established bacterial identification and classification methods based on nearly complete 16S rRNA gene sequence analyses: a phylotype-based approach, using a closest type strain assignment, and a sequence similarity-based approach involving a 98.65% sequence similarity threshold, which has been found to best delineate bacterial species. Culture classification using reference-based MALDI-TOF MS was comparable to that yielded by phylotype assignment up to the genus level. At the species level, agreement between 16S rRNA gene analysis and MALDI-TOF MS was found to be limited, potentially indicating that spectral reference databases need to be improved. We also evaluated the mass spectral similarity technique for species-level delineation, which can be used independently of reference databases. We established optimal mass spectral similarity thresholds which group MALDI-TOF mass spectra of common environmental isolates analogously to phylotype- and sequence similarity-based approaches. When using a mass spectrum similarity approach, we recommend a mass range of 4&#x02013;10 kDa for analysis, which is populated with stable mass signals and contains the majority of phylotype-determining peaks. 
We show that a cosine similarity (CS) threshold of 0.79 differentiates mass spectra analogously to the 98.65% species-level delineation sequence similarity threshold, with corresponding precision and recall values of 0.70 and 0.73, respectively. When matched to species-level phylotype assignment, an optimal CS threshold of 0.92 was calculated, with associated precision and recall values of 0.83 and 0.64, respectively. Overall, our research indicates that a similarity-based MALDI-TOF MS approach can be routinely used for efficient dereplication of isolates for downstream analyses, with minimal loss of unique organisms. In addition, MALDI-TOF MS analysis has further potential for improvement, unlike 16S rRNA gene analysis, whose methodological limits have reached a plateau.</p></abstract>
<kwd-group>
<kwd>bacterial isolation</kwd>
<kwd>bacterial identification</kwd>
<kwd>16S rRNA gene</kwd>
<kwd>MALDI-TOF mass spectrometry (MS)</kwd>
<kwd>MALDI BioTyper</kwd>
<kwd>dereplication of isolates</kwd>
<kwd>species delineation</kwd>
</kwd-group>
<contract-num rid="cn001">17-00227S</contract-num>
<contract-sponsor id="cn001">Grantová Agentura České Republiky<named-content content-type="fundref-id">10.13039/501100001824</named-content></contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="2"/>
<equation-count count="0"/>
<ref-count count="78"/>
<page-count count="13"/>
<word-count count="9958"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>Many microbial ecological studies rely on the extraction of bacteria from soil, water, and other environmental samples. In such cases, the number of unique organisms among the hundreds or thousands of isolates is usually limited. It is therefore desirable to separate isolates into bins with common characteristics, i.e., dereplicate them, in order to avoid time-consuming, expensive and, in particular, redundant downstream analyses of each isolate. This dereplication of recurrent bacterial isolates can be achieved by analyzing phenotypic, chemotaxonomic, genotypic, and phylogenetic data (Schleifer, <xref ref-type="bibr" rid="B53">2009</xref>); exemplary techniques include fatty acid methyl ester (FAME) profiling of cell membrane lipids and genomic fingerprinting based on repetitive sequence-based polymerase chain reaction, such as (GTG)<sub>5</sub>-PCR (Versalovic, <xref ref-type="bibr" rid="B71">1994</xref>; Vancanneyt et al., <xref ref-type="bibr" rid="B70">1996</xref>; De Clerck and De Vos, <xref ref-type="bibr" rid="B17">2002</xref>; Coorevits et al., <xref ref-type="bibr" rid="B14">2008</xref>). Small-subunit ribosomal RNA (specifically 16S rRNA) gene sequencing is then often employed to identify and classify representative bacterial isolates (Janda and Abbott, <xref ref-type="bibr" rid="B28">2007</xref>; Kim et al., <xref ref-type="bibr" rid="B30">2012</xref>). The key to the success of 16S rRNA gene sequencing is its applicability across the entire bacterial and archaeal domains (Woese, <xref ref-type="bibr" rid="B76">1987</xref>; Munoz et al., <xref ref-type="bibr" rid="B43">2011</xref>). The identification process involves assigning the sequence to a taxonomic bin (phylotype) based on known references, either by classification (Wang et al., <xref ref-type="bibr" rid="B73">2007</xref>) or by identification of the closest type strain (Kim et al., <xref ref-type="bibr" rid="B30">2012</xref>). 
Sequence similarity of the 16S rRNA gene can also be used as a proxy for bacterial species; a 98.65% sequence similarity threshold was calculated to best match bacterial species demarcation based on the analysis of 6,787 genomes (Kim et al., <xref ref-type="bibr" rid="B29">2014</xref>).</p>
<p>Ever since its first applications for identification purposes (Claydon et al., <xref ref-type="bibr" rid="B12">1996</xref>; Holland et al., <xref ref-type="bibr" rid="B26">1996</xref>), MALDI-TOF MS has been proposed as a promising alternative for the dereplication of recurrent bacterial isolates (Dieckmann et al., <xref ref-type="bibr" rid="B18">2005</xref>; Ghyselinck et al., <xref ref-type="bibr" rid="B24">2011</xref>; Spitaels et al., <xref ref-type="bibr" rid="B60">2016</xref>) and has been used as a cost- and time-effective alternative to 16S rRNA gene sequencing (Mellmann et al., <xref ref-type="bibr" rid="B41">2008</xref>; Uhl&#x000ED;k et al., <xref ref-type="bibr" rid="B68">2011</xref>; Koubek et al., <xref ref-type="bibr" rid="B32">2012</xref>; Wieser et al., <xref ref-type="bibr" rid="B75">2012</xref>; Seng et al., <xref ref-type="bibr" rid="B58">2013</xref>). MALDI-TOF MS-based identification of microorganisms involves the generation of mass spectra from whole-cell material or extracted intracellular content which are then matched to known database references (Fenselau and Demirev, <xref ref-type="bibr" rid="B19">2001</xref>; Lay, <xref ref-type="bibr" rid="B37">2001</xref>). Accurate identification depends on two factors: adequate spectrum quality and close database reference matches. Several successful commercial platforms, such as Biotyper (Bruker Daltonics) and Vitek MS (BioM&#x000E9;rieux), are mainly used to identify clinically important species. However, these systems, whose reference databases cover only a small fraction of the vast range of microbial diversity, often fail to function when applied to environmental isolates. Mass spectra generated from whole-cell and cell extract measurements are abundant in ribosomal protein peaks (Ryzhov and Fenselau, <xref ref-type="bibr" rid="B50">2001</xref>; Suarez et al., <xref ref-type="bibr" rid="B64">2013</xref>). 
Like ribosomal RNA, ribosomal proteins are universally conserved in both prokaryotes and eukaryotes and can be used to reconstruct phylogeny (Yutin et al., <xref ref-type="bibr" rid="B78">2012</xref>).</p>
<p>In this study, we aimed to discuss the current state of reference-based MS classification. In particular, we established mass spectral similarity thresholds which mimic 16S rRNA gene analyses used for species-level delineation based on (i) assigning the closest type strain (herein referred to as the phylotype-based approach); and (ii) the 98.65% sequence similarity threshold (herein referred to as the sequence similarity-based approach).</p>
</sec>
<sec sec-type="materials and methods" id="s2">
<title>Materials and methods</title>
<sec>
<title>Culture collection</title>
<p>The bacterial cultures used in this study consisted of 49 isolates typically found in soils and sediments (Table <xref ref-type="table" rid="T1">1</xref>). Twelve strains were previously used as a mock community for error evaluation of high-throughput 16S rRNA gene sequencing analysis (Fraraccio et al., <xref ref-type="bibr" rid="B21">2017</xref>). The other 37 cultures were environmental isolates obtained previously by microbial extraction from soil and sediment in the authors&#x00027; laboratory. The collection represented three major bacterial phyla (Proteobacteria, Actinobacteria and Firmicutes), spanning five classes and 22 genera. All cultures were grown on Plate Count Agar (PCA, Oxoid, UK) at 28&#x000B0;C for 24 h.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Collection of isolates used in this study.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Culture designation</bold></th>
<th valign="top" align="left"><bold>16S rRNA gene analysis: closest type strain (similarity %)</bold></th>
<th valign="top" align="left"><bold>MALDI BioTyper&#x02122; identification (score)</bold></th>
<th valign="top" align="left"><bold>Origin</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Rho1</td>
<td valign="top" align="left"><italic>Rhodococcus erythropolis</italic> NBRC 15567<sup>T</sup> (99.85)</td>
<td valign="top" align="left"><italic>Rhodococcus erythropolis</italic> (2.39)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Rho2</td>
<td valign="top" align="left"><italic><bold>Rhodococcus jostii</bold></italic> <bold>RHA1</bold><break/> <italic>Rhodococcus jostii</italic> DSM 44719<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Rhodococcus imtechensis</italic> (2.41)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Rho3</td>
<td valign="top" align="left"><italic>Rhodococcus pedocola</italic> UC12<sup>T</sup> (100)</td>
<td valign="top" align="left"><italic>Rhodococcus</italic> sp. (1.71)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Art1</td>
<td valign="top" align="left"><italic>Arthrobacter oryzae</italic> NRRL B-24478<sup>T</sup> (99.42)<break/> <italic>Arthrobacter humicola</italic> KV-653<sup>T</sup> (99.42)</td>
<td valign="top" align="left"><italic>Arthrobacter</italic> sp. (1.91)</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Art2</td>
<td valign="top" align="left"><italic>Arthrobacter pascens</italic> DSM 20545<sup>T</sup> (98.76)</td>
<td valign="top" align="left"><italic>Arthrobacter oxydans</italic> (2.08)</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Art3</td>
<td valign="top" align="left" rowspan="2" style="border: 1px solid black"><italic>Arthrobacter halophytocola</italic> KLBMP 5180<sup>T</sup> (100)<break/> <italic>Glutamicibacter arilaitensis</italic> Re117<sup>T</sup> (100)</td>
<td valign="top" align="left"><italic>Arthrobacter</italic> sp. (1.87)</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Glu1</td>
<td valign="top" align="left"><italic>Arthrobacter arilaitensis</italic> (2.56) [<italic>Glutamicibacter arilaitensis</italic>]</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Mic1</td>
<td valign="top" align="left"><italic><bold>Micrococcus luteus</bold></italic> <bold>NCTC 2665</bold><sup>T</sup></td>
<td valign="top" align="left"><italic>Micrococcus luteus</italic> (2.53)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Paa1<break/> Paa2<break/> Paa3</td>
<td valign="top" align="left" style="border: 1px solid black"><italic>Paenarthrobacter ilicis</italic> DSM 20138<sup>T</sup> (99.49)<break/> <italic>Paenarthrobacter nitroguajacolicus</italic> G2-1<sup>T</sup> (99.93)<break/> <italic>Paenarthrobacter nitroguajacolicus</italic> G2-1<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Arthrobacter ilicis</italic> (2.62) [<italic>Paenarthrobacter ilicis</italic>]<break/> <italic>Arthrobacter aurescens</italic> (2.41) [<italic>Paenarthrobacter aurescens</italic>]<break/> <italic>Arthrobacter aurescens</italic> (2.45) [<italic>Paenarthrobacter aurescens</italic>]</td>
<td valign="top" align="left">Rhizosphere 1<break/> Rhizosphere 1<break/> Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Psa1<break/> Psa2</td>
<td valign="top" align="left" style="border: 1px solid black"><italic><bold>Pseudarthrobacter chlorophenolicus</bold></italic> <bold>A6</bold><sup>T</sup><break/> <italic>Pseudarthrobacter equi</italic> IMMIB L-1606<sup>T</sup> (99.93) <italic>Pseudarthrobacter oxydans</italic> KCTC 3383<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Arthrobacter chlorophenolicus</italic> (2.42) [<italic>Pseudarthrobacter chlorophenolicus</italic>]<break/> <italic>Arthrobacter chlorophenolicus</italic> (2.12) [<italic>Pseudarthrobacter chlorophenolicus</italic>]</td>
<td valign="top" align="left">Strain collection<break/> Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Psa3<break/> Psa4</td>
<td valign="top" align="left" style="border: 1px solid black"><italic>Pseudarthrobacter oxydans</italic> KCTC 3383<sup>T</sup> (100)<break/> <italic>Pseudarthrobacter siccitolerans</italic> 4J27<sup>T</sup> (99.49)</td>
<td valign="top" align="left"><italic>Arthrobacter oxydans</italic> (2.47) [<italic>Pseudarthrobacter oxydans</italic>]<break/> <italic>Arthrobacter polychromogenes</italic> (2.38) [<italic>Pseudarthrobacter polychromogenes</italic>]</td>
<td valign="top" align="left">Rhizosphere 1<break/> Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Oer1</td>
<td valign="top" align="left"><italic>Oerskovia turbata</italic> NRRL B-8019<sup>T</sup> (99.85)</td>
<td valign="top" align="left"><italic>Oerskovia</italic> sp. (1.86)</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Bac1</td>
<td valign="top" align="left"><italic>Bacillus paralicheniformis</italic> KJ-16<sup>T</sup> (99.92)</td>
<td valign="top" align="left"><italic>Bacillus licheniformis</italic> (2.33)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bac2</td>
<td valign="top" align="left"><italic>Bacillus rhizosphaerae</italic> SC-N012<sup>T</sup> (99.5)<break/> <italic>Bacillus clausii</italic> DSM 8716<sup>T</sup> (99.5)</td>
<td valign="top" align="left"><italic>Bacillus</italic> sp. (1.96)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bac3</td>
<td valign="top" align="left"><italic>Bacillus subtilis</italic> subsp. <italic>inaquosorum</italic> KCTC 13429<sup>T</sup> (100)<break/> <italic>Bacillus aryabhattai</italic> B8W22<sup>T</sup> (100)</td>
<td valign="top" align="left"><italic>Bacillus megaterium</italic> (2.25)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bac4</td>
<td valign="top" align="left"><italic>Bacillus tequilensis</italic> KCTC 13622<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Bacillus subtilis</italic> (2.22)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bac5<break/> Bac6</td>
<td valign="top" align="left" style="border: 1px solid black"><italic>Bacillus safensis</italic> FO-36b<sup>T</sup> (100)<break/> <italic><bold>Bacillus pumilus</bold></italic> <bold>SAFR-032</bold><break/> <italic>Bacillus zhangzhouensis</italic> DW5-4<sup>T</sup> (99.79)<break/> <italic>Bacillus pumilus</italic> ATCC 7061<sup>T</sup> (99.79)</td>
<td valign="top" align="left"><italic>Bacillus pumilus</italic> (2.08)<break/><break/><italic>Bacillus pumilus</italic> (2.31)</td>
<td valign="top" align="left">Compost soil<break/><break/> Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Bre1</td>
<td valign="top" align="left"><italic>Brevibacterium frigoritolerans</italic> DSM 8801<sup>T</sup> (100) [<italic>Bacillus</italic> sp.]</td>
<td valign="top" align="left"><italic>Bacillus simplex</italic> (2.2)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bre2</td>
<td valign="top" align="left"><italic>Brevibacillus borstelensis</italic> NRRL NRS-818<sup>T</sup> (99.85)</td>
<td valign="top" align="left"><italic>Brevibacillus borstelensis</italic> (2.38)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bre3</td>
<td valign="top" align="left"><italic>Brevibacillus panacihumi</italic> DCY35<sup>T</sup> (100)</td>
<td valign="top" align="left">NA (&#x0003C; 1.7)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Pab1</td>
<td valign="top" align="left"><italic>Paenibacillus lactis</italic> MB 1871<sup>T</sup> (99.86)</td>
<td valign="top" align="left"><italic>Paenibacillus lactis</italic> (2.4)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Lys1</td>
<td valign="top" align="left"><italic>Lysinibacillus halotolerans</italic> LAM612<sup>T</sup> (97.86)</td>
<td valign="top" align="left"><italic>Lysinibacillus</italic> sp. (1.73)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Lys2</td>
<td valign="top" align="left"><italic>Lysinibacillus halotolerans</italic> LAM612<sup>T</sup> (99.36)</td>
<td valign="top" align="left"><italic>Lysinibacillus</italic> sp. (1.72)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Lys3<break/> Lys4</td>
<td valign="top" align="left" style="border: 1px solid black"><italic>Lysinibacillus xylanilyticus</italic> DSM 23493<sup>T</sup> (99.57)<break/> <italic>Lysinibacillus xylanilyticus</italic> DSM 23493<sup>T</sup> (99.22)<break/> <italic>Lysinibacillus pakistanensis</italic> JCM 18776<sup>T</sup> (99.22)</td>
<td valign="top" align="left"><italic>Lysinibacillus</italic> sp. (1.75)<break/> <italic>Lysinibacillus fusiformis</italic> (2.01)</td>
<td valign="top" align="left">Compost soil<break/> Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Sol1</td>
<td valign="top" align="left"><italic>Solibacillus isronensis</italic> B3W22<sup>T</sup> (100)</td>
<td valign="top" align="left"><italic>Solibacillus silvestris</italic> (2.4)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Spo1</td>
<td valign="top" align="left"><italic>Sporosarcina koreensis</italic> F73<sup>T</sup> (99.71)</td>
<td valign="top" align="left">NA (&#x0003C; 1.7)</td>
<td valign="top" align="left">Compost soil</td>
</tr>
<tr>
<td valign="top" align="left">Bos1</td>
<td valign="top" align="left"><italic>Bosea robiniae</italic> DSM 26672<sup>T</sup> (99.48)</td>
<td valign="top" align="left">NA (&#x0003C; 1.7)</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Met1</td>
<td valign="top" align="left"><italic><bold>Methylobacterium radiotolerans</bold></italic> <bold>JCM 2831</bold><sup>T</sup></td>
<td valign="top" align="left"><italic>Methylobacterium radiotolerans</italic> (2.22)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Rhi1</td>
<td valign="top" align="left"><italic><bold>Agrobacterium fabrum</bold></italic> <bold>strain C58</bold><break/> <italic>Rhizobium pusense</italic> LMG 25623<sup>T</sup> (99.33)</td>
<td valign="top" align="left"><italic>Rhizobium radiobacter</italic> (2.22)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Ach1</td>
<td valign="top" align="left"><italic><bold>Achromobacter xylosoxidans</bold></italic> <bold>A8</bold><break/> <italic>Achromobacter marplatensis</italic> B2<sup>T</sup> (99.85)</td>
<td valign="top" align="left"><italic>Achromobacter xylosoxidans</italic> (2.16)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Pan1</td>
<td valign="top" align="left"><italic><bold>Pandoraea pnomenusa</bold></italic> <bold>B-356</bold><break/> <italic>Pandoraea pnomenusa</italic> DSM 16536<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Pandoraea pnomenusa</italic> (2.4)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Par1</td>
<td valign="top" align="left"><italic><bold>Paraburkholderia xenovorans</bold></italic> <bold>LB400</bold><sup>T</sup></td>
<td valign="top" align="left"><italic>Burkholderia xenovorans</italic> (2.49) [<italic>Paraburkholderia xenovorans</italic>]</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Cup1</td>
<td valign="top" align="left"><italic><bold>Cupriavidus necator</bold></italic> <bold>H850</bold><break/> <italic>Cupriavidus necator</italic> N-1<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Cupriavidus necator</italic> (2.36)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Psm1<break/> Psm2</td>
<td valign="top" align="left" style="border: 1px solid black"><italic><bold>Pseudomonas alcaliphila</bold></italic> <bold>JCM 10630</bold><sup>T</sup><break/> <italic><bold>Pseudomonas alcaliphila</bold></italic> <bold>JAB1</bold><break/> <italic>Pseudomonas chengduensis</italic> MBR<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Pseudomonas alcaliphila</italic> (2.41)<break/> <italic>Pseudomonas alcaliphila</italic> (2.24)</td>
<td valign="top" align="left">Strain collection<break/> Strain collection</td>
</tr>
<tr>
<td valign="top" align="left">Psm3</td>
<td valign="top" align="left"><italic>Pseudomonas anguilliseptica</italic> NCIMB 1949<sup>T</sup> (99.33)</td>
<td valign="top" align="left"><italic>Pseudomonas anguilliseptica</italic> (2.02)</td>
<td valign="top" align="left">Rhizosphere 1</td>
</tr>
<tr>
<td valign="top" align="left">Psm4<break/> Psm5</td>
<td valign="top" align="left" style="border: 1px solid black"><italic>Pseudomonas extremaustralis</italic> 14-3<sup>T</sup> (99.86)<break/> <italic>Pseudomonas gessardii</italic> DSM 17152<sup>T</sup> (99.93)</td>
<td valign="top" align="left"><italic>Pseudomonas veronii</italic> (2.37)<break/> <italic>Pseudomonas gessardii</italic> (2.25)</td>
<td valign="top" align="left">Sediment 1<break/> Sediment 1</td>
</tr>
<tr>
<td valign="top" align="left">Psm6<break/> Psm7<break/> Psm8<break/> Psm9</td>
<td valign="top" align="left" style="border: 1px solid black"><italic>Pseudomonas hunanensis</italic> LV<sup>T</sup> (99.64)<break/> <italic><bold>Pseudomonas putida</bold></italic> <bold>JB</bold> <italic>Pseudomonas hunanensis</italic> LV<sup>T</sup> (99.93)<break/> <italic>Pseudomonas hunanensis</italic> LV<sup>T</sup> (99.93)<break/> <italic>Pseudomonas taiwanensis</italic> BCRC 17751<sup>T</sup> (98.92)</td>
<td valign="top" align="left"><italic>Pseudomonas</italic> sp. (2.31)<break/> <italic>Pseudomonas putida</italic> (2.42)<break/> <italic>Pseudomonas putida</italic> (2.45)<break/> <italic>Pseudomonas</italic> sp. (1.79)</td>
<td valign="top" align="left">Contaminated soil<break/> Rhizosphere 2<break/> Sediment 2<break/> Sediment 1</td>
</tr>
<tr>
<td/>
<td valign="top" align="left"><italic>Pseudomonas plecoglossicida</italic> NBRC 103162<sup>T</sup> (98.92)</td>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">Psm10</td>
<td valign="top" align="left"><italic><bold>Pseudomonas stutzeri</bold></italic> <bold>JM300</bold><break/> <italic>Pseudomonas songnenensis</italic> NEAU-ST5-5<sup>T</sup> (99.06)</td>
<td valign="top" align="left">NA (&#x0003C; 1.7)</td>
<td valign="top" align="left">Strain collection</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Identification results based on 16S rRNA gene (EzBioCloud Identify Service) and MALDI BioTyper (v3.1 equipped with MBT 6903, covering 2,226 unique bacterial species) analyses. Entries in bold are known bacterial strains. Cultures in rectangles were grouped by 98.65% similarity of the 16S rRNA gene using the UPGMA clustering method. Entries in square brackets are updated bacterial nomenclature entries (see Materials and Methods section). Origin: compost soil&#x02014;compost soil for gardening purposes, Central Bohemia, Czech Republic; rhizosphere 1&#x02014;PCB-contaminated soil with horseradish vegetation, South Bohemia, Czech Republic (Uhl&#x000ED;k et al., <xref ref-type="bibr" rid="B68">2011</xref>); rhizosphere 2&#x02014;PCB-contaminated soil with nightshade vegetation, Northern Bohemia, Czech Republic (Kurzawov&#x000E1; et al., <xref ref-type="bibr" rid="B34">2012</xref>); contaminated soil&#x02014;PCB-contaminated soil, Czech Republic (Nov&#x000E1;kov&#x000E1; et al., <xref ref-type="bibr" rid="B46">2002</xref>); sediment 1&#x02014;PAH-contaminated sediment, Romania (Wald et al., <xref ref-type="bibr" rid="B72">2015</xref>); sediment 2&#x02014;PCB-contaminated sediment, Strazsky kanal, Slovakia (Koubek et al., <xref ref-type="bibr" rid="B32">2012</xref>). All sequences of environmental isolates were deposited in the NCBI Nucleotide database under PopSet number 1315444717. 
Accession numbers for the strains obtained from microbial collections are as follows: Achromobacter xylosoxidans A8 (NC_014640); Agrobacterium fabrum strain C58 (NC_003062); Bacillus pumilus SAFR-032 (NC_009848); Cupriavidus necator H850 (MG708169); Methylobacterium radiotolerans JCM 2831<sup>T</sup> (NC_010505); Micrococcus luteus NCTC 2665<sup>T</sup> (NC_012803); Pandoraea pnomenusa B-356 (EF596910); Paraburkholderia xenovorans LB400<sup>T</sup> (NC_007951); Pseudarthrobacter chlorophenolicus A6<sup>T</sup> (NC_011886); Pseudomonas alcaliphila JAB1 (NZ_CP016162); Pseudomonas alcaliphila JCM 10630<sup>T</sup> (NR_024734); Pseudomonas putida JB (NZ_CP016212); Pseudomonas stutzeri JM300 (MG708165) and Rhodococcus jostii RHA1 (NC_008268)</italic>.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>DNA isolation and 16S rRNA gene PCR amplification</title>
<p>Genomic DNA was isolated from pure cultures using thermal lysis. Briefly, an entire loop of cell material was resuspended in molecular grade water (Sigma-Aldrich, USA) and incubated at 99&#x000B0;C for 15 min. The lysates were pelleted, and the supernatant was used as the DNA template. The PCR mixture, with a total volume of 15 &#x003BC;L, was prepared using the KAPA HiFi HotStart ReadyMix kit (Kapa Biosystems, USA) and 16S rRNA gene primers 27fM, 5&#x02032;-AGAGTTTGATCMTGGCTCAG-3&#x02032; and 1492rY, 5&#x02032;-GYTACCTTGTTACGACTT-3&#x02032; (Lane, <xref ref-type="bibr" rid="B35">1991</xref>). The PCR thermal profile was set to 95&#x000B0;C for 5 min, followed by 25 cycles of 98&#x000B0;C for 20 s, 56&#x000B0;C for 15 s, and 72&#x000B0;C for 45 s, and concluded with a final elongation step at 72&#x000B0;C for 5 min. After the PCR products were evaluated by 1% agarose gel electrophoresis, 3&#x02013;6 additional cycles of reconditioning PCR (Thompson et al., <xref ref-type="bibr" rid="B66">2002</xref>) were performed with 5 &#x003BC;L of PCR product as template DNA in a final volume of 50 &#x003BC;L. Samples were purified with the Genomic DNA Clean &#x00026; Concentrator&#x02122;-10 Kit (Zymo Research, USA) following the manufacturer&#x00027;s instructions. Sanger sequencing was performed bidirectionally using both forward and reverse primers at GATC BIOTECH, Konstanz, Germany. Sanger sequencing chromatograms were inspected manually with the aid of MEGA7 software (Kumar et al., <xref ref-type="bibr" rid="B33">2016</xref>) and converted into sequences, and both reads were then merged into a nearly full-length sequence. All sequences were trimmed to the corresponding <italic>Escherichia coli</italic> 16S rRNA gene positions 57 to 1449 and were deposited in the NCBI nucleotide database under PopSet number 1315444717.</p>
</sec>
<sec>
<title>16S rRNA gene sequence analysis</title>
<p>Nearly full-length 16S rRNA gene sequences were uploaded to EzBioCloud (Yoon et al., <xref ref-type="bibr" rid="B77">2017</xref>) and classified using the Identify service (Version 2017.05). The closest type strain match was used for potential species identification (Kim et al., <xref ref-type="bibr" rid="B29">2014</xref>). Sequences sharing the assigned closest type strain are herein designated as species-level phylotypes and are, for simplicity, referred to as phylotypes throughout this study. Where multiple type strains showed the same percent similarity to the culture tested, all of them were reported (Table <xref ref-type="table" rid="T1">1</xref>).</p>
<p>As a complementary approach to grouping closely related bacterial cultures without reliance on reference databases, similarity-based clustering was employed. Pairwise similarities of 16S rRNA gene sequences were obtained by creating global pairwise alignments (Needleman and Wunsch, <xref ref-type="bibr" rid="B45">1970</xref>) and calculating their percent sequence identity using Bioconductor packages for R (Huber et al., <xref ref-type="bibr" rid="B27">2015</xref>). In accordance with the approach of Kim et al. (<xref ref-type="bibr" rid="B29">2014</xref>), internal gap positions were not included in the similarity calculations. Operational taxonomic units (OTUs) were constructed using UPGMA cluster analysis with a cutoff of 98.65% sequence similarity, which was previously reported as the closest proxy of species (Kim et al., <xref ref-type="bibr" rid="B29">2014</xref>); these units are hereafter labeled OTUs<sub>[98.65%]</sub>.</p>
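<p>The identity calculation described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names are hypothetical, the input is assumed to be an already aligned pair of sequences, and every column containing a gap character is skipped, which is one possible reading of excluding gap positions.</p>
<preformat><![CDATA[
```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences, skipping gapped columns.

    Assumption: the two strings come from a global pairwise alignment and
    therefore have equal length; any column with a gap is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == gap or base_b == gap:
            continue  # gap positions do not enter the calculation
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def meets_species_cutoff(identity_percent, cutoff=98.65):
    """True if a pair reaches the 98.65% similarity cutoff used in the text."""
    return identity_percent >= cutoff
```
]]></preformat>
<p>In this sketch a pair such as ACGT-A and ACGTTA is compared over five ungapped columns only; whether terminal gaps should be treated differently from internal ones is left open here.</p>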
</sec>
<sec>
<title>MALDI-TOF MS sample preparation and spectra acquisition</title>
<p>Prior to MALDI-TOF MS measurement, bacterial isolates were freshly inoculated on PCA (Oxoid, UK) and cultivated for 24 h at 28&#x000B0;C. The direct transfer protocol (commonly referred to as whole-cell or intact-cell measurement) was followed to obtain mass spectra. Briefly, &#x0007E;0.1 mg of cell material was directly transferred from a bacterial colony (if possible) or smear of colonies to a MALDI target spot. After drying at laboratory temperature, sample spots were overlaid with 1 &#x003BC;L of matrix solution (10 mg/mL &#x003B1;-cyano-4-hydroxycinnamic acid in 50% acetonitrile and 2.5% trifluoroacetic acid). To determine the reproducibility of mass spectra generation, all cultures were cultivated independently four times (biological replicates); each measurement was carried out in triplicate (technical replicates). MS analysis was performed on an Autoflex MALDI-TOF mass spectrometer (Bruker Daltonics, Germany) using Flex Control 3.4 software (Bruker Daltonics, Germany). Calibration was carried out with the use of the Bacterial Test Standard (Bruker Daltonics, Germany).</p>
<p>All MS spectra were measured automatically using Flex Control software according to the standard measurement method for microbial identification. Specifically, our set-up values in linear positive mode were as follows: ion source 1 voltage, 20 kV; ion source 2 voltage, 19 kV; lens voltage, 6.5 kV; mass range, 2&#x02013;20 kDa; the final spectrum was the sum of 10 single spectra, each obtained by 200 laser shots on random target spot positions. Because MALDI-TOF MS predominantly generates and detects singly charged (&#x0002B;1) ions, Da is used as the unit of <italic>m</italic>/<italic>z</italic> throughout the study.</p>
</sec>
<sec>
<title>Bruker BioTyper bacterial classification and identification</title>
<p>For bacterial classification using BioTyper 3.1 software (Bruker Daltonics, Germany) equipped with the MBT 6903 MSP Library (released in April 2016), the MALDI Biotyper Preprocessing Standard Method and the MALDI Biotyper MSP Identification Standard Method adjusted by the manufacturer (Bruker Daltonics, Germany) were used. All identifications were reported with the following score values: &#x0003C;1.7 was interpreted as an unreliable identification; 1.7&#x02013;2.0 as a probable genus identification; 2.0&#x02013;2.3 as a secure genus identification and probable species identification; and &#x0003E;2.3 was regarded as a highly probable species identification. Only the highest score value of all mass spectra belonging to individual cultures (biological and technical replicates) was recorded. Mismatched identifications between MALDI BioTyper and 16S rRNA gene analyses, which could be resolved by recent nomenclature changes in the EzBioCloud database, as well as the special case of culture Bre1, were not regarded as misidentifications. Nomenclature changes included genera <italic>Arthrobacter</italic> (Busse, <xref ref-type="bibr" rid="B10">2016</xref>), <italic>Burkholderia</italic> (Sawana et al., <xref ref-type="bibr" rid="B52">2014</xref>), and <italic>Agrobacterium</italic> (Lassalle et al., <xref ref-type="bibr" rid="B36">2011</xref>). Culture Bre1, which showed 100% 16S rRNA gene sequence similarity to type strain <italic>Brevibacterium frigoritolerans</italic> DSM 8801<sup>T</sup> using the EzBioCloud Identify service, was, however, identified as <italic>Bacillus simplex</italic> using the MALDI BioTyper method (score of 2.2). Further inspection carried out in-house by the DSMZ culture collection, based on multiple taxonomic tests including DNA-DNA hybridization experiments, revealed that the strain DSM 8801<sup>T</sup> is actually a member of <italic>Bacillus</italic> sp. (personal communication).</p>
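<p>The score interpretation rules above can be sketched as a small Python helper. This is an illustration only, not part of the BioTyper software; the function names are hypothetical, and the treatment of scores lying exactly on 2.0 or 2.3 is an assumption, since the text gives the ranges without stating which bound is inclusive.</p>
<preformat><![CDATA[
```python
def interpret_biotyper_score(score):
    """Map a BioTyper score to the interpretation categories in the text.

    Assumption: 1.7 and 2.0 open their categories, and exactly 2.3 still
    counts as "secure genus, probable species".
    """
    if score < 1.7:
        return "unreliable identification"
    if score < 2.0:
        return "probable genus identification"
    if score <= 2.3:
        return "secure genus identification, probable species identification"
    return "highly probable species identification"


def best_score(replicate_scores):
    """Only the highest score over all replicates of a culture is recorded."""
    return max(replicate_scores)
```
]]></preformat>
<p>For example, the score of 2.2 reported for culture Bre1 falls into the secure genus and probable species category under these assumptions.</p>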
</sec>
<sec>
<title>Mass spectra preprocessing</title>
<p>All MS data were processed in the R language (R Core Team, <xref ref-type="bibr" rid="B49">2017</xref>) with the aid of the <italic>MALDIquant</italic> R package (Gibb and Strimmer, <xref ref-type="bibr" rid="B25">2012</xref>). The workflow followed standard spectral data preprocessing procedures adopted from the <italic>MALDIquant</italic> package: (i) square root intensity transformation; (ii) trimming to the 4&#x02013;10 kDa mass range (see Results for details); (iii) Savitzky-Golay intensity smoothing (Savitzky and Golay, <xref ref-type="bibr" rid="B51">1964</xref>) with a half-window size of 20; (iv) baseline correction of spectra by the SNIP algorithm (Morhac, <xref ref-type="bibr" rid="B42">2009</xref>) with 50 iterations; (v) total ion current normalization; (vi) peak detection using the SuperSmoother noise estimation algorithm (Friedman, <xref ref-type="bibr" rid="B22">1984</xref>), with a signal-to-noise ratio of 3 and a half-window size set to 20; and (vii) peak binning with a tolerance of 0.002.</p>
<p>Peak lists of individual spectra were transformed into a feature matrix with mass signal positions marked in columns. Where a spectrum lacked a specific peak, the corresponding intensity value of the preprocessed spectrum was used instead. Spectral pairwise similarities were calculated as cosine similarities (CS; Stein and Scott, <xref ref-type="bibr" rid="B62">1994</xref>). For fast and efficient computing, the <italic>cosine()</italic> function implemented in the <italic>coop</italic> R package (Schmidt, <xref ref-type="bibr" rid="B54">2016</xref>) was used. Where required, distance was calculated as 1 &#x02013; CS.</p>
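<p>As a minimal illustration of the similarity measure, the following Python sketch computes the cosine similarity and the derived distance for two rows of a feature matrix. It is a plain re-statement of the formula, not the <italic>coop</italic> implementation used in the study, and the function names are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two intensity vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of features")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm_x * norm_y)


def cosine_distance(x, y):
    """Distance derived from cosine similarity, as 1 - CS."""
    return 1.0 - cosine_similarity(x, y)
```
]]></preformat>
<p>Two spectra whose intensities differ only by a constant factor give a similarity of 1, whereas spectra with no shared signals give 0.</p>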
<sec>
<title>MALDI-TOF MS reproducibility assessment</title>
<p>Prior to analysis, low-quality spectra were identified by calculating the average cosine similarity (ACS) between each spectrum and its corresponding technical replicates. A cutoff of 0.9, derived from the shape of the distribution of these values, was used to determine technical outliers. Of the 588 mass spectra, three were discarded, leaving 585. The reproducibility of the MS measurements was evaluated by calculating ACS in groups of technical and biological replicates. In addition, the full 2 to 20 kDa mass range was split into 1 kDa intervals, for each of which we calculated: (i) the number of unique mass signals; (ii) the summed signal intensity; and (iii) the mean of the ACS values calculated for all mass spectra belonging to each individual culture (12 spectra; Figure <xref ref-type="fig" rid="F1">1</xref>).</p>
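<p>The outlier screen described above can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the authors' R script: the function names are hypothetical, each spectrum is assumed to be a row of the feature matrix, and a spectrum is flagged when its ACS against the other technical replicates is below the cutoff.</p>
<preformat><![CDATA[
```python
import math


def cosine_similarity(x, y):
    """Cosine similarity of two equally long, non-zero intensity vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def average_cosine_similarity(index, replicates):
    """ACS between one spectrum and the remaining technical replicates."""
    others = [s for i, s in enumerate(replicates) if i != index]
    if not others:
        raise ValueError("at least two replicates are needed")
    target = replicates[index]
    return sum(cosine_similarity(target, s) for s in others) / len(others)


def technical_outliers(replicates, cutoff=0.9):
    """Indices of replicates whose ACS falls below the cutoff (0.9 in the text)."""
    return [
        i for i in range(len(replicates))
        if average_cosine_similarity(i, replicates) < cutoff
    ]
```
]]></preformat>
<p>With only three technical replicates, one strongly deviating spectrum also lowers the ACS of the other two, so this simple screen works best for moderate deviations.</p>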
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Analysis of 1 kDa mass intervals across all 585 mass spectra. <italic>Gray bars</italic>&#x02014;number of detected mass signals per interval; <italic>blue bars</italic>&#x02014;number of mass signals identified by shrinkage discriminant analysis as useful for species prediction based on assigning the closest type strain; <italic>green area</italic>&#x02014;proportional mass signal intensity; <italic>red line</italic>&#x02014;mean value of average cosine similarity between biological and technical replicates of individual cultures (4 &#x000D7; 3 &#x0003D; 12 spectra). All values are normalized by maxima of the respective variable.</p></caption>
<graphic xlink:href="fmicb-09-01294-g0001.tif"/>
</fig>
</sec>
</sec>
<sec>
<title>Identification of phylotype-predicting mass signals</title>
<p>To identify species-level phylotype-predicting mass signals, shrinkage discriminant analysis with correlation-adjusted <italic>t</italic>-score variable selection (Ahdesmaki and Strimmer, <xref ref-type="bibr" rid="B1">2010</xref>) as implemented in the <italic>sda</italic> R package (Ahdesmaki et al., <xref ref-type="bibr" rid="B2">2015</xref>) was carried out. The signals were detected in the whole 2 to 20 kDa mass range. All peaks were ranked on a mutual information entropy basis, and selection was controlled by the false non-discovery rate. All peaks with a local false discovery rate of less than 0.2 were selected as phylotype predictors. Prediction accuracy was estimated using 10 &#x000D7; 10-fold cross validation of all MS data with the aid of the <italic>crossval</italic> R package (Strimmer, <xref ref-type="bibr" rid="B63">2015</xref>) as described in <italic>sda</italic> documentation.</p>
</sec>
<sec>
<title>Optimal cosine similarity threshold for species-like separation based on 16S rRNA gene analysis</title>
<p>Cosine similarity (CS) was chosen as a measure of similarity between mass spectra. Geometrically, it is interpreted as the cosine of the angle between two vectorized mass spectra. It is calculated as a normalized inner product, with CS values ranging between 0 and 1, as mass intensities are never negative.</p>
<p>A dataset containing all MS measurement pairs was constructed, and spectral CS was computed for each pair. If both members of a sample-sample pair were assigned to the same closest type strain, the pair was labeled as <italic>intra</italic>-related; otherwise, it was labeled as <italic>inter</italic>-related. To determine the optimal threshold for mass spectra cosine similarity (<italic>T</italic><sub><italic>CS</italic></sub>), the precision, recall and F<sub>1</sub> scores were calculated for each CS threshold value (0&#x02013;1 with 0.01 steps) and evaluated with 2-fold cross validation as described by Kim et al. (<xref ref-type="bibr" rid="B29">2014</xref>). All sample-sample pairs were tested for species-like relatedness and were designated as: True Positive (<italic>TP</italic>) if CS &#x02265; <italic>T</italic><sub><italic>CS</italic></sub> &#x00026; <italic>intra</italic>; True Negative (<italic>TN</italic>) if CS &#x0003C; <italic>T</italic><sub><italic>CS</italic></sub> &#x00026; <italic>inter</italic>; False Positive (<italic>FP</italic>) if CS &#x02265; <italic>T</italic><sub><italic>CS</italic></sub> &#x00026; <italic>inter</italic>; and False Negative (<italic>FN</italic>) if CS &#x0003C; <italic>T</italic><sub><italic>CS</italic></sub> &#x00026; <italic>intra</italic>. The dataset was then randomly split into two partitions, and the precision [<italic>TP / (TP</italic> &#x0002B; <italic>FP)</italic>], recall [<italic>TP / (TP</italic> &#x0002B; <italic>FN)</italic>] and F<sub>1</sub> scores [<italic>2</italic> &#x000D7; <italic>(precision</italic> &#x000D7; <italic>recall) / (precision</italic> &#x0002B; <italic>recall)</italic>] were calculated for each CS threshold value in relation to each partition. Optimal <italic>T</italic><sub><italic>CS</italic></sub> was selected as the mean of the thresholds with the highest F<sub>1</sub> score from both cross validation training partitions. 
The precision and recall scores of the thresholds selected were calculated on the basis of the corresponding test partition. Similarly, the whole procedure was performed for species delineation using the sequence similarity approach. When sample-sample pairs shared 16S rRNA gene sequence similarity &#x02265;98.65%, the pair was labeled as <italic>intra</italic>-related; otherwise, it was labeled as <italic>inter</italic>-related.</p>
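<p>The threshold search described above can be sketched in Python as follows. This is a simplified illustration and not the authors' R script: the function names, the random seed and the tie-breaking rule (the lowest threshold among equally good ones) are assumptions, and each pair is represented as a tuple of its CS value and a flag that is true for <italic>intra</italic>-related pairs.</p>
<preformat><![CDATA[
```python
import random


def precision_recall_f1(pairs, threshold):
    """Precision, recall and F1 for pairs of (cosine_similarity, is_intra)."""
    tp = fp = fn = 0
    for cs, is_intra in pairs:
        predicted_same = cs >= threshold
        if predicted_same and is_intra:
            tp += 1
        elif predicted_same and not is_intra:
            fp += 1
        elif not predicted_same and is_intra:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def best_threshold(pairs):
    """Threshold in 0..1 (0.01 steps) with the highest F1; lowest one on ties."""
    thresholds = [step / 100 for step in range(101)]
    return max(thresholds, key=lambda t: precision_recall_f1(pairs, t)[2])


def two_fold_threshold(pairs, seed=0):
    """Mean of the best thresholds of two random partitions.

    Also returns the scores of each threshold on the opposite partition.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    part_a, part_b = shuffled[:half], shuffled[half:]
    t_a, t_b = best_threshold(part_a), best_threshold(part_b)
    scores_on_b = precision_recall_f1(part_b, t_a)
    scores_on_a = precision_recall_f1(part_a, t_b)
    return (t_a + t_b) / 2, scores_on_b, scores_on_a
```
]]></preformat>
<p>True negatives do not enter precision, recall or F<sub>1</sub>, which is why the sketch does not count them.</p>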
<p>Operational taxonomic units were constructed using UPGMA cluster analysis on MS data with specified CS threshold and were herein labeled as OTUs<sub>[CS threshold]</sub>.</p>
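<p>A minimal Python sketch of average-linkage (UPGMA) clustering with a fixed cut is given below for illustration. It is a naive implementation written for clarity, not the routine used in the study; the function name is hypothetical, the input is assumed to be a symmetric matrix of distances (1 &#x02013; CS), and clusters are merged as long as their average distance does not exceed the cutoff.</p>
<preformat><![CDATA[
```python
def upgma_clusters(dist, cutoff):
    """Group items by average linkage, merging while distance <= cutoff.

    dist is a symmetric list-of-lists distance matrix; the result is a
    list of clusters, each a list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def average_distance(c1, c2):
        total = sum(dist[i][j] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cutoff:
            break  # the closest remaining clusters are too far apart
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```
]]></preformat>
<p>For a CS threshold of 0.92, the corresponding distance cutoff in this sketch would be 1 &#x02013; 0.92 = 0.08.</p>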
</sec>
<sec>
<title>Bacterial ribosomal protein molecular weights</title>
<p>The UniProtKB protein database (UniProt Consortium, <xref ref-type="bibr" rid="B69">2017</xref>) was searched for &#x0201C;taxonomy:bacteria family:ribosomal&#x0201D; protein entries. In total, 761,208 proteins were found, including entries from both reviewed (Swiss-Prot) and unreviewed (TrEMBL) sources, and their calculated molecular masses were downloaded. No post-translational or other modifications were considered in the mass calculations.</p>
</sec>
<sec>
<title>R data analysis scripts deposition</title>
<p>All scripts used for analyses in R are available at the authors&#x00027; GitHub repository (<ext-link ext-link-type="uri" xlink:href="https://github.com/strejcem/MALDIvs16S/">https://github.com/strejcem/MALDIvs16S/</ext-link>).</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec>
<title>Classification of cultures based on 16S rRNA gene and MALDI-TOF MS reference databases</title>
<p>With the aid of the EzBioCloud Identify service, the culture set was found to consist of 43 phylotypes (Table <xref ref-type="table" rid="T1">1</xref>). Bruker MALDI BioTyper software with a reference database was used to identify and classify the cultures according to their mass spectra (Table <xref ref-type="table" rid="T1">1</xref>). Of the 49 cultures studied, 45 were reliably identified at the probable genus level, with BioTyper scores of &#x0003E;1.7. After taking into account recent taxonomy changes and corrections described in the Materials and Methods section, the MALDI BioTyper and phylotype-based identification methods coincided up to the genus level. With respect to only those cases where MALDI BioTyper identifications reached scores of &#x0003E;2.3 (highly probable species identification; 23 cultures), 12 cultures were assigned to the same species as by the phylotype-based approach. Lowering the score cutoff to 2.0 (secure genus identification and probable species identifications; 36 cultures) resulted in 15 concordant species assignments. With regard to all 49 cultures, both identification methods yielded the same overall genus and species assignments in 92 and 35% of cases, respectively.</p>
</sec>
<sec>
<title>Similarity-based analysis of whole-cell mass spectra: mass range determination</title>
<p>The entire set of mass spectra was transformed into a feature matrix, and a number of descriptive statistics were calculated for each 1 kDa interval in the full 2&#x02013;20 kDa mass window (Figure <xref ref-type="fig" rid="F1">1</xref>). Notably, 94% of the summed signal intensity was in the 2&#x02013;10 kDa range. The mean ACS values showed a similar trend: they were highly consistent (ACS &#x0003E; 0.9) up to 11 kDa and deteriorated rapidly thereafter. Shrinkage discriminant analysis was also performed to identify specific protein signals for species assignment using the phylotype-based approach (Figure <xref ref-type="fig" rid="F1">1</xref>). Out of 1,101 unique protein signals, 150 were found to be adequate for phylotype prediction. Prediction accuracy, calculated by cross validation, was 0.999, meaning that on average fewer than one out of 585 cases was incorrectly predicted. The ratio between phylotype-specific and total mass signals increased markedly in the 4&#x02013;10 kDa range (Figure <xref ref-type="fig" rid="F1">1</xref>).</p>
<p>Although the 10&#x02013;20 kDa mass range was characterized by many mass signals, their summed intensity was 6% of full 2&#x02013;20 kDa range signal intensity. Protein signals in the 2&#x02013;4 kDa mass range accounted for 29% of full 2&#x02013;20 kDa range signal intensity. Overall, the 2&#x02013;4 kDa mass range contained 186 unique mass signals across all spectra, with an average of 12.2 mass signals per spectrum; however, only 17 (9%) unique signals had phylotype-discriminating capacity. By comparison, the mass range of 4&#x02013;10 kDa accounted for 65% of all intensities, with an average of 22.9 mass signals per spectrum. Out of 324 unique peaks localized in this MS range, 127 (39%) were found to have phylotype-discriminating capacity. Analysis of 761,208 bacterial ribosomal proteins downloaded from the UniProtKB protein database also showed that only 123 (0.01%) proteins had a calculated molecular mass of less than 4 kDa (Figure <xref ref-type="fig" rid="F2">2</xref>). In light of these findings, a mass range of 4&#x02013;10 kDa was used for the analyses described below in order to reduce data complexity and signal noise. Further evaluation of ACS per culture using the full 2&#x02013;20 kDa range as opposed to the restricted 4&#x02013;10 kDa range indicated that the mass range restriction had a largely positive impact (Supplementary Figure 1).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Histogram of calculated molecular masses of bacterial ribosomal proteins. Only 123 (0.01%) proteins out of 761,208 had a molecular mass of less than 4 kDa. Molecular masses were calculated and downloaded from the UniProtKB protein database. The histogram is made up of bins of 100 Da, and only proteins with a mass of less than 20 kDa are shown.</p></caption>
<graphic xlink:href="fmicb-09-01294-g0002.tif"/>
</fig>
</sec>
<sec>
<title>Reproducibility of mass spectra</title>
<p>The ACS calculated between technical replicates varied from 0.916 to 0.997, thus indicating a high level of overall mass spectra reproducibility when the same cell material was analyzed. However, the ACS calculated over all 12 spectra belonging to each individual culture revealed significant misalignment between a certain number of biological replicates. ACS values for the biological replicates of all 49 cultures were in a 0.756 to 0.985 range, with eight cultures showing an ACS of less than 0.9. These cultures (mean ACS &#x000B1; std. dev.) included: Bac1 (0.813 &#x000B1; 0.162), Bac2 (0.800 &#x000B1; 0.107), Bac3 (0.814 &#x000B1; 0.143), Bac4 (0.756 &#x000B1; 0.193), Bac5 (0.814 &#x000B1; 0.178), and Lys3 (0.774 &#x000B1; 0.203) belonging to bacterial class Bacilli (6 out of 16); Met1 (0.881 &#x000B1; 0.066) belonging to class Alphaproteobacteria (1 out of 3); and Psm7 (0.853 &#x000B1; 0.138) belonging to class Gammaproteobacteria (1 out of 10). No misalignment of biological replicates between individual cultures was detected with respect to Actinobacteria (0 out of 16) or Betaproteobacteria (0 out of 4). On the whole, Gram-negative cultures were found to be less affected than Gram-positives. No linear dependency was observed between mass spectra CS and 16S rRNA gene sequence similarities (Figure <xref ref-type="fig" rid="F3">3</xref>). All mass spectra pairs with a spectral similarity of over 0.60 coincided in terms of family and deeper taxonomic ranks, while pairwise mass spectra similarity of cultures of the same species-level phylotype ranged from as low as 0.232 to 0.998 (Figure <xref ref-type="fig" rid="F3">3</xref>).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Plot showing the pairwise relationship between the 16S rRNA gene sequence similarity of two cultures and their mass spectra cosine similarity. Horizontal error bars represent standard deviations of mass spectra cosine similarities calculated for all technical and biological replicates (<italic>n</italic> &#x0003D; 3 &#x000D7; 4 &#x0003D; 12). The colors of the data points represent the lowest taxonomic rank that is shared between a pair of microorganisms. The taxonomic classification and species assignment by the closest type strain were carried out using the EzBioCloud Identify service. <italic>Vertical solid line</italic>&#x02014;calculated optimal cosine similarity threshold based on 98.65% 16S rRNA gene similarity; <italic>vertical dashed line</italic>&#x02014;calculated optimal cosine similarity threshold based on assigning the closest type strain.</p></caption>
<graphic xlink:href="fmicb-09-01294-g0003.tif"/>
</fig>
</sec>
<sec>
<title>Optimal CS threshold to delineate species analogously to the phylotype-based approach</title>
<p>Using 2-fold cross validation, the CS threshold calculated on an F<sub>1</sub> score basis was 0.92 and differentiated mass spectra analogously to the phylotype-based approach. The corresponding precision and recall values were 0.83 and 0.64 (Figure <xref ref-type="fig" rid="F4">4</xref>).</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Precision, recall and F<sub>1</sub> score curves for species classification by mass spectra cosine similarity (<italic>x</italic>-axis) as compared to the two commonly used 16S rRNA gene species demarcation analyses: <italic>dashed lines</italic>&#x02014;species separation by 98.65% 16S rRNA gene similarity with an optimal analogous cosine similarity threshold of 0.79; <italic>solid lines</italic>&#x02014;species assignment by the closest type strain (EzBioCloud Identify service) with an optimal analogous cosine similarity threshold of 0.92.</p></caption>
<graphic xlink:href="fmicb-09-01294-g0004.tif"/>
</fig>
<p>Altogether, the 49 different cultures in four biological replicates used in this study represented a mass spectra dataset of 196 biological samples. Using 16S rRNA gene analysis, the collection was found to be composed of 43 unique phylotypes. UPGMA cluster analysis with a CS threshold of 0.92 resulted in the generation of 76 clusters (OTUs<sub>[CS0.92]</sub>). Of these, 32 OTUs<sub>[CS0.92]</sub> were duplicates (redundant clusters) caused by the biological variability of the mass spectra. When redundant clusters were left out, 39 out of 49 cultures were separated analogously by both methods, five cultures were separated into more phylotypes than OTUs<sub>[CS0.92]</sub> and, finally, five cultures were separated into more OTUs<sub>[CS0.92]</sub> than phylotypes (Table <xref ref-type="table" rid="T2">2</xref>).</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Comparison of MALDI-TOF MS and 16S rRNA gene analysis methods for dereplication of recurrent bacterial isolates.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Cosine Similarity threshold</bold></th>
<th valign="top" align="left"><bold>16S rRNA gene analysis</bold></th>
<th valign="top" align="center"><bold>Number of clusters MS/rRNA</bold></th>
<th valign="top" align="center"><bold>Dereplication rate MS/rRNA (% of samples)</bold></th>
<th valign="top" align="center"><bold>Redundant MS clusters (% of clusters)</bold></th>
<th valign="top" align="center" colspan="3" style="border-bottom: thin solid #000000;"><bold>Cultures separated by</bold><xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref>:</th>
</tr>
<tr style="border-bottom: thin solid #000000;">
<th/>
<th/>
<th/>
<th/>
<th/>
<th valign="top" align="left"><bold>Both approaches<sup>a</sup></bold></th>
<th valign="top" align="left"><bold>rRNA<sup>b</sup></bold></th>
<th valign="top" align="left"><bold>MS<sup>c</sup></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">0.79</td>
<td valign="top" align="left">98.65% similarity</td>
<td valign="top" align="center">46/37</td>
<td valign="top" align="center">23%/19%</td>
<td valign="top" align="center">8 (17%)</td>
<td valign="top" align="center">35</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">0.92</td>
<td valign="top" align="left">Closest type strain</td>
<td valign="top" align="center">76/43</td>
<td valign="top" align="center">39%/22%</td>
<td valign="top" align="center">32 (42%)</td>
<td valign="top" align="center">39</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">5</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Bacterial set samples are represented by 4 biological replications of 49 cultures (196 samples in total)</italic>.</p>
<fn id="TN1"><label>&#x0002A;</label><p><italic>Number of cultures that were (a) separated in the same way by both MALDI-TOF MS and 16S rRNA gene analysis, (b) separated into more clusters by 16S rRNA gene analysis and (c) separated into more clusters by MALDI-TOF MS</italic>.</p></fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>Optimal CS threshold to delineate species analogously to the sequence similarity-based approach</title>
<p>The optimal CS threshold corresponding to the 98.65% sequence similarity threshold (Kim et al., <xref ref-type="bibr" rid="B29">2014</xref>) was calculated as 0.79, with precision and recall values of 0.70 and 0.73, respectively (Figure <xref ref-type="fig" rid="F4">4</xref>). UPGMA cluster analysis resulted in the generation of 37 OTUs<sub>[98.65%]</sub> using 16S rRNA gene data and of 46 OTUs<sub>[CS0.79]</sub> using MS data, 8 of which were redundant. Leaving out redundant OTUs<sub>[CS0.79]</sub>, 35 out of 49 cultures were separated in the same way by both methods, 6 cultures were grouped into more OTUs<sub>[98.65%]</sub> than OTUs<sub>[CS0.79]</sub> and 8 cultures were separated into more OTUs<sub>[CS0.79]</sub> than OTUs<sub>[98.65%]</sub> (Table <xref ref-type="table" rid="T2">2</xref>). Supplementary Figure 2 shows a mass spectrum UPGMA dendrogram, with clusters marked for each cutoff. In summary, the MALDI-TOF MS technique, when used to dereplicate bacterial isolates, leads to a reduction in the number of isolates for downstream analyses, with minimal loss of unique organisms.</p>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<sec>
<title>Bacterial identification based on reference databases</title>
<p>MALDI-TOF mass spectrometry is now well-established as a fast and reliable technique in clinical laboratories to identify bacterial species (Janda and Abbott, <xref ref-type="bibr" rid="B28">2007</xref>; Croxatto et al., <xref ref-type="bibr" rid="B15">2012</xref>; Seng et al., <xref ref-type="bibr" rid="B58">2013</xref>; Shin et al., <xref ref-type="bibr" rid="B59">2015</xref>; Buckwalter et al., <xref ref-type="bibr" rid="B9">2016</xref>), although its application in the field of microbial ecology has been more limited. In this study, we compared whole-cell MALDI-TOF MS analysis of environmental isolates to the standard 16S rRNA gene sequencing method for identification and characterization of bacteria.</p>
<p>In environmental studies, fast and effective bacterial classification is principally based on matching sample 16S rRNA gene sequences to known references in databases such as the SILVA rRNA database project (Quast et al., <xref ref-type="bibr" rid="B48">2013</xref>), whose latest release 128 consists of 1,922,213 high-quality full-length 16S rRNA sequences, and the Ribosomal Database Project (Cole et al., <xref ref-type="bibr" rid="B13">2014</xref>), whose release 11.5 contains 1,502,575 high quality, aligned and annotated 16S rRNA sequences. The manually curated 16S rRNA gene database EzBioCloud, which is used for the closest type strain bacterial identification (Kim et al., <xref ref-type="bibr" rid="B30">2012</xref>), contains almost 60,000 bacterial type strains and uncultivated phylotypes in version 2017.05. By comparison, the latest MALDI BioTyper library, released in April 2016, contains 6,127 reference mass spectra (main spectrum projections, MSP) with 2,226 unique bacterial species. Both the EzBioCloud Identify service and the MALDI Biotyper database closely coincided in terms of identifying the cultures examined at the genus level, with all cultures matching, except for 4 unreliably identified by the MALDI BioTyper method. Reference-based MALDI-TOF MS classification thus proved to be a reliable technique for bacterial identification at the genus level provided a wide coverage of reference mass spectra is available. However, at the species level, only 15 (&#x0007E;35%) of the cultures identified coincided with those identified by 16S rRNA gene sequencing analysis. 
These findings are in contrast to previous studies of clinically important bacteria where concordant species identifications between MALDI BioTyper and 16S rRNA gene analysis were reported to be in the range 41&#x02013;92.2% of samples (Mellmann et al., <xref ref-type="bibr" rid="B41">2008</xref>; Bizzini et al., <xref ref-type="bibr" rid="B6">2011</xref>; Schmitt et al., <xref ref-type="bibr" rid="B55">2013</xref>; Cheng et al., <xref ref-type="bibr" rid="B11">2015</xref>; Fykse et al., <xref ref-type="bibr" rid="B23">2015</xref>; Schulthess et al., <xref ref-type="bibr" rid="B56">2016</xref>). This discrepancy is most likely due to the insufficient coverage of bacterial species in the databases. Even though users of commercial databases can create local repositories (Schmitt et al., <xref ref-type="bibr" rid="B55">2013</xref>; Cheng et al., <xref ref-type="bibr" rid="B11">2015</xref>; Svobodova et al., <xref ref-type="bibr" rid="B65">2017</xref>), there is no centralized pooling of new references collected by a broad spectrum of researchers. Although open access microbial MS databases such as SpectraBank (Bohme et al., <xref ref-type="bibr" rid="B7">2012</xref>) and Spectra (spectra.folkhalsomyndigheten.se/) have existed for many years, growth in the number of uploaded spectra has been slow or stagnant. The lack of widely accepted guidelines on the production of MALDI-TOF mass spectra (Liu et al., <xref ref-type="bibr" rid="B38">2007</xref>) or on the data format to be adopted (SpectraBank uses plain text peak lists without intensities, whereas Spectra uses the proprietary Bruker MSP format) may hinder further progress in the adoption of the MALDI-TOF MS method for bacterial classification and identification, especially in environmental studies.</p>
</sec>
<sec>
<title>Mass spectra preprocessing</title>
<p>One of the main goals of our study was to provide parameters that enable the efficient use of MALDI-TOF MS, without reliance on mass spectra reference databases, for the dereplication of recurrent bacterial isolates from environmental samples in a way analogous to 16S rRNA gene-based analyses. We attempted to identify a mass range with stable and predictive protein signals prior to CS calculation, which resulted in the selection of a mass range of 4&#x02013;10 kDa. Mass signals in the 10&#x02013;20 kDa range were unlikely to be reproducible protein peaks, as shown by the decreasing ACS values between spectra assigned to the same culture (Figure <xref ref-type="fig" rid="F1">1</xref>). Although a considerable number of mass signals were detected in the mass range of 2&#x02013;4 kDa, the frequency of phylotype-predictive signals detected by shrinkage discriminant analysis was low, suggesting that incorporation of this region into the calculation of similarity measures is not essential. Several other studies suggest that the 3.5&#x02013;4 kDa mass range is the lower boundary where important signals are located (Arnold and Reilly, <xref ref-type="bibr" rid="B5">1998</xref>; Fenselau and Demirev, <xref ref-type="bibr" rid="B19">2001</xref>). Dieckmann et al. (<xref ref-type="bibr" rid="B18">2005</xref>) only considered high intensity and stable signals with a mass of less than 4 kDa. These findings are further corroborated by analysis of bacterial ribosomal proteins extracted from the UniProtKB protein database which showed a very limited number of such proteins with a molecular mass of under 4 kDa (Figure <xref ref-type="fig" rid="F2">2</xref>).</p>
</sec>
<sec>
<title>Cosine similarity thresholds vs. 16S rRNA gene analysis</title>
<p>Optimal CS thresholds delineating cultures analogously to both the phylotype-based and the sequence similarity-based 16S rRNA gene analysis approaches were identified based on the F<sub>1</sub> score, which is defined as the harmonic mean of precision and recall. Within the scope of this study, following dereplication by MALDI-TOF MS, high precision values would translate into only a slight loss of unique phylotypes/OTUs<sub>[98.65%]</sub> identified by the 16S rRNA gene analysis, while high recall values would translate into a limited number of redundant clusters of the same phylotype/OTUs<sub>[98.65%]</sub>.</p>
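<p>The F<sub>1</sub>-based threshold selection described above can be sketched as follows. This is a hedged illustration rather than the exact procedure of this study: it assumes that precision and recall are counted over all unordered pairs of spectra, and the similarity matrix, labels, candidate thresholds, and function name are hypothetical.</p>
<preformat>
```python
from itertools import combinations


def pairwise_scores(similarity, reference_labels, threshold):
    """Return (precision, recall, F1) over all unordered pairs of spectra.

    A pair is predicted to belong together when its cosine similarity is at
    or above the threshold; it truly belongs together when the reference
    (for example 16S rRNA gene-based) labels are identical.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(reference_labels)), 2):
        predicted_same = similarity[i][j] >= threshold
        truly_same = reference_labels[i] == reference_labels[j]
        if predicted_same and truly_same:
            tp += 1
        elif predicted_same:
            fp += 1  # merged although the reference separates them
        elif truly_same:
            fn += 1  # left apart although the reference groups them
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical cosine similarities for four spectra and their reference taxa
similarity = [
    [1.00, 0.95, 0.85, 0.20],
    [0.95, 1.00, 0.30, 0.10],
    [0.85, 0.30, 1.00, 0.60],
    [0.20, 0.10, 0.60, 1.00],
]
labels = ["A", "A", "B", "B"]

candidates = [0.5, 0.8, 0.9]
best_threshold = max(
    candidates, key=lambda t: pairwise_scores(similarity, labels, t)[2]
)
print(best_threshold, pairwise_scores(similarity, labels, best_threshold))
```
</preformat>
<p>Under this pairwise reading, a false positive merges spectra of different reference taxa (lowering precision), whereas a false negative leaves spectra of the same taxon in separate clusters (lowering recall).</p>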
<p>On the basis of a 2-fold cross-validation and F<sub>1</sub> score calculation, optimal CS thresholds of 0.79 and 0.92 were identified to best mimic species separation defined by the phylotype-based and sequence similarity-based approaches, respectively. The precision values for the respective thresholds were 0.70 and 0.84. In order to dereplicate recurrent bacterial isolates, a further increase in the CS threshold might yield a higher level of precision, although this would be at the expense of a lower recall rate (Figure <xref ref-type="fig" rid="F4">4</xref>). The recall values for the CS thresholds of 0.79 and 0.92 were 0.77 and 0.55, respectively. These values would, on average, imply 23 and 45% redundant clusters, respectively, upon dereplication. These recall values might be negatively influenced by two major factors: biological reproducibility (see below) and the fact that these values relate to 16S rRNA gene analysis, which is used only <italic>as a proxy</italic> for bacterial species delineation and should be applied with caution. The overall conserved character of the 16S rRNA gene, which makes it applicable to virtually all prokaryotic organisms, does not allow for subspecies separation and, in some cases, not even for species separation (Fox et al., <xref ref-type="bibr" rid="B20">1992</xref>). Prokaryotic species are nowadays defined using whole-genome-based techniques, such as average nucleotide identity (ANI) or DNA-DNA hybridization (Stackebrandt and Goebel, <xref ref-type="bibr" rid="B61">1994</xref>; Konstantinidis et al., <xref ref-type="bibr" rid="B31">2006</xref>; Tindall et al., <xref ref-type="bibr" rid="B67">2010</xref>). Kim et al. (<xref ref-type="bibr" rid="B29">2014</xref>) have reported precision and recall values of 0.922 and 0.986, respectively, when a 98.65% 16S rRNA gene sequence similarity threshold was used to delineate species defined by 95% ANI.
If the actual species-defining approaches were used as reference methods, the recall values for MALDI-TOF MS analysis would very likely increase. In this study, cultures Lys3 and Lys4 of the single phylotype <italic>Lysinibacillus xylanilyticus</italic> shared 99.2% similarity of 16S rRNA gene sequences, while their ACS was 0.410 &#x000B1; 0.167. Similarly, <italic>Arthrobacter</italic> Art3 and <italic>Glutamicibacter</italic> Glu1 shared 99.2% sequence similarity between their 16S rRNA genes, while their ACS was 0.330 &#x000B1; 0.057. This strongly suggests that, in particular cases, the resolution of the MALDI-TOF MS technique is superior to that of 16S rRNA gene analysis, as has also been described elsewhere (Murray, <xref ref-type="bibr" rid="B44">2010</xref>; B&#x000F6;hme et al., <xref ref-type="bibr" rid="B8">2013</xref>). Taking all this into account, further in-depth research into cultures with known genomes is required in order to provide more robust similarity threshold values for species demarcation by MALDI-TOF MS.</p>
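<p>The arithmetic behind the values quoted in this section can be checked with a short Python sketch. The precision and recall values are those stated in the text. The F<sub>1</sub> scores are computed here from them for illustration, and the redundant fraction is taken as one minus recall, which is the reading implied by the 23 and 45% figures. The variable names are assumptions.</p>
<preformat>
```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) as quoted in the text for the two optimal CS thresholds
reported = {
    0.79: (0.70, 0.77),  # phylotype-based reference
    0.92: (0.84, 0.55),  # 98.65% sequence similarity-based reference
}

summary = {}
for cs_threshold, (precision, recall) in reported.items():
    summary[cs_threshold] = {
        "f1": round(f1_score(precision, recall), 3),
        # share of same-taxon pairs left unmerged, read as redundant clusters
        "redundant_fraction": round(1.0 - recall, 2),
    }

for cs_threshold, values in summary.items():
    print(cs_threshold, values)
```
</preformat>
<p>The sketch shows the trade-off discussed above: raising the CS threshold from 0.79 to 0.92 increases precision but roughly doubles the implied share of redundant clusters.</p>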
</sec>
<sec>
<title>Effect of biological variation</title>
<p>While the technical replicates of MALDI-TOF MS measurement showed a high level of reproducibility, the biological replicates of some culture mass spectra deviated significantly. These deviations distorted the F<sub>1</sub> score curves (Figure <xref ref-type="fig" rid="F4">4</xref>) and artificially lowered the CS threshold calculated for species delineation. Enhanced precision and recall could be expected if a higher level of biological reproducibility was achieved. Oberle et al. (<xref ref-type="bibr" rid="B47">2016</xref>), who studied the technical, biological and interlaboratory reproducibility of whole-cell MALDI-TOF MS analysis, came to conclusions which are in line with our findings. Using 12 <italic>E. coli</italic> strains and standard operating procedures, they reported satisfactory technical, but insufficient biological reproducibility with regard to similarity-based analyses. Despite this low level of biological reproducibility, they were able to identify cluster-determining peaks which facilitated accurate classification of all samples. Using shrinkage discriminant analysis, we were able to identify 150 phylotype-specific mass signals in our dataset. Using these 150 mass signals, it was possible to predict the assigned species of the cultures with a high degree of accuracy (0.999), as revealed by cross-validation. Analogously, the classification algorithms applied with databases such as the MALDI BioTyper database allow protein signal consistency to be incorporated into the calculations in order to increase reproducibility (Maier et al., <xref ref-type="bibr" rid="B39">2006</xref>). Indeed, Mellmann et al. (<xref ref-type="bibr" rid="B40">2009</xref>) and Westblade et al. (<xref ref-type="bibr" rid="B74">2015</xref>) reported very high reproducibility levels for species designation when MALDI-TOF MS reference-based classification was used.</p>
<p>Biological variations in the bacterial mass fingerprint have been insufficiently studied when all samples are subject to the same cultivation, time and sample preparation conditions. However, Arnold et al. (<xref ref-type="bibr" rid="B4">1999</xref>) found that the age of a culture significantly influences the protein profile in the mass spectra of <italic>E. coli</italic> strain K-12. The presence and intensity of different peaks were observed to vary during an 84-h cultivation experiment. Interestingly, the 22&#x02013;30 h cultivation time frame, corresponding to a middle stationary growth phase of <italic>E. coli</italic>, was found to be unstable in terms of protein expression. Significant changes were detected within subsequent 2-h time windows. Such protein variations over time are regarded as organism-specific, indicating that a uniform cultivation time prior to sample preparation for MALDI-TOF MS analysis could have an unfavorable impact. The application of a protein extraction step in sample preparation has been reported to affect the quality of mass spectra to some degree. A positive effect was mainly found in analyses of Gram-positive bacteria (Dai et al., <xref ref-type="bibr" rid="B16">1999</xref>; Alatoom et al., <xref ref-type="bibr" rid="B3">2011</xref>; Schulthess et al., <xref ref-type="bibr" rid="B57">2013</xref>). However, the relationship between protein extraction and direct transfer in terms of biological reproducibility or mass signal stability is not discussed in any of the studies mentioned. In addition, various extraction protocols have been found to prolong sample preparation time, which becomes noticeable when analyzing several hundred isolates.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="s5">
<title>Conclusion</title>
<p>Our study highlights the limitations of MALDI-TOF MS whole-cell analysis when used for bacterial classification and identification of environmental isolates at the species level, which stem from the lack of references in available databases. When used to dereplicate recurrent bacterial isolates, similarity-based analysis is preferable; we demonstrate that this method leads to a significant reduction in recurrent isolates, with only slightly lower precision as compared to the 16S rRNA gene-based approaches. It is noteworthy that the presented cosine similarity thresholds should be applied with care, as they were derived from a limited sample of 49 cultures. However, our data indicate that the optimal threshold definition is primarily influenced by biological reproducibility. Therefore, approaches that lead to highly reproducible generation of whole-cell MALDI-TOF mass spectra need to be developed and established before the optimal threshold is refined any further. Taking into account time and cost considerations, we conclude that MALDI-TOF MS can successfully rival the 16S rRNA gene approach in terms of high-throughput bacterial isolate binning. MALDI-TOF MS analysis also has potential for further improvement, unlike 16S rRNA gene analysis, whose methodological limits have been reached. Thus, the relationship between whole-cell mass spectra and the average nucleotide identity of orthologous genes, as well as biological reproducibility issues, need to be addressed in the future in order to maximize the benefits of similarity-based and reference-free approaches.</p>
</sec>
<sec id="s6">
<title>Author contributions</title>
<p>Experimental design: MS and OU. Performed the experiments: MS, TS, and PJ. Analyzed the data: MS and TS. Wrote the paper: MS and OU.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</sec>
</body>
<back>
<ack><p>We wish to thank the Czech Science Foundation for funding this research (project no. 17-00227S), Michael O&#x00027;Shea for proof-reading the manuscript, as well as two reviewers of the manuscript for their valuable comments.</p>
</ack>
<sec sec-type="supplementary-material" id="s7">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmicb.2018.01294/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmicb.2018.01294/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Image_1.PDF" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_2.PDF" id="SM2" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ahdesmaki</surname> <given-names>M.</given-names></name> <name><surname>Strimmer</surname> <given-names>K.</given-names></name></person-group> (<year>2010</year>). <article-title>Feature selection in omics prediction problems using cat scores and false nondiscovery rate control</article-title>. <source>Ann. Appl. Stat.</source> <volume>4</volume>, <fpage>503</fpage>&#x02013;<lpage>519</lpage>. <pub-id pub-id-type="doi">10.1214/09-AOAS277</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Ahdesmaki</surname> <given-names>M.</given-names></name> <name><surname>Zuber</surname> <given-names>V.</given-names></name> <name><surname>Gibb</surname> <given-names>S.</given-names></name> <name><surname>Strimmer</surname> <given-names>K.</given-names></name></person-group> (<year>2015</year>). <source>sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection. R package version 1.3.7 [Online]</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=sda">https://CRAN.R-project.org/package=sda</ext-link>.</citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alatoom</surname> <given-names>A. A.</given-names></name> <name><surname>Cunningham</surname> <given-names>S. A.</given-names></name> <name><surname>Ihde</surname> <given-names>S. M.</given-names></name> <name><surname>Mandrekar</surname> <given-names>J.</given-names></name> <name><surname>Patel</surname> <given-names>R.</given-names></name></person-group> (<year>2011</year>). <article-title>Comparison of direct colony method versus extraction method for identification of gram-positive cocci by use of Bruker Biotyper matrix-assisted laser desorption ionization-time of flight mass spectrometry</article-title>. <source>J. Clin. Microbiol.</source> <volume>49</volume>, <fpage>2868</fpage>&#x02013;<lpage>2873</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.00506-11</pub-id><pub-id pub-id-type="pmid">21613431</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arnold</surname> <given-names>R. J.</given-names></name> <name><surname>Karty</surname> <given-names>J. A.</given-names></name> <name><surname>Ellington</surname> <given-names>A. D.</given-names></name> <name><surname>Reilly</surname> <given-names>J. P.</given-names></name></person-group> (<year>1999</year>). <article-title>Monitoring the growth of a bacteria culture by MALDI-MS of whole cells</article-title>. <source>Anal. Chem.</source> <volume>71</volume>, <fpage>1990</fpage>&#x02013;<lpage>1996</lpage>. <pub-id pub-id-type="doi">10.1021/ac981196c</pub-id><pub-id pub-id-type="pmid">10361498</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arnold</surname> <given-names>R. J.</given-names></name> <name><surname>Reilly</surname> <given-names>J. P.</given-names></name></person-group> (<year>1998</year>). <article-title>Fingerprint matching of <italic>E.coli</italic> strains with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry of whole cells using a modified correlation approach</article-title>. <source>Rapid Commun. Mass Spectrom.</source> <volume>12</volume>, <fpage>630</fpage>&#x02013;<lpage>636</lpage>. <pub-id pub-id-type="pmid">9621446</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bizzini</surname> <given-names>A.</given-names></name> <name><surname>Jaton</surname> <given-names>K.</given-names></name> <name><surname>Romo</surname> <given-names>D.</given-names></name> <name><surname>Bille</surname> <given-names>J.</given-names></name> <name><surname>Prod&#x00027;hom</surname> <given-names>G.</given-names></name> <name><surname>Greub</surname> <given-names>G.</given-names></name></person-group> (<year>2011</year>). <article-title>Matrix-assisted laser desorption ionization-time of flight mass spectrometry as an alternative to 16S rRNA gene sequencing for identification of difficult-to-identify bacterial strains</article-title>. <source>J. Clin. Microbiol.</source> <volume>49</volume>, <fpage>693</fpage>&#x02013;<lpage>696</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.01463-10</pub-id><pub-id pub-id-type="pmid">21106794</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>B&#x000F6;hme</surname> <given-names>K.</given-names></name> <name><surname>Fern&#x000E1;ndez-No</surname> <given-names>I. C.</given-names></name> <name><surname>Barros-Vel&#x000E1;zquez</surname> <given-names>J.</given-names></name> <name><surname>Gallardo</surname> <given-names>J. M.</given-names></name> <name><surname>Ca&#x000F1;as</surname> <given-names>B.</given-names></name> <name><surname>Calo-Mata</surname> <given-names>P.</given-names></name></person-group> (<year>2012</year>). <article-title>SpectraBank: an open access tool for rapid microbial identification by MALDI-TOF MS fingerprinting</article-title>. <source>Electrophoresis</source> <volume>33</volume>, <fpage>2138</fpage>&#x02013;<lpage>2142</lpage>. <pub-id pub-id-type="doi">10.1002/elps.201200074</pub-id><pub-id pub-id-type="pmid">22821489</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>B&#x000F6;hme</surname> <given-names>K.</given-names></name> <name><surname>Fern&#x000E1;ndez-No</surname> <given-names>I. C.</given-names></name> <name><surname>Pazos</surname> <given-names>M.</given-names></name> <name><surname>Gallardo</surname> <given-names>J. M.</given-names></name> <name><surname>Barros-Vel&#x000E1;zquez</surname> <given-names>J.</given-names></name> <name><surname>Ca&#x000F1;as</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Identification and classification of seafood-borne pathogenic and spoilage bacteria: 16S rRNA sequencing versus MALDI-TOF MS fingerprinting</article-title>. <source>Electrophoresis</source> <volume>34</volume>, <fpage>877</fpage>&#x02013;<lpage>887</lpage>. <pub-id pub-id-type="doi">10.1002/elps.201200532</pub-id><pub-id pub-id-type="pmid">23334977</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Buckwalter</surname> <given-names>S. P.</given-names></name> <name><surname>Olson</surname> <given-names>S. L.</given-names></name> <name><surname>Connelly</surname> <given-names>B. J.</given-names></name> <name><surname>Lucas</surname> <given-names>B. C.</given-names></name> <name><surname>Rodning</surname> <given-names>A. A.</given-names></name> <name><surname>Walchak</surname> <given-names>R. C.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Evaluation of matrix-assisted laser desorption ionization-time of flight mass spectrometry for identification of <italic>Mycobacterium</italic> species, <italic>Nocardia</italic> species, and other aerobic Actinomycetes</article-title>. <source>J. Clin. Microbiol.</source> <volume>54</volume>, <fpage>376</fpage>&#x02013;<lpage>384</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.02128-15</pub-id><pub-id pub-id-type="pmid">26637381</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Busse</surname> <given-names>H. J.</given-names></name></person-group> (<year>2016</year>). <article-title>Review of the taxonomy of the genus <italic>Arthrobacter</italic>, emendation of the genus <italic>Arthrobacter sensu lato</italic>, proposal to reclassify selected species of the genus <italic>Arthrobacter</italic> in the novel genera <italic>Glutamicibacter</italic> gen. nov., <italic>Paeniglutamicibacter</italic> gen. nov., <italic>Pseudoglutamicibacter</italic> gen. nov., <italic>Paenarthrobacter</italic> gen. nov. and <italic>Pseudarthrobacter</italic> gen. nov., and emended description of <italic>Arthrobacter roseus</italic></article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>66</volume>, <fpage>9</fpage>&#x02013;<lpage>37</lpage>. <pub-id pub-id-type="doi">10.1099/ijsem.0.000702</pub-id><pub-id pub-id-type="pmid">26486726</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>W. C.</given-names></name> <name><surname>Jan</surname> <given-names>I. S.</given-names></name> <name><surname>Chen</surname> <given-names>J. M.</given-names></name> <name><surname>Teng</surname> <given-names>S. H.</given-names></name> <name><surname>Teng</surname> <given-names>L. J.</given-names></name> <name><surname>Sheng</surname> <given-names>W. H.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Evaluation of the Bruker Biotyper matrix-assisted laser desorption ionization-time of flight mass spectrometry system for identification of blood isolates of <italic>Vibrio</italic> species</article-title>. <source>J. Clin. Microbiol.</source> <volume>53</volume>, <fpage>1741</fpage>&#x02013;<lpage>1744</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.00105-15</pub-id><pub-id pub-id-type="pmid">25740773</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Claydon</surname> <given-names>M. A.</given-names></name> <name><surname>Davey</surname> <given-names>S. N.</given-names></name> <name><surname>Edwards-Jones</surname> <given-names>V.</given-names></name> <name><surname>Gordon</surname> <given-names>D. B.</given-names></name></person-group> (<year>1996</year>). <article-title>The rapid identification of intact microorganisms using mass spectrometry</article-title>. <source>Nat. Biotechnol.</source> <volume>14</volume>, <fpage>1584</fpage>&#x02013;<lpage>1586</lpage>. <pub-id pub-id-type="doi">10.1038/nbt1196-1584</pub-id><pub-id pub-id-type="pmid">9634826</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cole</surname> <given-names>J. R.</given-names></name> <name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Fish</surname> <given-names>J. A.</given-names></name> <name><surname>Chai</surname> <given-names>B.</given-names></name> <name><surname>Mcgarrell</surname> <given-names>D. M.</given-names></name> <name><surname>Sun</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Ribosomal Database Project: data and tools for high throughput rRNA analysis</article-title>. <source>Nucleic Acids Res.</source> <volume>42</volume>, <fpage>D633</fpage>&#x02013;<lpage>D642</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkt1244</pub-id><pub-id pub-id-type="pmid">24288368</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coorevits</surname> <given-names>A.</given-names></name> <name><surname>De Jonghe</surname> <given-names>V.</given-names></name> <name><surname>Vandroemme</surname> <given-names>J.</given-names></name> <name><surname>Reekmans</surname> <given-names>R.</given-names></name> <name><surname>Heyrman</surname> <given-names>J.</given-names></name> <name><surname>Messens</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>Comparative analysis of the diversity of aerobic spore-forming bacteria in raw milk from organic and conventional dairy farms</article-title>. <source>Syst. Appl. Microbiol.</source> <volume>31</volume>, <fpage>126</fpage>&#x02013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1016/j.syapm.2008.03.002</pub-id><pub-id pub-id-type="pmid">18406093</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Croxatto</surname> <given-names>A.</given-names></name> <name><surname>Prod&#x00027;hom</surname> <given-names>G.</given-names></name> <name><surname>Greub</surname> <given-names>G.</given-names></name></person-group> (<year>2012</year>). <article-title>Applications of MALDI-TOF mass spectrometry in clinical diagnostic microbiology</article-title>. <source>FEMS Microbiol. Rev.</source> <volume>36</volume>, <fpage>380</fpage>&#x02013;<lpage>407</lpage>. <pub-id pub-id-type="doi">10.1111/j.1574-6976.2011.00298.x</pub-id><pub-id pub-id-type="pmid">22092265</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dai</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>L.</given-names></name> <name><surname>Roser</surname> <given-names>D. C.</given-names></name> <name><surname>Long</surname> <given-names>S. R.</given-names></name></person-group> (<year>1999</year>). <article-title>Detection and identification of low-mass peptides and proteins from solvent suspensions of <italic>Escherichia coli</italic> by high performance liquid chromatography fractionation and matrix-assisted laser desorption/ionization mass spectrometry</article-title>. <source>Rapid Commun. Mass Spectrom.</source> <volume>13</volume>, <fpage>73</fpage>&#x02013;<lpage>78</lpage>. <pub-id pub-id-type="doi">10.1002/(SICI)1097-0231(19990115)13:1&#x0003C;73::AID-RCM454&#x0003E;3.0.CO</pub-id><pub-id pub-id-type="pmid">9921691</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Clerck</surname> <given-names>E.</given-names></name> <name><surname>De Vos</surname> <given-names>P.</given-names></name></person-group> (<year>2002</year>). <article-title>Study of the bacterial load in a gelatine production process focussed on <italic>Bacillus</italic> and related endosporeforming genera</article-title>. <source>Syst. Appl. Microbiol.</source> <volume>25</volume>, <fpage>611</fpage>&#x02013;<lpage>617</lpage>. <pub-id pub-id-type="doi">10.1078/07232020260517751</pub-id><pub-id pub-id-type="pmid">12583722</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dieckmann</surname> <given-names>R.</given-names></name> <name><surname>Graeber</surname> <given-names>I.</given-names></name> <name><surname>Kaesler</surname> <given-names>I.</given-names></name> <name><surname>Szewzyk</surname> <given-names>U.</given-names></name> <name><surname>von D&#x000F6;hren</surname> <given-names>H.</given-names></name></person-group> (<year>2005</year>). <article-title>Rapid screening and dereplication of bacterial isolates from marine sponges of the Sula Ridge by Intact-Cell-MALDI-TOF mass spectrometry (ICM-MS)</article-title>. <source>Appl. Microbiol. Biotechnol.</source> <volume>67</volume>, <fpage>539</fpage>&#x02013;<lpage>548</lpage>. <pub-id pub-id-type="doi">10.1007/s00253-004-1812-2</pub-id><pub-id pub-id-type="pmid">15614563</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fenselau</surname> <given-names>C.</given-names></name> <name><surname>Demirev</surname> <given-names>P. A.</given-names></name></person-group> (<year>2001</year>). <article-title>Characterization of intact microorganisms by MALDI mass spectrometry</article-title>. <source>Mass Spectrom. Rev.</source> <volume>20</volume>, <fpage>157</fpage>&#x02013;<lpage>171</lpage>. <pub-id pub-id-type="doi">10.1002/mas.10004</pub-id><pub-id pub-id-type="pmid">11835304</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fox</surname> <given-names>G. E.</given-names></name> <name><surname>Wisotzkey</surname> <given-names>J. D.</given-names></name> <name><surname>Jurtshuk</surname> <given-names>P.</given-names></name></person-group> (<year>1992</year>). <article-title>How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>42</volume>, <fpage>166</fpage>&#x02013;<lpage>170</lpage>. <pub-id pub-id-type="doi">10.1099/00207713-42-1-166</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fraraccio</surname> <given-names>S.</given-names></name> <name><surname>Strejcek</surname> <given-names>M.</given-names></name> <name><surname>Dolinova</surname> <given-names>I.</given-names></name> <name><surname>Macek</surname> <given-names>T.</given-names></name> <name><surname>Uhlik</surname> <given-names>O.</given-names></name></person-group> (<year>2017</year>). <article-title>Secondary compound hypothesis revisited: selected plant secondary metabolites promote bacterial degradation of <italic>cis</italic>-1,2-dichloroethylene (cDCE)</article-title>. <source>Sci. Rep.</source> <volume>7</volume>:<fpage>8406</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-017-07760-1</pub-id><pub-id pub-id-type="pmid">28814712</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Friedman</surname> <given-names>J. H.</given-names></name></person-group> (<year>1984</year>). <source>&#x0201C;A Variable Span Smoother&#x0201D;</source>. <publisher-name>Stanford University</publisher-name>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.dtic.mil/docs/citations/ADA148241">http://www.dtic.mil/docs/citations/ADA148241</ext-link>.</citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fykse</surname> <given-names>E. M.</given-names></name> <name><surname>Tjarnhage</surname> <given-names>T.</given-names></name> <name><surname>Humppi</surname> <given-names>T.</given-names></name> <name><surname>Eggen</surname> <given-names>V. S.</given-names></name> <name><surname>Ingebretsen</surname> <given-names>A.</given-names></name> <name><surname>Skogan</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Identification of airborne bacteria by 16S rDNA sequencing, MALDI-TOF MS and the MIDI microbial identification system</article-title>. <source>Aerobiologia</source> <volume>31</volume>, <fpage>271</fpage>&#x02013;<lpage>281</lpage>. <pub-id pub-id-type="doi">10.1007/s10453-015-9363-9</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghyselinck</surname> <given-names>J.</given-names></name> <name><surname>van Hoorde</surname> <given-names>K.</given-names></name> <name><surname>Hoste</surname> <given-names>B.</given-names></name> <name><surname>Heylen</surname> <given-names>K.</given-names></name> <name><surname>De Vos</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). <article-title>Evaluation of MALDI-TOF MS as a tool for high-throughput dereplication</article-title>. <source>J. Microbiol. Methods</source> <volume>86</volume>, <fpage>327</fpage>&#x02013;<lpage>336</lpage>. <pub-id pub-id-type="doi">10.1016/j.mimet.2011.06.004</pub-id><pub-id pub-id-type="pmid">21699925</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gibb</surname> <given-names>S.</given-names></name> <name><surname>Strimmer</surname> <given-names>K.</given-names></name></person-group> (<year>2012</year>). <article-title>MALDIquant: a versatile R package for the analysis of mass spectrometry data</article-title>. <source>Bioinformatics</source> <volume>28</volume>, <fpage>2270</fpage>&#x02013;<lpage>2271</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts447</pub-id><pub-id pub-id-type="pmid">22796955</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Holland</surname> <given-names>R. D.</given-names></name> <name><surname>Wilkes</surname> <given-names>J. G.</given-names></name> <name><surname>Rafii</surname> <given-names>F.</given-names></name> <name><surname>Sutherland</surname> <given-names>J. B.</given-names></name> <name><surname>Persons</surname> <given-names>C. C.</given-names></name> <name><surname>Voorhees</surname> <given-names>K. J.</given-names></name> <etal/></person-group>. (<year>1996</year>). <article-title>Rapid identification of intact whole bacteria based on spectral patterns using matrix-assisted laser desorption/ionization with time-of-flight mass spectrometry</article-title>. <source>Rapid Commun. Mass Spectrom.</source> <volume>10</volume>, <fpage>1227</fpage>&#x02013;<lpage>1232</lpage>. <pub-id pub-id-type="doi">10.1002/(SICI)1097-0231(19960731)10:10&#x0003C;1227::AID-RCM659&#x0003E;3.0.CO;2-6</pub-id><pub-id pub-id-type="pmid">8759332</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huber</surname> <given-names>W.</given-names></name> <name><surname>Carey</surname> <given-names>V. J.</given-names></name> <name><surname>Gentleman</surname> <given-names>R.</given-names></name> <name><surname>Anders</surname> <given-names>S.</given-names></name> <name><surname>Carlson</surname> <given-names>M.</given-names></name> <name><surname>Carvalho</surname> <given-names>B. S.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Orchestrating high-throughput genomic analysis with Bioconductor</article-title>. <source>Nat. Methods</source> <volume>12</volume>, <fpage>115</fpage>&#x02013;<lpage>121</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.3252</pub-id><pub-id pub-id-type="pmid">25633503</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Janda</surname> <given-names>J. M.</given-names></name> <name><surname>Abbott</surname> <given-names>S. L.</given-names></name></person-group> (<year>2007</year>). <article-title>16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: pluses, perils, and pitfalls</article-title>. <source>J. Clin. Microbiol.</source> <volume>45</volume>, <fpage>2761</fpage>&#x02013;<lpage>2764</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.01228-07</pub-id><pub-id pub-id-type="pmid">17626177</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>M.</given-names></name> <name><surname>Oh</surname> <given-names>H. S.</given-names></name> <name><surname>Park</surname> <given-names>S. C.</given-names></name> <name><surname>Chun</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>64</volume>, <fpage>346</fpage>&#x02013;<lpage>351</lpage>. <pub-id pub-id-type="doi">10.1099/ijs.0.059774-0</pub-id><pub-id pub-id-type="pmid">24505072</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>O. S.</given-names></name> <name><surname>Cho</surname> <given-names>Y. J.</given-names></name> <name><surname>Lee</surname> <given-names>K.</given-names></name> <name><surname>Yoon</surname> <given-names>S. H.</given-names></name> <name><surname>Kim</surname> <given-names>M.</given-names></name> <name><surname>Na</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Introducing EzTaxon-e: a prokaryotic 16S rRNA gene sequence database with phylotypes that represent uncultured species</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>62</volume>, <fpage>716</fpage>&#x02013;<lpage>721</lpage>. <pub-id pub-id-type="doi">10.1099/ijs.0.038075-0</pub-id><pub-id pub-id-type="pmid">22140171</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Konstantinidis</surname> <given-names>K. T.</given-names></name> <name><surname>Ramette</surname> <given-names>A.</given-names></name> <name><surname>Tiedje</surname> <given-names>J. M.</given-names></name></person-group> (<year>2006</year>). <article-title>The bacterial species definition in the genomic era</article-title>. <source>Philos. Trans. R. Soc. Lond. B Biol. Sci.</source> <volume>361</volume>, <fpage>1929</fpage>&#x02013;<lpage>1940</lpage>. <pub-id pub-id-type="doi">10.1098/rstb.2006.1920</pub-id><pub-id pub-id-type="pmid">17062412</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koubek</surname> <given-names>J.</given-names></name> <name><surname>Uhl&#x000ED;k</surname> <given-names>O.</given-names></name> <name><surname>Je&#x0010D;n&#x000E1;</surname> <given-names>K.</given-names></name> <name><surname>Junkov&#x000E1;</surname> <given-names>P.</given-names></name> <name><surname>Vrkoslavov&#x000E1;</surname> <given-names>J.</given-names></name> <name><surname>Lipov</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Whole-cell MALDI-TOF: rapid screening method in environmental microbiology</article-title>. <source>Int. Biodeterior. Biodegradation</source> <volume>69</volume>, <fpage>82</fpage>&#x02013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.1016/j.ibiod.2011.12.007</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kumar</surname> <given-names>S.</given-names></name> <name><surname>Stecher</surname> <given-names>G.</given-names></name> <name><surname>Tamura</surname> <given-names>K.</given-names></name></person-group> (<year>2016</year>). <article-title>MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets</article-title>. <source>Mol. Biol. Evol.</source> <volume>33</volume>, <fpage>1870</fpage>&#x02013;<lpage>1874</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msw054</pub-id><pub-id pub-id-type="pmid">27004904</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kurzawova</surname> <given-names>V.</given-names></name> <name><surname>Stursa</surname> <given-names>P.</given-names></name> <name><surname>Uhlik</surname> <given-names>O.</given-names></name> <name><surname>Norkova</surname> <given-names>K.</given-names></name> <name><surname>Strohalm</surname> <given-names>M.</given-names></name> <name><surname>Lipov</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Plant-microorganism interactions in bioremediation of polychlorinated biphenyl-contaminated soil</article-title>. <source>N. Biotechnol.</source> <volume>30</volume>, <fpage>15</fpage>&#x02013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1016/j.nbt.2012.06.004</pub-id><pub-id pub-id-type="pmid">22728721</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lane</surname> <given-names>D. J.</given-names></name></person-group> (<year>1991</year>). <article-title>16S/23S rRNA sequencing</article-title>, in <source>Nucleic Acid Techniques in Bacterial Systematics</source>, eds <person-group person-group-type="editor"><name><surname>Stackebrandt</surname> <given-names>E.</given-names></name> <name><surname>Goodfellow</surname> <given-names>M.</given-names></name></person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>John Wiley and Sons</publisher-name>), <fpage>115</fpage>&#x02013;<lpage>175</lpage>.</citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lassalle</surname> <given-names>F.</given-names></name> <name><surname>Campillo</surname> <given-names>T.</given-names></name> <name><surname>Vial</surname> <given-names>L.</given-names></name> <name><surname>Baude</surname> <given-names>J.</given-names></name> <name><surname>Costechareyre</surname> <given-names>D.</given-names></name> <name><surname>Chapulliot</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Genomic species are ecological species as revealed by comparative genomics in <italic>Agrobacterium tumefaciens</italic></article-title>. <source>Genome Biol. Evol.</source> <volume>3</volume>, <fpage>762</fpage>&#x02013;<lpage>781</lpage>. <pub-id pub-id-type="doi">10.1093/gbe/evr070</pub-id><pub-id pub-id-type="pmid">21795751</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lay</surname> <given-names>J. O.</given-names> <suffix>Jr.</suffix></name></person-group> (<year>2001</year>). <article-title>MALDI-TOF mass spectrometry of bacteria</article-title>. <source>Mass Spectrom. Rev.</source> <volume>20</volume>, <fpage>172</fpage>&#x02013;<lpage>194</lpage>. <pub-id pub-id-type="doi">10.1002/mas.10003</pub-id><pub-id pub-id-type="pmid">11835305</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>H.</given-names></name> <name><surname>Du</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Yang</surname> <given-names>R.</given-names></name></person-group> (<year>2007</year>). <article-title>Universal sample preparation method for characterization of bacteria by matrix-assisted laser desorption ionization-time of flight mass spectrometry</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>73</volume>, <fpage>1899</fpage>&#x02013;<lpage>1907</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.02391-06</pub-id><pub-id pub-id-type="pmid">17277202</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maier</surname> <given-names>T.</given-names></name> <name><surname>Klepel</surname> <given-names>S.</given-names></name> <name><surname>Renner</surname> <given-names>U.</given-names></name> <name><surname>Kostrzewa</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>Fast and reliable MALDI-TOF MS-based microorganism identification</article-title>. <source>Nat. Methods</source> <volume>3</volume>:<fpage>328</fpage>. <pub-id pub-id-type="doi">10.1038/nmeth870</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mellmann</surname> <given-names>A.</given-names></name> <name><surname>Bimet</surname> <given-names>F.</given-names></name> <name><surname>Bizet</surname> <given-names>C.</given-names></name> <name><surname>Borovskaya</surname> <given-names>A. D.</given-names></name> <name><surname>Drake</surname> <given-names>R. R.</given-names></name> <name><surname>Eigner</surname> <given-names>U.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>High interlaboratory reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry-based species identification of nonfermenting bacteria</article-title>. <source>J. Clin. Microbiol.</source> <volume>47</volume>, <fpage>3732</fpage>&#x02013;<lpage>3734</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.00921-09</pub-id><pub-id pub-id-type="pmid">19776231</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mellmann</surname> <given-names>A.</given-names></name> <name><surname>Cloud</surname> <given-names>J.</given-names></name> <name><surname>Maier</surname> <given-names>T.</given-names></name> <name><surname>Keckevoet</surname> <given-names>U.</given-names></name> <name><surname>Ramminger</surname> <given-names>I.</given-names></name> <name><surname>Iwen</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>Evaluation of matrix-assisted laser desorption ionization-time-of-flight mass spectrometry in comparison to 16S rRNA gene sequencing for species identification of nonfermenting bacteria</article-title>. <source>J. Clin. Microbiol.</source> <volume>46</volume>, <fpage>1946</fpage>&#x02013;<lpage>1954</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.00157-08</pub-id><pub-id pub-id-type="pmid">18400920</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morhac</surname> <given-names>M.</given-names></name></person-group> (<year>2009</year>). <article-title>An algorithm for determination of peak regions and baseline elimination in spectroscopic data</article-title>. <source>Nucl. Instrum. Methods Phys. Res. Sect. A</source> <volume>600</volume>, <fpage>478</fpage>&#x02013;<lpage>487</lpage>. <pub-id pub-id-type="doi">10.1016/j.nima.2008.11.132</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Munoz</surname> <given-names>R.</given-names></name> <name><surname>Yarza</surname> <given-names>P.</given-names></name> <name><surname>Ludwig</surname> <given-names>W.</given-names></name> <name><surname>Euzeby</surname> <given-names>J.</given-names></name> <name><surname>Amann</surname> <given-names>R.</given-names></name> <name><surname>Schleifer</surname> <given-names>K. H.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Release LTPs104 of the All-species living tree</article-title>. <source>Syst. Appl. Microbiol.</source> <volume>34</volume>, <fpage>169</fpage>&#x02013;<lpage>170</lpage>. <pub-id pub-id-type="doi">10.1016/j.syapm.2011.03.001</pub-id><pub-id pub-id-type="pmid">21497273</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Murray</surname> <given-names>P. R.</given-names></name></person-group> (<year>2010</year>). <article-title>Matrix-assisted laser desorption ionization time-of-flight mass spectrometry: usefulness for taxonomy and epidemiology</article-title>. <source>Clin. Microbiol. Infect.</source> <volume>16</volume>, <fpage>1626</fpage>&#x02013;<lpage>1630</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-0691.2010.03364.x</pub-id><pub-id pub-id-type="pmid">20825435</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Needleman</surname> <given-names>S. B.</given-names></name> <name><surname>Wunsch</surname> <given-names>C. D.</given-names></name></person-group> (<year>1970</year>). <article-title>A general method applicable to the search for similarities in the amino acid sequence of two proteins</article-title>. <source>J. Mol. Biol.</source> <volume>48</volume>, <fpage>443</fpage>&#x02013;<lpage>453</lpage>. <pub-id pub-id-type="doi">10.1016/0022-2836(70)90057-4</pub-id><pub-id pub-id-type="pmid">5420325</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nov&#x000E1;kov&#x000E1;</surname> <given-names>H.</given-names></name> <name><surname>Vo&#x00161;ahlikov&#x000E1;</surname> <given-names>M.</given-names></name> <name><surname>Pazlarov&#x000E1;</surname> <given-names>J.</given-names></name> <name><surname>Mackov&#x000E1;</surname> <given-names>M.</given-names></name> <name><surname>Burkhard</surname> <given-names>J.</given-names></name> <name><surname>Demnerov&#x000E1;</surname> <given-names>K.</given-names></name></person-group> (<year>2002</year>). <article-title>PCB metabolism by <italic>Pseudomonas</italic> sp. P2</article-title>. <source>Int. Biodeterior. Biodegradation</source> <volume>50</volume>, <fpage>47</fpage>&#x02013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1016/S0964-8305(02)00058-6</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oberle</surname> <given-names>M.</given-names></name> <name><surname>Wohlwend</surname> <given-names>N.</given-names></name> <name><surname>Jonas</surname> <given-names>D.</given-names></name> <name><surname>Maurer</surname> <given-names>F. P.</given-names></name> <name><surname>Jost</surname> <given-names>G.</given-names></name> <name><surname>Tschudin-Sutter</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>The Technical and Biological Reproducibility of Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) based typing: employment of bioinformatics in a multicenter study</article-title>. <source>PLoS ONE</source> <volume>11</volume>:<fpage>e0164260</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0164260</pub-id><pub-id pub-id-type="pmid">27798637</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Quast</surname> <given-names>C.</given-names></name> <name><surname>Pruesse</surname> <given-names>E.</given-names></name> <name><surname>Yilmaz</surname> <given-names>P.</given-names></name> <name><surname>Gerken</surname> <given-names>J.</given-names></name> <name><surname>Schweer</surname> <given-names>T.</given-names></name> <name><surname>Yarza</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>The SILVA ribosomal RNA gene database project: improved data processing and web-based tools</article-title>. <source>Nucleic Acids Res.</source> <volume>41</volume>, <fpage>D590</fpage>&#x02013;<lpage>D596</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gks1219</pub-id><pub-id pub-id-type="pmid">23193283</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="book"><person-group person-group-type="author"><collab>R Core Team</collab></person-group> (<year>2017</year>). <source>R: A Language and Environment for Statistical Computing</source>. <publisher-loc>Vienna</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.R-project.org/">https://www.R-project.org/</ext-link>.</citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ryzhov</surname> <given-names>V.</given-names></name> <name><surname>Fenselau</surname> <given-names>C.</given-names></name></person-group> (<year>2001</year>). <article-title>Characterization of the protein subset desorbed by MALDI from whole bacterial cells</article-title>. <source>Anal. Chem.</source> <volume>73</volume>, <fpage>746</fpage>&#x02013;<lpage>750</lpage>. <pub-id pub-id-type="doi">10.1021/ac0008791</pub-id><pub-id pub-id-type="pmid">11248887</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Savitzky</surname> <given-names>A.</given-names></name> <name><surname>Golay</surname> <given-names>M. J. E.</given-names></name></person-group> (<year>1964</year>). <article-title>Smoothing and differentiation of data by simplified least squares procedures</article-title>. <source>Anal. Chem.</source> <volume>36</volume>, <fpage>1627</fpage>&#x02013;<lpage>1639</lpage>. <pub-id pub-id-type="doi">10.1021/ac60214a047</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sawana</surname> <given-names>A.</given-names></name> <name><surname>Adeolu</surname> <given-names>M.</given-names></name> <name><surname>Gupta</surname> <given-names>R. S.</given-names></name></person-group> (<year>2014</year>). <article-title>Molecular signatures and phylogenomic analysis of the genus <italic>Burkholderia</italic>: proposal for division of this genus into the emended genus <italic>Burkholderia</italic> containing pathogenic organisms and a new genus <italic>Paraburkholderia</italic> gen. nov. harboring environmental species</article-title>. <source>Front. Genet.</source> <volume>5</volume>:<fpage>429</fpage>. <pub-id pub-id-type="doi">10.3389/fgene.2014.00429</pub-id><pub-id pub-id-type="pmid">25566316</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schleifer</surname> <given-names>K. H.</given-names></name></person-group> (<year>2009</year>). <article-title>Classification of bacteria and archaea: past, present and future</article-title>. <source>Syst. Appl. Microbiol.</source> <volume>32</volume>, <fpage>533</fpage>&#x02013;<lpage>542</lpage>. <pub-id pub-id-type="doi">10.1016/j.syapm.2009.09.002</pub-id><pub-id pub-id-type="pmid">19819658</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Schmidt</surname> <given-names>D.</given-names></name></person-group> (<year>2016</year>). <source>Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package version 0.6-0</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=coop">https://cran.r-project.org/package=coop</ext-link>.</citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmitt</surname> <given-names>B. H.</given-names></name> <name><surname>Cunningham</surname> <given-names>S. A.</given-names></name> <name><surname>Dailey</surname> <given-names>A. L.</given-names></name> <name><surname>Gustafson</surname> <given-names>D. R.</given-names></name> <name><surname>Patel</surname> <given-names>R.</given-names></name></person-group> (<year>2013</year>). <article-title>Identification of anaerobic bacteria by Bruker Biotyper matrix-assisted laser desorption ionization-time of flight mass spectrometry with on-plate formic acid preparation</article-title>. <source>J. Clin. Microbiol.</source> <volume>51</volume>, <fpage>782</fpage>&#x02013;<lpage>786</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.02420-12</pub-id><pub-id pub-id-type="pmid">23254126</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schulthess</surname> <given-names>B.</given-names></name> <name><surname>Bloemberg</surname> <given-names>G. V.</given-names></name> <name><surname>Zbinden</surname> <given-names>A.</given-names></name> <name><surname>Mouttet</surname> <given-names>F.</given-names></name> <name><surname>Zbinden</surname> <given-names>R.</given-names></name> <name><surname>Bottger</surname> <given-names>E. C.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Evaluation of the Bruker MALDI biotyper for identification of fastidious gram-negative rods</article-title>. <source>J. Clin. Microbiol.</source> <volume>54</volume>, <fpage>543</fpage>&#x02013;<lpage>548</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.03107-15</pub-id><pub-id pub-id-type="pmid">26659214</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schulthess</surname> <given-names>B.</given-names></name> <name><surname>Brodner</surname> <given-names>K.</given-names></name> <name><surname>Bloemberg</surname> <given-names>G. V.</given-names></name> <name><surname>Zbinden</surname> <given-names>R.</given-names></name> <name><surname>Bottger</surname> <given-names>E. C.</given-names></name> <name><surname>Hombach</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>Identification of gram-positive cocci by use of matrix-assisted laser desorption ionization-time of flight mass spectrometry: comparison of different preparation methods and implementation of a practical algorithm for routine diagnostics</article-title>. <source>J. Clin. Microbiol.</source> <volume>51</volume>, <fpage>1834</fpage>&#x02013;<lpage>1840</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.02654-12</pub-id><pub-id pub-id-type="pmid">23554198</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seng</surname> <given-names>P.</given-names></name> <name><surname>Abat</surname> <given-names>C.</given-names></name> <name><surname>Rolain</surname> <given-names>J. M.</given-names></name> <name><surname>Colson</surname> <given-names>P.</given-names></name> <name><surname>Lagier</surname> <given-names>J.-C.</given-names></name> <name><surname>Gouriet</surname> <given-names>F.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Identification of rare pathogenic bacteria in a clinical microbiology laboratory: impact of MALDI-TOF mass spectrometry</article-title>. <source>J. Clin. Microbiol.</source> <volume>51</volume>, <fpage>2182</fpage>&#x02013;<lpage>2194</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.00492-13</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shin</surname> <given-names>H. B.</given-names></name> <name><surname>Yoon</surname> <given-names>J.</given-names></name> <name><surname>Lee</surname> <given-names>Y.</given-names></name> <name><surname>Kim</surname> <given-names>M. S.</given-names></name> <name><surname>Lee</surname> <given-names>K.</given-names></name></person-group> (<year>2015</year>). <article-title>Comparison of MALDI-TOF MS, housekeeping gene sequencing, and 16S rRNA gene sequencing for identification of <italic>Aeromonas</italic> clinical isolates</article-title>. <source>Yonsei Med. J.</source> <volume>56</volume>, <fpage>550</fpage>&#x02013;<lpage>555</lpage>. <pub-id pub-id-type="doi">10.3349/ymj.2015.56.2.550</pub-id><pub-id pub-id-type="pmid">25684008</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Spitaels</surname> <given-names>F.</given-names></name> <name><surname>Wieme</surname> <given-names>A. D.</given-names></name> <name><surname>Vandamme</surname> <given-names>P.</given-names></name></person-group> (<year>2016</year>). <article-title>MALDI-TOF MS as a Novel Tool for Dereplication and Characterization of Microbiota in Bacterial Diversity Studies</article-title>, in <source>Applications of Mass Spectrometry in Microbiology: From Strain Characterization to Rapid Screening for Antibiotic Resistance</source>, eds <person-group person-group-type="editor"><name><surname>Demirev</surname> <given-names>P.</given-names></name> <name><surname>Sandrin</surname> <given-names>T. R.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>235</fpage>&#x02013;<lpage>256</lpage>.</citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stackebrandt</surname> <given-names>E.</given-names></name> <name><surname>Goebel</surname> <given-names>B.</given-names></name></person-group> (<year>1994</year>). <article-title>Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>44</volume>, <fpage>846</fpage>&#x02013;<lpage>849</lpage>. <pub-id pub-id-type="doi">10.1099/00207713-44-4-846</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stein</surname> <given-names>S. E.</given-names></name> <name><surname>Scott</surname> <given-names>D. R.</given-names></name></person-group> (<year>1994</year>). <article-title>Optimization and testing of mass spectral library search algorithms for compound identification</article-title>. <source>J. Am. Soc. Mass Spectrom.</source> <volume>5</volume>, <fpage>859</fpage>&#x02013;<lpage>866</lpage>. <pub-id pub-id-type="doi">10.1016/1044-0305(94)87009-8</pub-id><pub-id pub-id-type="pmid">24222034</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Strimmer</surname> <given-names>K.</given-names></name></person-group> (<year>2015</year>). <source>crossval: Generic Functions for Cross Validation. R Package Version 1.0.3</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=crossval">https://CRAN.R-project.org/package=crossval</ext-link></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Suarez</surname> <given-names>S.</given-names></name> <name><surname>Ferroni</surname> <given-names>A.</given-names></name> <name><surname>Lotz</surname> <given-names>A.</given-names></name> <name><surname>Jolley</surname> <given-names>K. A.</given-names></name> <name><surname>Guerin</surname> <given-names>P.</given-names></name> <name><surname>Leto</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Ribosomal proteins as biomarkers for bacterial identification by mass spectrometry in the clinical microbiology laboratory</article-title>. <source>J. Microbiol. Methods</source> <volume>94</volume>, <fpage>390</fpage>&#x02013;<lpage>396</lpage>. <pub-id pub-id-type="doi">10.1016/j.mimet.2013.07.021</pub-id><pub-id pub-id-type="pmid">23916798</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Svobodova</surname> <given-names>B.</given-names></name> <name><surname>Vlach</surname> <given-names>J.</given-names></name> <name><surname>Junkova</surname> <given-names>P.</given-names></name> <name><surname>Karamonova</surname> <given-names>L.</given-names></name> <name><surname>Blazkova</surname> <given-names>M.</given-names></name> <name><surname>Fukal</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>Novel method for reliable identification of <italic>Siccibacter</italic> and <italic>Franconibacter</italic> strains: from &#x0201C;Pseudo-<italic>Cronobacter</italic>&#x0201D; to new <italic>Enterobacteriaceae</italic> genera</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>83</volume>:<fpage>e00234</fpage>&#x02013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.00234-17</pub-id><pub-id pub-id-type="pmid">28455327</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thompson</surname> <given-names>J. R.</given-names></name> <name><surname>Marcelino</surname> <given-names>L. A.</given-names></name> <name><surname>Polz</surname> <given-names>M. F.</given-names></name></person-group> (<year>2002</year>). <article-title>Heteroduplexes in mixed-template amplifications: formation, consequence and elimination by &#x00027;reconditioning PCR&#x00027;</article-title>. <source>Nucleic Acids Res.</source> <volume>30</volume>, <fpage>2083</fpage>&#x02013;<lpage>2088</lpage>. <pub-id pub-id-type="doi">10.1093/nar/30.9.2083</pub-id><pub-id pub-id-type="pmid">11972349</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tindall</surname> <given-names>B. J.</given-names></name> <name><surname>Rossello-Mora</surname> <given-names>R.</given-names></name> <name><surname>Busse</surname> <given-names>H. J.</given-names></name> <name><surname>Ludwig</surname> <given-names>W.</given-names></name> <name><surname>Kampfer</surname> <given-names>P.</given-names></name></person-group> (<year>2010</year>). <article-title>Notes on the characterization of prokaryote strains for taxonomic purposes</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>60</volume>, <fpage>249</fpage>&#x02013;<lpage>266</lpage>. <pub-id pub-id-type="doi">10.1099/ijs.0.016949-0</pub-id><pub-id pub-id-type="pmid">19700448</pub-id></citation></ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uhl&#x000ED;k</surname> <given-names>O.</given-names></name> <name><surname>Strej&#x0010D;ek</surname> <given-names>M.</given-names></name> <name><surname>Junkov&#x000E1;</surname> <given-names>P.</given-names></name> <name><surname>&#x00160;anda</surname> <given-names>M.</given-names></name> <name><surname>Hroudov&#x000E1;</surname> <given-names>M.</given-names></name> <name><surname>Vl&#x0010D;ek</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Matrix-assisted laser desorption ionization (MALDI)-time of flight mass spectrometry- and MALDI biotyper-based identification of cultured biphenyl-metabolizing bacteria from contaminated horseradish rhizosphere soil</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>77</volume>, <fpage>6858</fpage>&#x02013;<lpage>6866</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.05465-11</pub-id><pub-id pub-id-type="pmid">21821747</pub-id></citation></ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><collab>UniProt Consortium</collab></person-group> (<year>2017</year>). <article-title>UniProt: the universal protein knowledgebase</article-title>. <source>Nucleic Acids Res.</source> <volume>45</volume>, <fpage>D158</fpage>&#x02013;<lpage>D169</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkw1099</pub-id></citation></ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vancanneyt</surname> <given-names>M.</given-names></name> <name><surname>Witt</surname> <given-names>S.</given-names></name> <name><surname>Abraham</surname> <given-names>W.-R.</given-names></name> <name><surname>Kersters</surname> <given-names>K.</given-names></name> <name><surname>Fredrickson</surname> <given-names>H. L.</given-names></name></person-group> (<year>1996</year>). <article-title>Fatty acid content in whole-cell hydrolysates and phospholipid fractions of pseudomonads: a taxonomic evaluation</article-title>. <source>Syst. Appl. Microbiol.</source> <volume>19</volume>, <fpage>528</fpage>&#x02013;<lpage>540</lpage>. <pub-id pub-id-type="doi">10.1016/S0723-2020(96)80025-7</pub-id></citation></ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Versalovic</surname> <given-names>J.</given-names></name></person-group> (<year>1994</year>). <article-title>Genomic fingerprinting of bacteria using repetitive sequence-based polymerase chain reaction</article-title>. <source>Methods Mol. Cell. Biol.</source> <volume>5</volume>, <fpage>25</fpage>&#x02013;<lpage>40</lpage>.</citation></ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wald</surname> <given-names>J.</given-names></name> <name><surname>Hroudov&#x000E1;</surname> <given-names>M.</given-names></name> <name><surname>Jansa</surname> <given-names>J.</given-names></name> <name><surname>Vrchotov&#x000E1;</surname> <given-names>B.</given-names></name> <name><surname>Macek</surname> <given-names>T.</given-names></name> <name><surname>Uhl&#x000ED;k</surname> <given-names>O.</given-names></name></person-group> (<year>2015</year>). <article-title>Pseudomonads rule degradation of polyaromatic hydrocarbons in aerated sediment</article-title>. <source>Front. Microbiol.</source> <volume>6</volume>:<fpage>1268</fpage>. <pub-id pub-id-type="doi">10.3389/fmicb.2015.01268</pub-id><pub-id pub-id-type="pmid">26635740</pub-id></citation></ref>
<ref id="B73">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Garrity</surname> <given-names>G. M.</given-names></name> <name><surname>Tiedje</surname> <given-names>J. M.</given-names></name> <name><surname>Cole</surname> <given-names>J. R.</given-names></name></person-group> (<year>2007</year>). <article-title>Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>73</volume>, <fpage>5261</fpage>&#x02013;<lpage>5267</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.00062-07</pub-id><pub-id pub-id-type="pmid">17586664</pub-id></citation></ref>
<ref id="B74">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Westblade</surname> <given-names>L. F.</given-names></name> <name><surname>Garner</surname> <given-names>O. B.</given-names></name> <name><surname>Macdonald</surname> <given-names>K.</given-names></name> <name><surname>Bradford</surname> <given-names>C.</given-names></name> <name><surname>Pincus</surname> <given-names>D. H.</given-names></name> <name><surname>Mochon</surname> <given-names>A. B.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Assessment of reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry for bacterial and yeast identification</article-title>. <source>J. Clin. Microbiol.</source> <volume>53</volume>, <fpage>2349</fpage>&#x02013;<lpage>2352</lpage>. <pub-id pub-id-type="doi">10.1128/JCM.00187-15</pub-id><pub-id pub-id-type="pmid">25926486</pub-id></citation></ref>
<ref id="B75">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wieser</surname> <given-names>A.</given-names></name> <name><surname>Schneider</surname> <given-names>L.</given-names></name> <name><surname>Jung</surname> <given-names>J.</given-names></name> <name><surname>Schubert</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>MALDI-TOF MS in microbiological diagnostics-identification of microorganisms and beyond (mini review)</article-title>. <source>Appl. Microbiol. Biotechnol.</source> <volume>93</volume>, <fpage>965</fpage>&#x02013;<lpage>974</lpage>. <pub-id pub-id-type="doi">10.1007/s00253-011-3783-4</pub-id><pub-id pub-id-type="pmid">22198716</pub-id></citation></ref>
<ref id="B76">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Woese</surname> <given-names>C. R.</given-names></name></person-group> (<year>1987</year>). <article-title>Bacterial evolution</article-title>. <source>Microbiol. Rev.</source> <volume>51</volume>, <fpage>221</fpage>&#x02013;<lpage>271</lpage>. <pub-id pub-id-type="pmid">2439888</pub-id></citation></ref>
<ref id="B77">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yoon</surname> <given-names>S. H.</given-names></name> <name><surname>Ha</surname> <given-names>S. M.</given-names></name> <name><surname>Kwon</surname> <given-names>S.</given-names></name> <name><surname>Lim</surname> <given-names>J.</given-names></name> <name><surname>Kim</surname> <given-names>Y.</given-names></name> <name><surname>Seo</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies</article-title>. <source>Int. J. Syst. Evol. Microbiol.</source> <volume>67</volume>, <fpage>1613</fpage>&#x02013;<lpage>1617</lpage>. <pub-id pub-id-type="doi">10.1099/ijsem.0.001755</pub-id><pub-id pub-id-type="pmid">28005526</pub-id></citation></ref>
<ref id="B78">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yutin</surname> <given-names>N.</given-names></name> <name><surname>Puigbo</surname> <given-names>P.</given-names></name> <name><surname>Koonin</surname> <given-names>E. V.</given-names></name> <name><surname>Wolf</surname> <given-names>Y. I.</given-names></name></person-group> (<year>2012</year>). <article-title>Phylogenomics of prokaryotic ribosomal proteins</article-title>. <source>PLoS ONE</source> <volume>7</volume>:<fpage>e36972</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0036972</pub-id><pub-id pub-id-type="pmid">22615861</pub-id></citation></ref>
</ref-list>
</back>
</article>