<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Mol. Biosci.</journal-id>
<journal-title>Frontiers in Molecular Biosciences</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Mol. Biosci.</abbrev-journal-title>
<issn pub-type="epub">2296-889X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">752797</article-id>
<article-id pub-id-type="doi">10.3389/fmolb.2021.752797</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Molecular Biosciences</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Which Is the Best <italic>In Silico</italic> Program for the Missense Variations in IDUA Gene? A Comparison of 33 Programs Plus a Conservation Score and Evaluation of 586 Missense Variants</article-title>
<alt-title alt-title-type="left-running-head">Borges et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Predicting Missense Variations in IDUA</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Borges</surname>
<given-names>P&#xe2;mella</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pasqualim</surname>
<given-names>Gabriela</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1475981/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Matte</surname>
<given-names>Ursula</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1216987/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Cell, Tissue and Gene Laboratory, Clinicas Hospital of Porto Alegre (HCPA), <addr-line>Porto Alegre</addr-line>, <country>Brazil</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Bioinformatics Core, Experimental Research Centre, HCPA, <addr-line>Porto Alegre</addr-line>, <country>Brazil</country>
</aff>
<aff id="aff3">
<label>
<sup>3</sup>
</label>Graduate Programme in Genetics and Molecular Biology, Federal University of Rio Grande do Sul (UFRGS), <addr-line>Porto Alegre</addr-line>, <country>Brazil</country>
</aff>
<aff id="aff4">
<label>
<sup>4</sup>
</label>Genetics Laboratory, Biological Sciences Institute, Federal University of Rio Grande (FURG), <addr-line>Porto Alegre</addr-line>, <country>Brazil</country>
</aff>
<aff id="aff5">
<label>
<sup>5</sup>
</label>Department of Genetics, UFRGS, <addr-line>Porto Alegre</addr-line>, <country>Brazil</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/31488/overview">Grzegorz Wegrzyn</ext-link>, University of Gdansk, Poland</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1435826/overview">Reidar Andreson</ext-link>, University of Tartu, Estonia</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/209710/overview">Christiane Susanne Hampe</ext-link>, University of Washington, United&#x20;States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Ursula Matte, <email>umatte@hcpa.edu.br</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Molecular Diagnostics and Therapeutics, a section of the journal Frontiers in Molecular Biosciences</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>21</day>
<month>10</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>8</volume>
<elocation-id>752797</elocation-id>
<history>
<date date-type="received">
<day>03</day>
<month>08</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>09</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Borges, Pasqualim and Matte.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Borges, Pasqualim and Matte</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Mucopolysaccharidosis type I (MPS I) is an autosomal recessive disease characterized by the deficiency of alpha-L-iduronidase (<italic>IDUA</italic>), an enzyme involved in glycosaminoglycan degradation. More than 200&#x20;disease-causing variants have been reported and characterized in the <italic>IDUA</italic> gene. The gene also harbors several variants of unknown significance (VUS) and variants with conflicting interpretations of pathogenicity in the literature. This study evaluated 586 variants obtained from a literature review, five population databases, dbSNP, the Human Genome Mutation Database (HGMD), and ClinVar. For the variants described in the literature, two datasets were created based on the strength of the criteria. The stricter criteria subset had 108 variants with expression study, analysis of healthy controls, and/or complete gene sequence. The less stringent criteria subset had an additional 52 variants found in the literature review, HGMD or ClinVar, and dbSNP with an allele frequency higher than 0.001. The other 426 variants were considered VUS. The two strength criteria datasets were used to evaluate 33 programs plus a conservation score. BayesDel (addAF and noAF), PON-P2 (genome and protein), and ClinPred algorithms showed the best sensitivity, specificity, accuracy, and kappa value for both criteria subsets. The VUS were evaluated with these five algorithms. Based on the results, 122 variants had total consensus among the five predictors, with 57 classified as predicted deleterious and 65 as predicted neutral. For variants not included in PON-P2, 88 variants were considered deleterious and 92 neutral by all other predictors. The remaining 124 did not obtain a consensus among predictors.</p>
</abstract>
<kwd-group>
<kwd>mucopolysaccharidosis type I (MPS I)</kwd>
<kwd>missense variants</kwd>
<kwd>in silico predictions</kwd>
<kwd>VUS classifications</kwd>
<kwd>molecular diagnosis</kwd>
</kwd-group>
<contract-sponsor id="cn001">Associa&#xe7;&#xe3;o Fundo de Incentivo &#xe0; Pesquisa<named-content content-type="fundref-id">10.13039/501100006303</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">Conselho Nacional de Desenvolvimento Cient&#xed;fico e Tecnol&#xf3;gico<named-content content-type="fundref-id">10.13039/501100003593</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Mucopolysaccharidosis type I (MPS I) is an autosomal recessive disease characterized by the deficiency of alpha-L-iduronidase (<italic>IDUA</italic>) involved in glycosaminoglycan (GAG) degradation (<xref ref-type="bibr" rid="B46">Scott et&#x20;al., 1991</xref>). This deficiency leads to progressive lysosomal accumulation of heparan and dermatan sulfate and causes a gradual deterioration of cells and tissues that culminates in early death in severe cases (<xref ref-type="bibr" rid="B31">Lehman et&#x20;al., 2011</xref>). MPS I has a considerable phenotypic variation, with an extensive range of clinical manifestations and well-defined extreme phenotypes. Scheie syndrome (MPS I-S; OMIM&#x23; 607016) is the attenuated phenotype and includes somatic involvement, while Hurler syndrome (MPS I-H; OMIM&#x23; 607014) is the severe phenotype with important neurological impairment, among other features (<xref ref-type="bibr" rid="B28">Kubaski et&#x20;al., 2020</xref>). All phenotypes exhibit excessive GAG accumulation and excretion in urine and are indistinguishable by routine biochemical tests (<xref ref-type="bibr" rid="B31">Lehman et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B54">Viana et&#x20;al., 2011</xref>).</p>
<p>More than 200&#x20;disease-causing variants have been reported and characterized in the <italic>IDUA</italic> gene (<xref ref-type="bibr" rid="B5">Bertola et&#x20;al., 2011</xref>). In a 2019 study with data from the MPS I Registry, non-sense and missense variants corresponded, respectively, to 56.5 and 33.6% of the reported variants (<xref ref-type="bibr" rid="B57">Clarke et&#x20;al., 2019</xref>). Attenuated cases present at least one allele with residual activity, generally due to missense variants, regardless of the other alleles, and genotype&#x2013;phenotype correlation has been established for some missense pathogenic variants (<xref ref-type="bibr" rid="B18">Fuller et&#x20;al., 2005</xref>). Non-disease-causing missense variants, such as p.Arg105Gln, p.Gln63Pro (<xref ref-type="bibr" rid="B46">Scott et&#x20;al., 1991</xref>), p.His33Gln (<xref ref-type="bibr" rid="B47">Scott et&#x20;al., 1992</xref>), and p.Ala361Thr (<xref ref-type="bibr" rid="B11">Clarke and Scott, 1993</xref>), have also been described in the literature.</p>
<p>The broader use of massive parallel genetic sequencing increased the list of variants of unknown significance (VUS). Functional molecular assessments have not kept pace with the detection of new genetic variants. Most variants present in the Exome Aggregation Consortium (ExAC) and Genome Aggregation Database (gnomAD) (<xref ref-type="bibr" rid="B32">Lek et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B26">Karczewski et&#x20;al., 2020</xref>) have not yet been described or evaluated. Therefore, research and clinical laboratories use <italic>in silico</italic> strategies to help understand the biological significance of VUS. These methods are already considered in ACMG standard guidelines (<xref ref-type="bibr" rid="B43">Richards et&#x20;al., 2015</xref>) to indicate some evidence level when clinical information is insufficient or non-existent. Clinical laboratories have also created their own guideline on variant interpretation, named Sherloc (semi-quantitative, hierarchical evidence-based rules for locus interpretation) (<xref ref-type="bibr" rid="B39">Nykamp et&#x20;al., 2017</xref>).</p>
<p>Even though computational analysis is often used, results must be viewed with caution. Not only do different programs have discordant results for the same gene, but algorithms may also have different values of accuracy, specificity, and sensitivity depending on the characteristics of the gene or protein. Therefore, ideally, a performance assessment should be performed for each gene/protein to choose the best algorithm for variant prioritization. However, this also needs reliable standards as calibrators&#x2014;literature and curated databases also show divergence.</p>
<p>
<italic>This study aims to compare in&#x20;silico predictors using two datasets of variants with different degrees of confidence.</italic> Using the best predictors indicated by these two datasets, we evaluated the VUS present in the <italic>IDUA</italic> gene in population databases.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2-1">
<title>Curated Variant Selection</title>
<p>We created a database with missense variants described in the literature, in curated databases, and in population databases with frequencies greater than 0.001. Evaluating predictors requires a sufficient number of benign and pathogenic variants, which can only be obtained through a comprehensive review of the literature; therefore, we opted for a single-gene study. We performed a manual review of all missense variants in the <italic>IDUA</italic> gene published between 1991 and 2019. According to the variant classification methods in each manuscript, variants from the literature were divided into two subsets (strong or weak evidence). Evidence was considered strong if at least one of the following was performed: expression study, evaluation of healthy controls, or complete gene sequencing corroborating the disease-causing or non-disease-causing status of the variant. The subset of variants with weak criteria comprised all variants in the strong subset plus the remaining missense variants described in the literature, variants from the HGMD (<xref ref-type="bibr" rid="B50">Stenson et&#x20;al., 2014</xref>) and ClinVar (with their classifications) (<xref ref-type="bibr" rid="B30">Landrum et&#x20;al., 2014</xref>), and variants in population databases with allele frequencies greater than 0.001. These two subsets were selected to evaluate the prediction programs&#x2019; characteristics and to compare the correlation between variants&#x2019; predictions and literature information. Variants that did not meet any of these criteria were considered&#x20;VUS.</p>
</sec>
<sec id="s2-2">
<title>
<italic>In Silico</italic> Programs</title>
<p>We analyzed 33 prediction algorithms and one conservation score. For better comparison, all available training sets for each program were evaluated separately. We obtained prediction for SIFT (protein data training) (<xref ref-type="bibr" rid="B29">Kumar et&#x20;al., 2009</xref>), SIFT4G (<xref ref-type="bibr" rid="B53">Vaser et&#x20;al., 2016</xref>), PolyPhen2 (HDIV and HVAR) (<xref ref-type="bibr" rid="B1">Adzhubei et&#x20;al., 2013</xref>), LRT (<xref ref-type="bibr" rid="B10">Chun and Fay, 2009</xref>), MutationTaster2 (<xref ref-type="bibr" rid="B45">Schwarz et&#x20;al., 2010</xref>), MutationAssessor (<xref ref-type="bibr" rid="B42">Reva et&#x20;al., 2007</xref>), FATHMM (Coding Variants-Weighted, MKL coding, and XF coding) (<xref ref-type="bibr" rid="B49">Shihab et&#x20;al., 2013</xref>), MetaSVM/LR (<xref ref-type="bibr" rid="B13">Dong et&#x20;al., 2015</xref>), CADD (GRCh37/hg19 and GRCh38/hg38) (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>), VEST4 (<xref ref-type="bibr" rid="B8">Carter et&#x20;al., 2013</xref>), PROVEAN (protein data training) (<xref ref-type="bibr" rid="B9">Choi et&#x20;al., 2012</xref>), fitCons x4 (<xref ref-type="bibr" rid="B20">Gulko et&#x20;al., 2015</xref>), LINSIGHT (<xref ref-type="bibr" rid="B22">Huang et&#x20;al., 2017</xref>), M-CAP (<xref ref-type="bibr" rid="B25">Jagadeesh et&#x20;al., 2016</xref>), REVEL (<xref ref-type="bibr" rid="B24">Ioannidis et&#x20;al., 2016</xref>), MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>), PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>), BayesDel (addAF and noAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>), ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>), and LIST-S2 (<xref ref-type="bibr" rid="B35">Malhis et&#x20;al., 2020</xref>) prediction algorithms. 
We also tested the GERP&#x2b;&#x2b; conservation score (<xref ref-type="bibr" rid="B12">Davydov et&#x20;al., 2010</xref>) from dbNSFP v4.1a, a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome (<xref ref-type="bibr" rid="B34">Liu et&#x20;al., 2020</xref>).</p>
<p>The predictions of PhD-SNP (<xref ref-type="bibr" rid="B6">Capriotti et&#x20;al., 2006</xref>), PANTHER (<xref ref-type="bibr" rid="B52">Thomas et&#x20;al., 2003</xref>), SNPs&#x26;GO (<xref ref-type="bibr" rid="B7">Capriotti et&#x20;al., 2013</xref>), PredictSNP (<xref ref-type="bibr" rid="B4">Bendl et&#x20;al., 2016</xref>), CADD 1.2 (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>), DANN (<xref ref-type="bibr" rid="B41">Quang et&#x20;al., 2015</xref>), FATHMM (Coding Variants-Unweighted) (<xref ref-type="bibr" rid="B49">Shihab et&#x20;al., 2013</xref>), FunSeq2 (<xref ref-type="bibr" rid="B17">Fu et&#x20;al., 2014</xref>), GWAVA 1.0 (<xref ref-type="bibr" rid="B44">Ritchie et&#x20;al., 2014</xref>), SuSPect (<xref ref-type="bibr" rid="B55">Yates et&#x20;al., 2014</xref>), PMUT (<xref ref-type="bibr" rid="B15">Ferrer-Costa et&#x20;al., 2005</xref>), CONDEL (<xref ref-type="bibr" rid="B19">Gonz&#xe1;lez-P&#xe9;rez and L&#xf3;pez-Bigas, 2011</xref>), PROVEAN (genome data training) (<xref ref-type="bibr" rid="B9">Choi et&#x20;al., 2012</xref>), SIFT (genome data training) (<xref ref-type="bibr" rid="B29">Kumar et&#x20;al., 2009</xref>), PON-P2 (identifier, protein, and genome data training) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), and MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>) were obtained from their web-based applications. The variant classifiers were used when provided by the algorithm. 
The scores of VEST4 (<xref ref-type="bibr" rid="B8">Carter et&#x20;al., 2013</xref>), REVEL (<xref ref-type="bibr" rid="B24">Ioannidis et&#x20;al., 2016</xref>), MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>), CADD_raw, CADD_phred (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>), integrated_fitCons (<xref ref-type="bibr" rid="B20">Gulko et&#x20;al., 2015</xref>), SuSPect (<xref ref-type="bibr" rid="B55">Yates et&#x20;al., 2014</xref>), and GERP&#x2b;&#x2b;_NR (<xref ref-type="bibr" rid="B12">Davydov et&#x20;al., 2010</xref>) were converted into binary classifications. A cutoff of 0.5 was applied for SuSPect (<xref ref-type="bibr" rid="B55">Yates et&#x20;al., 2014</xref>) and VEST4 (<xref ref-type="bibr" rid="B8">Carter et&#x20;al., 2013</xref>), 0.75 for MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>) and REVEL (<xref ref-type="bibr" rid="B24">Ioannidis et&#x20;al., 2016</xref>), 20 for CADD_phred, zero for CADD_raw (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>), 0.4 for fitCons x4 (<xref ref-type="bibr" rid="B20">Gulko et&#x20;al., 2015</xref>), and 0.047 for GERP&#x2b;&#x2b; (<xref ref-type="bibr" rid="B12">Davydov et&#x20;al., 2010</xref>), as suggested by the authors.</p>
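<p>The thresholding step described above can be illustrated with a minimal sketch. The cutoffs are those listed in the text; the direction of the comparison (a score at or above the cutoff is called deleterious), the function name, and the example scores are assumptions made for illustration and are not taken from the study.</p>
<preformat><![CDATA[
```python
# Cutoffs as listed in the text. The direction (score >= cutoff means
# "deleterious") and the handling of ties are assumptions of this sketch.
CUTOFFS = {
    "SuSPect": 0.5,
    "VEST4": 0.5,
    "MutPred": 0.75,
    "REVEL": 0.75,
    "CADD_phred": 20.0,
    "CADD_raw": 0.0,
    "fitCons": 0.4,
    "GERP++": 0.047,
}


def binarize(predictor, score):
    """Return 'deleterious', 'neutral', or None when no score is available."""
    if score is None:
        return None  # unclassified variant: no call is made
    return "deleterious" if score >= CUTOFFS[predictor] else "neutral"


# Made-up scores for a single hypothetical variant.
example_scores = {"REVEL": 0.81, "CADD_phred": 12.3, "MutPred": None}
example_calls = {name: binarize(name, s) for name, s in example_scores.items()}
print(example_calls)
```
]]></preformat>
<p>Returning no call for a missing score keeps unclassified variants distinct from neutral ones, so that they can be counted separately.</p>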
</sec>
<sec id="s2-3">
<title>Variants of Unknown Significance</title>
<p>All missense variants in the canonical <italic>IDUA</italic> sequence present in ExAC v0.3.1 (<xref ref-type="bibr" rid="B32">Lek et&#x20;al., 2016</xref>), gnomAD v2.0.2 (<xref ref-type="bibr" rid="B26">Karczewski et&#x20;al., 2020</xref>), ABraOM (<xref ref-type="bibr" rid="B36">Naslavsky et&#x20;al., 2017</xref>), LOVD (<xref ref-type="bibr" rid="B16">Fokkema et&#x20;al., 2011</xref>), 1000 Genomes (<xref ref-type="bibr" rid="B3">1000 Genomes Project Consortium et&#x20;al., 2015</xref>), and dbSNP (<xref ref-type="bibr" rid="B48">Sherry et&#x20;al., 2001</xref>) with frequencies less than 0.0001, plus variants in the Human Genome Mutation Database (HGMD) (<xref ref-type="bibr" rid="B50">Stenson et&#x20;al., 2014</xref>) and ClinVar (<xref ref-type="bibr" rid="B30">Landrum et&#x20;al., 2014</xref>), were considered VUS. These variants were merged into a single database to remove duplicates and to exclude those already included in the datasets used to compare the algorithms.</p>
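<p>The merging step can be sketched as follows. This is an illustration only: the function name and the variant identifiers are hypothetical placeholders, and the frequency filter is assumed to have been applied before merging.</p>
<preformat><![CDATA[
```python
def merge_vus(sources, curated):
    """Merge per-database variant lists into one deduplicated table.

    sources: dict mapping a database name to an iterable of variant identifiers
    curated: set of identifiers already used in the evaluation datasets
    Returns a dict mapping each remaining variant to the sorted list of
    databases that reported it.
    """
    merged = {}
    for database, variants in sources.items():
        for variant in variants:
            if variant in curated:
                continue  # already part of the strong/weak criteria subsets
            merged.setdefault(variant, set()).add(database)
    return {variant: sorted(dbs) for variant, dbs in merged.items()}


# Placeholder identifiers, not real IDUA variants.
sources = {
    "gnomAD": ["variant_1", "variant_2", "variant_3"],
    "ExAC": ["variant_2", "variant_3"],
    "dbSNP": ["variant_3", "variant_4"],
}
vus = merge_vus(sources, curated={"variant_3"})

# Variants reported by a single database ("exclusive" variants).
exclusive = {v: dbs[0] for v, dbs in vus.items() if len(dbs) == 1}
print(vus)
print(exclusive)
```
]]></preformat>
<p>Keeping the list of reporting databases per variant makes it straightforward to count how many variants are exclusive to each source.</p>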
</sec>
<sec id="s2-4">
<title>Statistical Analysis</title>
<p>The statistical analysis was performed using SPSS (Statistical Package for the Social Sciences) and Python scripts. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, true-positive rate (TPR), false-positive rate (FPR), and Fisher&#x2019;s exact test were calculated in Python with the libraries matplotlib.pyplot (<xref ref-type="bibr" rid="B23">Hunter, 2007</xref>), sklearn.metrics (<xref ref-type="bibr" rid="B40">Pedregosa et&#x20;al., 2011</xref>), pandas (<xref ref-type="bibr" rid="B56">Zenodo, 2020</xref>), and NumPy (<xref ref-type="bibr" rid="B21">Harris et&#x20;al., 2020</xref>). The kappa value was generated with SPSS&#x20;18.03.</p>
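<p>For reference, the performance measures listed above can be written out from a 2&#xd7;2 confusion matrix. The sketch below uses plain Python rather than the sklearn.metrics and SPSS routines employed in the study; the function name and the example labels are hypothetical, and unclassified variants are assumed to be excluded from the counts.</p>
<preformat><![CDATA[
```python
def evaluate(truth, predicted):
    """Compute binary performance measures for one predictor.

    truth and predicted hold 'pathogenic' or 'benign'; predicted may hold
    None for unclassified variants, which are left out of the counts.
    A measure whose denominator is zero is returned as None.
    """
    pairs = [(t, p) for t, p in zip(truth, predicted) if p is not None]
    tp = sum(t == "pathogenic" and p == "pathogenic" for t, p in pairs)
    tn = sum(t == "benign" and p == "benign" for t, p in pairs)
    fp = sum(t == "benign" and p == "pathogenic" for t, p in pairs)
    fn = sum(t == "pathogenic" and p == "benign" for t, p in pairs)
    n = tp + tn + fp + fn

    def ratio(num, den):
        return num / den if den else None

    accuracy = ratio(tp + tn, n)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = None
    if accuracy is not None and expected is not None and expected != 1:
        kappa = (accuracy - expected) / (1 - expected)

    return {
        "sensitivity": ratio(tp, tp + fn),  # also the true-positive rate
        "specificity": ratio(tn, tn + fp),
        "fpr": ratio(fp, fp + tn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": accuracy,
        "kappa": kappa,
        "n_unclassified": len(truth) - len(pairs),
    }


# Made-up labels for six hypothetical variants.
truth = ["pathogenic"] * 4 + ["benign"] * 2
calls = ["pathogenic", "pathogenic", "pathogenic", "benign", "benign", "pathogenic"]
metrics = evaluate(truth, calls)
print(metrics)
```
]]></preformat>
<p>Returning no value when a denominator is zero mirrors the situation in which a predictor assigns every variant to a single class, so that the PPV or the NPV is undefined.</p>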
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p>A total of 586 unique variants, obtained according to the workflow presented in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>, were analyzed in this study. Each database&#x2019;s contribution can be seen in <xref ref-type="sec" rid="s11">Supplementary Figure S1</xref>. dbSNP (<xref ref-type="bibr" rid="B48">Sherry et&#x20;al., 2001</xref>) and gnomAD v2.0.2 (<xref ref-type="bibr" rid="B26">Karczewski et&#x20;al., 2020</xref>) had the largest numbers of variants, 363 and 316, respectively, with 83 and 86 exclusive ones. ExAC v0.3.1 (<xref ref-type="bibr" rid="B32">Lek et&#x20;al., 2016</xref>) contributed 266 variants, with only six exclusive ones. LOVD (<xref ref-type="bibr" rid="B16">Fokkema et&#x20;al., 2011</xref>) presented 44 variants, with three exclusive ones, whereas HGMD (<xref ref-type="bibr" rid="B50">Stenson et&#x20;al., 2014</xref>) and ClinVar (<xref ref-type="bibr" rid="B30">Landrum et&#x20;al., 2014</xref>) contributed 3 and 19 exclusive variants, respectively, from a total of 136 and 131. ABraOM (<xref ref-type="bibr" rid="B36">Naslavsky et&#x20;al., 2017</xref>) and 1000 Genomes (<xref ref-type="bibr" rid="B3">1000 Genomes Project Consortium et&#x20;al., 2015</xref>) presented 19 and 47 variants, respectively, but none was exclusive.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Workflow chart showing variant retrieval and curation.</p>
</caption>
<graphic xlink:href="fmolb-08-752797-g001.tif"/>
</fig>
<p>First, 145 variants manually retrieved from the literature were combined with variants in curated databases and population databases with frequencies higher than 0.001. This formed a set of 160 unique variants used to compare the algorithms. Another 426 variants were obtained from population databases and considered&#x20;VUS.</p>
<p>According to the type of evidence used for their description, variants in the first set of 160 were divided into two subgroups. Out of the 145 variants from the literature, 108 had at least one of three measures that were considered strong evidence criteria (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). In this group of variants with strong evidence, 91 were disease-causing, and of these, 19 variants did not have expression studies, 48 variants were not analyzed in healthy controls, and 50 variants were not described in studies with complete gene sequencing (<xref ref-type="sec" rid="s11">Supplementary Table S1</xref>). Of the 17 non-disease-causing variants in the group with strong evidence, only five were not analyzed by expression studies (<xref ref-type="sec" rid="s11">Supplementary Table&#x20;S2</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Percentage of disease-causing and non-disease-causing variants in each evidence criterion: variants with expression study <bold>(A)</bold>, comparison with normal controls <bold>(B)</bold>, complete gene sequencing <bold>(C)</bold>, and absence of strong evidence <bold>(D)</bold>.</p>
</caption>
<graphic xlink:href="fmolb-08-752797-g002.tif"/>
</fig>
<p>The 160 variants (26 benign and 134 pathogenic) in the weak criteria subset and 108 variants (17 benign and 91 pathogenic) in the strong criteria subset were used for evaluating 33 prediction algorithms plus one conservation score. As one program may present more than one training dataset, a total of 51 estimates were obtained. SIFT (<xref ref-type="bibr" rid="B29">Kumar et&#x20;al., 2009</xref>), PROVEAN (<xref ref-type="bibr" rid="B9">Choi et&#x20;al., 2012</xref>), PolyPhen2 (<xref ref-type="bibr" rid="B1">Adzhubei et&#x20;al., 2013</xref>), BayesDel (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>), CADD (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>), FATHMM (<xref ref-type="bibr" rid="B49">Shihab et&#x20;al., 2013</xref>), fitCons (<xref ref-type="bibr" rid="B20">Gulko et&#x20;al., 2015</xref>), MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>), and PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) were evaluated for every available training&#x20;set.</p>
<p>For the strong criteria subset, only BayesDel (addAF and noAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>), PON-P2 (genome, protein, and identifier) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), and ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>) presented accuracy higher than 90% and kappa value higher than 0.6, with PON-P2 (genome database) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>), and BayesDel (addAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>) being the ones with the best relation between sensitivity and specificity and higher kappa values (0.692, 0.719, and 0.821) (<xref ref-type="sec" rid="s11">Supplementary Table S3</xref>). One PPV could not be calculated because FunSeq2 (<xref ref-type="bibr" rid="B17">Fu et&#x20;al., 2014</xref>) classified all variants as benign. Three algorithms (integrated_fitCons, GM12878_fitCons (<xref ref-type="bibr" rid="B20">Gulko et&#x20;al., 2015</xref>), and M-CAP (<xref ref-type="bibr" rid="B25">Jagadeesh et&#x20;al., 2016</xref>)) classified all variants as pathogenic and did not present an NPV. The kappa value also could not be calculated for these four predictors.</p>
<p>The lowest sensitivities (between 0 and 0.3) were observed for the PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>) and SuSPect (<xref ref-type="bibr" rid="B55">Yates et&#x20;al., 2014</xref>) predictors. Excluding predictors that have maximum sensitivity and minimal specificity, the algorithms PolyPhen2 (HDIV) (<xref ref-type="bibr" rid="B1">Adzhubei et&#x20;al., 2013</xref>), MutationTaster2 (<xref ref-type="bibr" rid="B45">Schwarz et&#x20;al., 2010</xref>), MutationAssessor (<xref ref-type="bibr" rid="B42">Reva et&#x20;al., 2007</xref>), VEST4 (<xref ref-type="bibr" rid="B8">Carter et&#x20;al., 2013</xref>), BayesDel (addAF and noAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>), ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>), CADD (raw_hg38, phred_hg38, raw_hg19, phred_hg19) (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>), FATHMM (Coding Variants-Weighted) (<xref ref-type="bibr" rid="B49">Shihab et&#x20;al., 2013</xref>), H1hESC_fitCons (<xref ref-type="bibr" rid="B20">Gulko et&#x20;al., 2015</xref>), GERP&#x2b;&#x2b; (<xref ref-type="bibr" rid="B12">Davydov et&#x20;al., 2010</xref>), CONDEL (<xref ref-type="bibr" rid="B19">Gonz&#xe1;lez-P&#xe9;rez and L&#xf3;pez-Bigas, 2011</xref>), and PON-P2 (identifier, protein, and genome) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) presented high sensitivity (over 90%). Excluding FunSeq2 (<xref ref-type="bibr" rid="B17">Fu et&#x20;al., 2014</xref>), only SNPs&#x26;GO (<xref ref-type="bibr" rid="B7">Capriotti et&#x20;al., 2013</xref>) had specificity higher than 90%, and 14 algorithms had specificity between 80 and 90% (<xref ref-type="sec" rid="s11">Supplementary Table&#x20;S3</xref>).</p>
<p>The weak criteria subset showed similar patterns to the strong criteria subset despite obtaining a general reduction in the calculated values, except for the PON-P2 (identifier) algorithm (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) that showed an increased sensitivity. The same four algorithms classified all variants as only benign or pathogenic. In this subset, no algorithm had specificity higher than 90%, and nine algorithms had specificity between 80 and 90%, including PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>) and SNPs&#x26;GO (<xref ref-type="bibr" rid="B7">Capriotti et&#x20;al., 2013</xref>) (<xref ref-type="sec" rid="s11">Supplementary Figures S2A,B</xref>). In this subset, PON-P2 (genome database) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>), and BayesDel (addAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>) obtained accuracy higher than 90% (0.92, 0.91, and 0.93) and kappa values higher than 0.6 (0.666, 0.680, and 0.743) (<xref ref-type="fig" rid="F3">Figures 3A,B</xref>). All sensitivity, specificity, accuracy, PPV, NPV, FPR, and kappa values are displayed in <xref ref-type="sec" rid="s11">Supplementary Tables S3,4</xref> for the strong and weak criteria subsets.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Sensitivity and specificity <bold>(A)</bold> and accuracy and kappa value <bold>(B)</bold> of the top five classifiers in blue (BayesDel-addAF, PON-P2-genome, ClinPred, PON-P2-protein, and BayesDel-noAF algorithms) and the top six cited in yellow (SIFT, CADD, MutationTaster2, PANTHER, PolyPhen2, and PROVEAN) for the less stringent criteria subset.</p>
</caption>
<graphic xlink:href="fmolb-08-752797-g003.tif"/>
</fig>
<p>Fisher&#x2019;s exact test was performed to test if weak and strong subsets present statistical differences in predictors&#x2019; performance. The ratio of hits and errors for each program was compared between weak and strong subsets, and none presented statistically significant values (<xref ref-type="fig" rid="F4">Figure&#x20;4A</xref>). When we compared the same subset estimates, both subsets had the same pattern with several <italic>p</italic>-values lower than 0.05, as shown in <xref ref-type="fig" rid="F4">Figure&#x20;4B</xref> for the weak criteria subset.</p>
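<p>As an illustration of this comparison, a two-sided Fisher&#x2019;s exact test on a 2&#xd7;2 table of hits and errors can be written in plain Python. This is a sketch following the usual two-sided definition (summing the probabilities of all tables no more likely than the observed one), not the code used in the study; the function name and the counts in the example are made up.</p>
<preformat><![CDATA[
```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    For example, a and b can be the hits and errors of one predictor in the
    strong subset, and c and d its hits and errors in the weak subset.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small relative tolerance guards against rounding error.
    p_value = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(p_value, 1.0)


# Textbook-style check on a small table.
p_small = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_small, 4))  # 0.4857

# Made-up hits/errors for one hypothetical predictor in the two subsets.
p_example = fisher_exact_two_sided(100, 8, 146, 14)
print(p_example)
```
]]></preformat>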
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>
<italic>p</italic>-Value of Fisher&#x2019;s exact test comparing less stringent criteria and more stringent criteria <bold>(A)</bold> and the 51 estimates in a less stringent subset <bold>(B)</bold>.</p>
</caption>
<graphic xlink:href="fmolb-08-752797-g004.tif"/>
</fig>
<p>Not all 51 estimates were obtained for all 160 variants. MutationAssessor (<xref ref-type="bibr" rid="B42">Reva et&#x20;al., 2007</xref>), LRT (<xref ref-type="bibr" rid="B10">Chun and Fay, 2009</xref>), PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>), PANTHER (<xref ref-type="bibr" rid="B52">Thomas et&#x20;al., 2003</xref>), GWAVA (<xref ref-type="bibr" rid="B44">Ritchie et&#x20;al., 2014</xref>), PMUT (<xref ref-type="bibr" rid="B15">Ferrer-Costa et&#x20;al., 2005</xref>), M-CAP (<xref ref-type="bibr" rid="B25">Jagadeesh et&#x20;al., 2016</xref>), MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>), and all three PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) algorithms did not return a predicted classification for some variants (<xref ref-type="fig" rid="F5">Figure&#x20;5</xref>). The three PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) training sets left the most variants unclassified, followed by MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>) and predictions obtained from dbNSFP (<xref ref-type="bibr" rid="B34">Liu et&#x20;al., 2020</xref>). The algorithms LRT (<xref ref-type="bibr" rid="B10">Chun and Fay, 2009</xref>) (2), MutationAssessor (<xref ref-type="bibr" rid="B42">Reva et&#x20;al., 2007</xref>) (3), and PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>) (3) failed to classify variants in the first amino acid (MutationAssessor (<xref ref-type="bibr" rid="B42">Reva et&#x20;al., 2007</xref>) and PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>)) or at the end of the protein (LRT (<xref ref-type="bibr" rid="B10">Chun and Fay, 2009</xref>)).</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Number of unclassified variants per software for the less stringent criteria subset.</p>
</caption>
<graphic xlink:href="fmolb-08-752797-g005.tif"/>
</fig>
<p>For the strong criteria subset, the variants left unclassified by each program were predominantly pathogenic, except in the case of M-CAP (<xref ref-type="bibr" rid="B25">Jagadeesh et&#x20;al., 2016</xref>). MutationAssessor (<xref ref-type="bibr" rid="B42">Reva et&#x20;al., 2007</xref>), PrimateAI (<xref ref-type="bibr" rid="B51">Sundaram et&#x20;al., 2019</xref>), and PANTHER (<xref ref-type="bibr" rid="B52">Thomas et&#x20;al., 2003</xref>) presented the fewest unclassified variants, all of which are pathogenic. MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>) and dbNSFP (<xref ref-type="bibr" rid="B34">Liu et&#x20;al., 2020</xref>) produced a larger number of unclassified variants, both benign and pathogenic. For the weak criteria subset, the number of variants left unclassified by MutPred (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2009</xref>) and dbNSFP (<xref ref-type="bibr" rid="B34">Liu et&#x20;al., 2020</xref>) increased, exceeding that of the other programs (<xref ref-type="fig" rid="F4">Figure&#x20;4</xref>). LRT (<xref ref-type="bibr" rid="B10">Chun and Fay, 2009</xref>) and PMUT (<xref ref-type="bibr" rid="B15">Ferrer-Costa et&#x20;al., 2005</xref>) had one benign and one pathogenic unclassified variant, respectively, in this subset. M-CAP (<xref ref-type="bibr" rid="B25">Jagadeesh et&#x20;al., 2016</xref>) continued to leave more benign (8) than pathogenic (2) variants unclassified.</p>
<sec id="s3-1">
<title>
<italic>In Silico</italic> VUS Classification</title>
<p>Based on their performance in both evaluation subsets, the five best predictors were used to classify the 426 VUS: BayesDel (addAF and noAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>), PON-P2 (genome and protein) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), and ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>). PON-P2 (genome and protein) (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) are the only two of these five predictors that do not classify every variant: both failed to classify the same 267 variants, with six further unclassified variants exclusive to PON-P2-genome (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) and another six exclusive to PON-P2-protein (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>). Out of the 426 variants, 57 obtained a total consensus of the five programs as pathogenic and 65 as benign. Among the variants not classified by PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), 88 were considered pathogenic and 92 benign by all other predictors. The remaining 124 did not obtain a consensus among predictors (<xref ref-type="fig" rid="F6">Figure&#x20;6</xref>).</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>VUS classification by the five best-performing predictors.</p>
</caption>
<graphic xlink:href="fmolb-08-752797-g006.tif"/>
</fig>
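<p>The counts reported above can be reconciled with a short calculation. The sketch below assumes that the variants &#x201c;not included in PON-P2&#x201d; are those lacking at least one of the two PON-P2 predictions; this reading is an assumption, as is the hypothetical <italic>consensus</italic> helper, which merely illustrates one way of calling a consensus while ignoring missing predictions.</p>
<preformat>
```python
TOTAL_VUS = 426

# Counts reported in the text.
unclassified_by_both_pon_p2 = 267
unclassified_only_genome = 6
unclassified_only_protein = 6
consensus_five = {"pathogenic": 57, "benign": 65}
consensus_without_pon_p2 = {"pathogenic": 88, "benign": 92}
no_consensus_reported = 124

# Variants lacking at least one PON-P2 prediction (assumed interpretation).
missing_pon_p2 = (unclassified_by_both_pon_p2
                  + unclassified_only_genome
                  + unclassified_only_protein)
# Variants scored by all five predictors.
scored_by_all = TOTAL_VUS - missing_pon_p2

n_consensus_five = sum(consensus_five.values())
n_consensus_rest = sum(consensus_without_pon_p2.values())
no_consensus = TOTAL_VUS - n_consensus_five - n_consensus_rest

print(missing_pon_p2, scored_by_all, n_consensus_five, n_consensus_rest, no_consensus)
# 279 147 122 180 124


def consensus(calls):
    """Hypothetical helper: shared label if all available calls agree, else None."""
    available = {call for call in calls if call is not None}
    return available.pop() if len(available) == 1 else None


print(consensus(["pathogenic", "pathogenic", None, None, "pathogenic"]))  # pathogenic
print(consensus(["pathogenic", "benign", None, None, "pathogenic"]))      # None
```
</preformat>
<p>Under this reading, the 124 variants without consensus would split into 25 scored by all five predictors (147 minus 122) and 99 lacking a PON-P2 prediction (279 minus 180); this split is inferred from the reported totals and is not stated in the text.</p>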
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>In this study, we evaluated the predictions of 33 programs plus a conservation score for missense variants in the <italic>IDUA</italic> gene. Two datasets were created based on literature information and public databases. The first dataset was used to identify the best-performing predictors for missense <italic>IDUA</italic> variants. The second dataset comprised 426 VUS that were evaluated by the five best-performing algorithms. The first dataset was divided into two subsets according to the stringency of the inclusion criteria: the strong criteria subset contained variants with specific literature information, and the weak criteria subset contained all variants from the literature review plus database variants with a variant classification and high allele frequency. The latter variants were included to increase the number of non-disease-causing variants in the curated dataset.</p>
<p>The subsets did not demonstrate a notable difference, although the weak criteria subset presented lower overall values. The lower values may be explained by the lower classification confidence of the weak criteria subset. While the strong criteria subset represents a supervised subset and includes variants with a high confidence level of categorization, the weak and more flexible subset may contain incorrect classifications. The small size of the difference may be due to the relatively small number of variants introduced in the weak subset (52 added to the 108 in the strong subset).</p>
<p>Despite that, both comparison groups present the same predictors with the most satisfactory performances. BayesDel (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>), the best-performing predictor, is a meta-score that combines deleteriousness predictors in a na&#xef;ve Bayes approach and uses ClinVar (<xref ref-type="bibr" rid="B30">Landrum et&#x20;al., 2014</xref>) variants as a standard to determine the cutoff value. For this predictor, the version that integrates the maximum minor allele frequency across populations (addAF) presents superior performance to that without allele frequencies (noAF) (<xref ref-type="bibr" rid="B14">Feng, 2017</xref>). ClinPred (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>) had the second-highest kappa value; it uses ClinVar (<xref ref-type="bibr" rid="B30">Landrum et&#x20;al., 2014</xref>) as a training dataset and combines two machine learning algorithms: random forest (cforest) and gradient boosted decision tree (xgboost) models (<xref ref-type="bibr" rid="B2">Alirezaie et&#x20;al., 2018</xref>). PON-P2 uses variation data from VariBench to train a random forest predictor, with feature selection, of the pathogenicity of amino acid substitutions and accepts variations in multiple formats (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>). The protein format is the most responsive, despite presenting a more modest performance than the genome format.</p>
<p>Classic and often used predictors such as SIFT (genome and protein) (<xref ref-type="bibr" rid="B29">Kumar et&#x20;al., 2009</xref>) and PolyPhen2 (HumDiv and HumVar) (<xref ref-type="bibr" rid="B1">Adzhubei et&#x20;al., 2013</xref>) did not perform well in either comparison subset. For the strong criteria subset, PolyPhen2 (HumDiv) (<xref ref-type="bibr" rid="B1">Adzhubei et&#x20;al., 2013</xref>), preferred for evaluating rare alleles, had good sensitivity (90%), accuracy (83%), and kappa value (0.372) but specificity lower than 50% (<xref ref-type="sec" rid="s11">Supplementary Table S3</xref>). The CADD (Combined Annotation Dependent Depletion) score integrates multiple annotations into one metric (<xref ref-type="bibr" rid="B27">Kircher et&#x20;al., 2014</xref>) and presents sensitivity higher than 90% and accuracy higher than 80% for both GRCh37/hg19 and GRCh38/hg38. Unfortunately, it had one of the lowest specificities and kappa values among the evaluated programs. REVEL, a more recently developed ensemble method based on random forest (<xref ref-type="bibr" rid="B24">Ioannidis et&#x20;al., 2016</xref>), displays a compelling performance, despite not being one of the best, with higher specificity (88%) than sensitivity (75%).</p>
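<p>To make the relationship between these metrics concrete, the sketch below computes sensitivity, specificity, accuracy, and Cohen&#x2019;s kappa from a confusion matrix, taking pathogenic as the positive class. The counts are illustrative assumptions: they were chosen to be consistent with a strong criteria subset of 108 variants with 17 benign ones and with the rounded values reported for PolyPhen2 (HumDiv), and they were not read from Supplementary Table S3. The function name is likewise hypothetical.</p>
<preformat>
```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, accuracy and Cohen's kappa for a 2x2 confusion matrix."""
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)      # pathogenic variants called pathogenic
    specificity = tn / (tn + fp)      # benign variants called benign
    accuracy = (tp + tn) / total      # observed agreement
    # Agreement expected by chance, from the marginal totals of calls and truth.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Illustrative counts only (assumption): 91 pathogenic and 17 benign variants.
metrics = binary_metrics(tp=82, fn=9, tn=8, fp=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.901
# specificity: 0.471
# accuracy: 0.833
# kappa: 0.372
```
</preformat>
<p>With so few benign variants, accuracy is dominated by the pathogenic class, whereas kappa discounts the agreement expected by chance; this is why a predictor can combine high accuracy with a modest kappa value when its specificity is low.</p>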
<p>Several predictors use the ClinVar (<xref ref-type="bibr" rid="B30">Landrum et&#x20;al., 2014</xref>) and HGMD (<xref ref-type="bibr" rid="B50">Stenson et&#x20;al., 2014</xref>) databases as training datasets. Therefore, some hits in our datasets are reanalyses of training variants rather than independent predictions of pathogenicity, but this is not the case for all evaluated variants. Also, this is unlikely to bias our analysis, even though we worked with variants native to these databases (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>), as the training datasets used for these programs incorporate many more variants in numerous&#x20;genes.</p>
<p>A recurrent problem in performance evaluation is the imbalance between benign and pathogenic variants in training and evaluation sets, a discrepancy also found in our datasets. We observed a minimal absolute difference between the two subsets in the proportions of pathogenic and benign variants, with the strong criteria subset having 15.74% of benign variants while the weak criteria subset had 16.25%. This minor difference demonstrates the difficulty of obtaining benign variants for composing sets, even when implementing more comprehensive standards to evaluate these <italic>in silico</italic> predictors. It also reflects the fact that <italic>in silico</italic> software is mostly trained with disease-causing variants, which may cause a bias in the analysis. That was shown by <xref ref-type="bibr" rid="B38">Niroula and Vihinen (2019)</xref>, who compared ten predictors with a large set of non-pathogenic variants only and found specificity over 80% in just three predictors (PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>), VEST (<xref ref-type="bibr" rid="B8">Carter et&#x20;al., 2013</xref>), and FATHMM (<xref ref-type="bibr" rid="B49">Shihab et&#x20;al., 2013</xref>)). In our study, despite both subsets presenting various programs with high specificity, the proportion of pathogenic and benign variants does not allow a proper evaluation of specificity, nor does it allow us to state which programs would exhibit significant differences in performance in a set with more benign variants.</p>
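<p>The effect of this imbalance on the resolution of the specificity estimate can be illustrated with a short calculation. The sketch below recovers the number of benign variants from the reported percentages, taking the subset sizes as 108 (strong criteria) and 160 (weak criteria), and shows how far a single misclassified variant moves specificity and sensitivity. The function name is hypothetical, and treating benign as the negative class is an assumption.</p>
<preformat>
```python
def benign_count(n_variants: int, benign_percent: float) -> int:
    """Recover the number of benign variants from a reported percentage."""
    return round(n_variants * benign_percent / 100)


# Subset sizes and benign percentages as reported in the text.
subsets = {"strong": (108, 15.74), "weak": (160, 16.25)}

for name, (n_variants, benign_percent) in subsets.items():
    n_benign = benign_count(n_variants, benign_percent)
    n_pathogenic = n_variants - n_benign
    # One misclassified variant shifts the estimate by 100 / class size points.
    specificity_step = 100 / n_benign
    sensitivity_step = 100 / n_pathogenic
    print(f"{name}: {n_benign} benign, {n_pathogenic} pathogenic; "
          f"one error moves specificity by {specificity_step:.1f} points "
          f"and sensitivity by {sensitivity_step:.1f} points")
```
</preformat>
<p>The percentages correspond to 17 benign variants out of 108 and 26 out of 160, so each benign variant carries several percentage points of specificity, whereas each pathogenic variant carries roughly one point of sensitivity.</p>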
<p>This study does not replace the ACMG (<xref ref-type="bibr" rid="B43">Richards et&#x20;al., 2015</xref>) or Sherloc (<xref ref-type="bibr" rid="B39">Nykamp et&#x20;al., 2017</xref>) standards and guidelines. However, it increases confidence in one stage of the classification process (computational predictive programs), mainly when used in the absence of additional clinical information, as is the case of variants deposited in public databases. As we did not have access to any clinical information about the 426 variants identified in the public databases, these guidelines could not be applied. Therefore, we used only the classification given by the five best predictors previously selected. A classification of 122 variants (57 pathogenic and 65 benign variants) was obtained with a total consensus of the five programs. The other 304 variants were unclassified by PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) or did not reach an agreement. If PON-P2 (<xref ref-type="bibr" rid="B37">Niroula et&#x20;al., 2015</xref>) were excluded, 311 variants would reach a consensus (pathogenic or benign).</p>
<p>The difference between the number of variants with and without consensus is common and represents a recurrent finding when only information from computational predictive programs is available. This disagreement is probably caused by the metrics used by each predictor and can be a problem when no literature-based validation exists for that particular gene and predictor.</p>
</sec>
<sec sec-type="conclusion" id="s5">
<title>Conclusion</title>
<p>Variants in the <italic>IDUA</italic> gene were evaluated by 33 prediction algorithms and one conservation score for all available training sets. Two subsets were created using strong and weak criteria based on literature information available for each variant. The subsets demonstrated a small difference, with reduced values in the weak criteria subset but the same most accurate predictors. The five best-performing predictors were used for evaluating 426 VUS obtained from public databases. Of these, 122 variants showed a total consensus of programs with high confidence in classification. The classification of the other 304 variants depends on whether researchers accept a reduction of confidence in classification by using a simple consensus.</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s11">Supplementary Material</xref>, and further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>UM and GP conceived the study. PB and GP collected the data. PB carried out the analysis and interpretation of data. PB and UM wrote the manuscript. UM, PB, and GP revised the manuscript. All authors read and approved the submitted version of the manuscript.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>This work was supported by the Brazilian National Council for Technological and Scientific Development (CNPq) and the Research Incentive Fund of the Clinicas Hospital in Porto Alegre (FIPE/HCPA).</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>The authors would like to thank the Research Incentive Fund of the Clinicas Hospital in Porto Alegre (<italic>Fundo de Incentivo &#xe0; Pesquisa do Hospital de Cl&#xed;nicas de Porto Alegre&#x2014;</italic>FIPE/HCPA).</p>
</ack>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmolb.2021.752797/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmolb.2021.752797/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Image1.JPEG" id="SM1" mimetype="application/JPEG" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet2.docx" id="SM2" mimetype="application/docx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet1.xlsx" id="SM3" mimetype="application/xlsx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Adzhubei</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Jordan</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Sunyaev</surname>
<given-names>S. R.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Predicting Functional&#x20;Effect of Human Missense Mutations Using PolyPhen&#x2010;2</article-title>. <source>Curr. Protoc. Hum. Genet.</source> <volume>76</volume>, <fpage>7</fpage>. <pub-id pub-id-type="doi">10.1002/0471142905.hg0720s76</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alirezaie</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Kernohan</surname>
<given-names>K. D.</given-names>
</name>
<name>
<surname>Hartley</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Majewski</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Hocking</surname>
<given-names>T. D.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants</article-title>. <source>Am. J.&#x20;Hum. Genet.</source> <volume>103</volume> (<issue>4</issue>), <fpage>474</fpage>&#x2013;<lpage>483</lpage>. <pub-id pub-id-type="doi">10.1016/j.ajhg.2018.08.005</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<collab>1000 Genomes Project Consortium</collab>
<person-group person-group-type="author">
<name>
<surname>Auton</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Brooks</surname>
<given-names>L. D.</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R. M.</given-names>
</name>
<name>
<surname>Garrison</surname>
<given-names>E. P.</given-names>
</name>
<name>
<surname>Kang</surname>
<given-names>H. M.</given-names>
</name>
<name>
<surname>Korbel</surname>
<given-names>J.&#x20;O.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>A Global Reference for Human Genetic Variation</article-title>. <source>Nature</source> <volume>526</volume> (<issue>7571</issue>), <fpage>68</fpage>&#x2013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1038/nature15393</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bendl</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Musil</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>&#x160;toura&#x10d;</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zendulka</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Damborsk&#xfd;</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Brezovsk&#xfd;</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions</article-title>. <source>Plos Comput. Biol.</source> <volume>12</volume> (<issue>5</issue>), <fpage>e1004962</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1004962</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bertola</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Filocamo</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Casati</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Mort</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rosano</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Tylki-Szymanska</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>IDUA Mutational Profiling of a Cohort of 102 European Patients with Mucopolysaccharidosis Type I: Identification and Characterization of 35 Novel &#x3b1;-L-iduronidase (IDUA) Alleles</article-title>. <source>Hum. Mutat.</source> <volume>32</volume> (<issue>6</issue>), <fpage>E2189</fpage>&#x2013;<lpage>E2210</lpage>. <pub-id pub-id-type="doi">10.1002/humu.21479</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Capriotti</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Calabrese</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Casadio</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Predicting the Insurgence of Human Genetic Diseases Associated to Single point Protein Mutations with Support Vector Machines and Evolutionary Information</article-title>. <source>Bioinformatics</source> <volume>22</volume> (<issue>22</issue>), <fpage>2729</fpage>&#x2013;<lpage>2734</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btl423</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Capriotti</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Calabrese</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Fariselli</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Martelli</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Altman</surname>
<given-names>R. B.</given-names>
</name>
<name>
<surname>Casadio</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>WS-SNPs&#x26;GO: a Web Server for Predicting the Deleterious Effect of Human Protein Variants Using Functional Annotation</article-title>. <source>BMC genomics</source> <volume>14</volume> (<issue>Suppl. 3</issue>), <fpage>S6</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-14-S3-S6</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carter</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Douville</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Stenson</surname>
<given-names>P. D.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>D. N.</given-names>
</name>
<name>
<surname>Karchin</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Identifying Mendelian Disease Genes with the Variant Effect Scoring Tool</article-title>. <source>BMC genomics</source> <volume>14</volume> (<issue>Suppl. 3</issue>), <fpage>S3</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-14-S3-S3</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Choi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sims</surname>
<given-names>G. E.</given-names>
</name>
<name>
<surname>Murphy</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>J.&#x20;R.</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>A. P.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Predicting the Functional Effect of Amino Acid Substitutions and Indels</article-title>. <source>PloS one</source> <volume>7</volume> (<issue>10</issue>), <fpage>e46688</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0046688</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chun</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Fay</surname>
<given-names>J.&#x20;C.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Identification of Deleterious Mutations within Three Human Genomes</article-title>. <source>Genome Res.</source> <volume>19</volume> (<issue>9</issue>), <fpage>1553</fpage>&#x2013;<lpage>1561</lpage>. <pub-id pub-id-type="doi">10.1101/gr.092619.109</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Clarke</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Giugliani</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Guffon</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>S. A.</given-names>
</name>
<name>
<surname>Keenan</surname>
<given-names>H. A.</given-names>
</name>
<name>
<surname>Munoz-Rojas</surname>
<given-names>M. V.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Genotype-Phenotype Relationships in Mucopolysaccharidosis Type I (MPS I): Insights From the International MPS I Registry</article-title>. <source>Clin. Genet.</source> <volume>96</volume> (<issue>4</issue>), <fpage>281</fpage>&#x2013;<lpage>289</lpage>. <pub-id pub-id-type="doi">10.1111/cge.13583</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Clarke</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>H. S.</given-names>
</name>
</person-group> (<year>1993</year>). <article-title>Two Novel Mutations Causing Mucopolysaccharidosis Type I Detected by Single Strand Conformational Analysis of the &#x3b1;-L-iduronidase Gene</article-title>. <source>Hum. Mol. Genet.</source> <volume>2</volume> (<issue>8</issue>), <fpage>1311</fpage>&#x2013;<lpage>1312</lpage>. <pub-id pub-id-type="doi">10.1093/hmg/2.8.1311</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Davydov</surname>
<given-names>E. V.</given-names>
</name>
<name>
<surname>Goode</surname>
<given-names>D. L.</given-names>
</name>
<name>
<surname>Sirota</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>G. M.</given-names>
</name>
<name>
<surname>Sidow</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Batzoglou</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Identifying a High Fraction of the Human Genome to Be under Selective Constraint Using GERP&#x2b;&#x2b;</article-title>. <source>Plos Comput. Biol.</source> <volume>6</volume> (<issue>12</issue>), <fpage>e1001025</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1001025</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dong</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Jian</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Gibbs</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Boerwinkle</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Comparison and Integration of Deleteriousness Prediction Methods for Nonsynonymous SNVs in Whole Exome Sequencing Studies</article-title>. <source>Hum. Mol. Genet.</source> <volume>24</volume> (<issue>8</issue>), <fpage>2125</fpage>&#x2013;<lpage>2137</lpage>. <pub-id pub-id-type="doi">10.1093/hmg/ddu733</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feng</surname>
<given-names>B.-J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>PERCH: A Unified Framework for Disease Gene Prioritization</article-title>. <source>Hum. Mutat.</source> <volume>38</volume> (<issue>3</issue>), <fpage>243</fpage>&#x2013;<lpage>251</lpage>. <pub-id pub-id-type="doi">10.1002/humu.23158</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ferrer-Costa</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Gelp&#xed;</surname>
<given-names>J.&#x20;L.</given-names>
</name>
<name>
<surname>Zamakola</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Parraga</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>de la Cruz</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Orozco</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>PMUT: a Web-Based Tool for the Annotation of Pathological Mutations on Proteins</article-title>. <source>Bioinformatics</source> <volume>21</volume> (<issue>14</issue>), <fpage>3176</fpage>&#x2013;<lpage>3178</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bti486</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fokkema</surname>
<given-names>I. F. A. C.</given-names>
</name>
<name>
<surname>Taschner</surname>
<given-names>P. E. M.</given-names>
</name>
<name>
<surname>Schaafsma</surname>
<given-names>G. C. P.</given-names>
</name>
<name>
<surname>Celli</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Laros</surname>
<given-names>J.&#x20;F. J.</given-names>
</name>
<name>
<surname>den Dunnen</surname>
<given-names>J.&#x20;T.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>LOVD v.2.0: the Next Generation in Gene Variant Databases</article-title>. <source>Hum. Mutat.</source> <volume>32</volume> (<issue>5</issue>), <fpage>557</fpage>&#x2013;<lpage>563</lpage>. <pub-id pub-id-type="doi">10.1002/humu.21438</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Lou</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Bedford</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Mu</surname>
<given-names>X. J.</given-names>
</name>
<name>
<surname>Yip</surname>
<given-names>K. Y.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>FunSeq2: a Framework for Prioritizing Noncoding Regulatory Variants in Cancer</article-title>. <source>Genome Biol.</source> <volume>15</volume> (<issue>10</issue>), <fpage>480</fpage>. <pub-id pub-id-type="doi">10.1186/s13059-014-0480-5</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fuller</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Brooks</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Evangelista</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hein</surname>
<given-names>L. K.</given-names>
</name>
<name>
<surname>Hopwood</surname>
<given-names>J.&#x20;J.</given-names>
</name>
<name>
<surname>Meikle</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Prediction of Neuropathology in Mucopolysaccharidosis I Patients</article-title>. <source>Mol. Genet. Metab.</source> <volume>84</volume> (<issue>1</issue>), <fpage>18</fpage>&#x2013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1016/j.ymgme.2004.09.004</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gonz&#xe1;lez-P&#xe9;rez</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>L&#xf3;pez-Bigas</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Improving the Assessment of the Outcome of Nonsynonymous SNVs with a Consensus Deleteriousness Score, Condel</article-title>. <source>Am. J.&#x20;Hum. Genet.</source> <volume>88</volume> (<issue>4</issue>), <fpage>440</fpage>&#x2013;<lpage>449</lpage>. <pub-id pub-id-type="doi">10.1016/j.ajhg.2011.03.004</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gulko</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Hubisz</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Gronau</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Siepel</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>A Method for Calculating Probabilities of Fitness Consequences for point Mutations across the Human Genome</article-title>. <source>Nat. Genet.</source> <volume>47</volume> (<issue>3</issue>), <fpage>276</fpage>&#x2013;<lpage>283</lpage>. <pub-id pub-id-type="doi">10.1038/ng.3196</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harris</surname>
<given-names>C. R.</given-names>
</name>
<name>
<surname>Millman</surname>
<given-names>K. J.</given-names>
</name>
<name>
<surname>van der Walt</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Gommers</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Virtanen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Cournapeau</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Array Programming with NumPy</article-title>. <source>Nature</source> <volume>585</volume> (<issue>7825</issue>), <fpage>357</fpage>&#x2013;<lpage>362</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-020-2649-2</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>Y.-F.</given-names>
</name>
<name>
<surname>Gulko</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Siepel</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Fast, Scalable Prediction of Deleterious Noncoding Variants from Functional and Population Genomic Data</article-title>. <source>Nat. Genet.</source> <volume>49</volume> (<issue>4</issue>), <fpage>618</fpage>&#x2013;<lpage>624</lpage>. <pub-id pub-id-type="doi">10.1038/ng.3810</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hunter</surname>
<given-names>J.&#x20;D.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Matplotlib: A 2D Graphics Environment</article-title>. <source>Comput. Sci. Eng.</source> <volume>9</volume> (<issue>3</issue>), <fpage>90</fpage>&#x2013;<lpage>95</lpage>. <pub-id pub-id-type="doi">10.1109/mcse.2007.55</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ioannidis</surname>
<given-names>N. M.</given-names>
</name>
<name>
<surname>Rothstein</surname>
<given-names>J.&#x20;H.</given-names>
</name>
<name>
<surname>Pejaver</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Middha</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>McDonnell</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Baheti</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants</article-title>. <source>Am. J.&#x20;Hum. Genet.</source> <volume>99</volume> (<issue>4</issue>), <fpage>877</fpage>&#x2013;<lpage>885</lpage>. <pub-id pub-id-type="doi">10.1016/j.ajhg.2016.08.016</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jagadeesh</surname>
<given-names>K. A.</given-names>
</name>
<name>
<surname>Wenger</surname>
<given-names>A. M.</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Guturu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Stenson</surname>
<given-names>P. D.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>D. N.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>M-CAP Eliminates a Majority of Variants of Uncertain Significance in Clinical Exomes at High Sensitivity</article-title>. <source>Nat. Genet.</source> <volume>48</volume> (<issue>12</issue>), <fpage>1581</fpage>&#x2013;<lpage>1586</lpage>. <pub-id pub-id-type="doi">10.1038/ng.3703</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Karczewski</surname>
<given-names>K. J.</given-names>
</name>
<name>
<surname>Francioli</surname>
<given-names>L. C.</given-names>
</name>
<name>
<surname>Tiao</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Cummings</surname>
<given-names>B. B.</given-names>
</name>
<name>
<surname>Alf&#xf6;ldi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Q.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>The Mutational Constraint Spectrum Quantified from Variation in 141,456 Humans</article-title>. <source>Nature</source> <volume>581</volume> (<issue>7809</issue>), <fpage>434</fpage>&#x2013;<lpage>443</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-020-2308-7</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kircher</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Witten</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>O&#x27;Roak</surname>
<given-names>B. J.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>G. M.</given-names>
</name>
<name>
<surname>Shendure</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>A General Framework for Estimating the Relative Pathogenicity of Human Genetic Variants</article-title>. <source>Nat. Genet.</source> <volume>46</volume> (<issue>3</issue>), <fpage>310</fpage>&#x2013;<lpage>315</lpage>. <pub-id pub-id-type="doi">10.1038/ng.2892</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kubaski</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>de Oliveira Poswar</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Michelin-Tirelli</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Matte</surname>
<given-names>U. d. S.</given-names>
</name>
<name>
<surname>Horovitz</surname>
<given-names>D. D.</given-names>
</name>
<name>
<surname>Barth</surname>
<given-names>A. L.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Mucopolysaccharidosis Type I</article-title>. <source>Diagnostics</source> <volume>10</volume> (<issue>3</issue>), <fpage>161</fpage>. <pub-id pub-id-type="doi">10.3390/diagnostics10030161</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Henikoff</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Ng</surname>
<given-names>P. C.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Predicting the Effects of Coding Non-synonymous Variants on Protein Function Using the SIFT Algorithm</article-title>. <source>Nat. Protoc.</source> <volume>4</volume> (<issue>7</issue>), <fpage>1073</fpage>&#x2013;<lpage>1081</lpage>. <pub-id pub-id-type="doi">10.1038/nprot.2009.86</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Landrum</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>J.&#x20;M.</given-names>
</name>
<name>
<surname>Riley</surname>
<given-names>G. R.</given-names>
</name>
<name>
<surname>Jang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Rubinstein</surname>
<given-names>W. S.</given-names>
</name>
<name>
<surname>Church</surname>
<given-names>D. M.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>ClinVar: Public Archive of Relationships Among Sequence Variation and Human Phenotype</article-title>. <source>Nucleic Acids Res.</source> <volume>42</volume> (<issue>Database issue</issue>), <fpage>D980</fpage>&#x2013;<lpage>D985</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkt1113</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lehman</surname>
<given-names>T. J.&#x20;A.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Norquist</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Underhill</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Keutzer</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Diagnosis of the Mucopolysaccharidoses</article-title>. <source>Rheumatology</source> <volume>50</volume> (<issue>Suppl. 5</issue>), <fpage>v41</fpage>&#x2013;<lpage>v48</lpage>. <pub-id pub-id-type="doi">10.1093/rheumatology/ker390</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lek</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Karczewski</surname>
<given-names>K. J.</given-names>
</name>
<name>
<surname>Minikel</surname>
<given-names>E. V.</given-names>
</name>
<name>
<surname>Samocha</surname>
<given-names>K. E.</given-names>
</name>
<name>
<surname>Banks</surname>
<given-names>E.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>Analysis of Protein-Coding Genetic Variation in 60,706 Humans</article-title>. <source>Nature</source> <volume>536</volume> (<issue>7616</issue>), <fpage>285</fpage>&#x2013;<lpage>291</lpage>. <pub-id pub-id-type="doi">10.1038/nature19057</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Krishnan</surname>
<given-names>V. G.</given-names>
</name>
<name>
<surname>Mort</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Xin</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Kamati</surname>
<given-names>K. K.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>D. N.</given-names>
</name>
<etal/>
</person-group> (<year>2009</year>). <article-title>Automated Inference of Molecular Mechanisms of Disease from Amino Acid Substitutions</article-title>. <source>Bioinformatics</source> <volume>25</volume> (<issue>21</issue>), <fpage>2744</fpage>&#x2013;<lpage>2750</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp528</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Mou</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tu</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>dbNSFP V4: a Comprehensive Database of Transcript-specific Functional Predictions and Annotations for Human Nonsynonymous and Splice-Site SNVs</article-title>. <source>Genome Med.</source> <volume>12</volume> (<issue>1</issue>), <fpage>103</fpage>. <pub-id pub-id-type="doi">10.1186/s13073-020-00803-9</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Malhis</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Jacobson</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>S. J.&#x20;M.</given-names>
</name>
<name>
<surname>Gsponer</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>LIST-S2: Taxonomy Based Sorting of Deleterious Missense Mutations across Species</article-title>. <source>Nucleic Acids Res.</source> <volume>48</volume> (<issue>W1</issue>), <fpage>W154</fpage>&#x2013;<lpage>W161</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkaa288</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Naslavsky</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Yamamoto</surname>
<given-names>G. L.</given-names>
</name>
<name>
<surname>Almeida</surname>
<given-names>T. F.</given-names>
</name>
<name>
<surname>Ezquina</surname>
<given-names>S. A. M.</given-names>
</name>
<name>
<surname>Sunaga</surname>
<given-names>D. Y.</given-names>
</name>
<name>
<surname>Pho</surname>
<given-names>N.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>Exomic Variants of an Elderly Cohort of Brazilians in the ABraOM Database</article-title>. <source>Hum. Mutat.</source> <volume>38</volume> (<issue>7</issue>), <fpage>751</fpage>&#x2013;<lpage>763</lpage>. <pub-id pub-id-type="doi">10.1002/humu.23220</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Niroula</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Urolagin</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Vihinen</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>PON-P2: Prediction Method for Fast and Reliable Identification of Harmful Variants</article-title>. <source>PLoS One</source> <volume>10</volume> (<issue>2</issue>), <fpage>e0117380</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0117380</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Niroula</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Vihinen</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>How Good Are Pathogenicity Predictors in Detecting Benign Variants?</article-title> <source>PLoS Comput. Biol.</source> <volume>15</volume> (<issue>2</issue>), <fpage>e1006481</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1006481</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nykamp</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Powers</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Garcia</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Herrera</surname>
<given-names>B.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>Sherloc: a Comprehensive Refinement of the ACMG-AMP Variant Classification Criteria</article-title>. <source>Genet. Med.</source> <volume>19</volume> (<issue>10</issue>), <fpage>1105</fpage>&#x2013;<lpage>1117</lpage>. <pub-id pub-id-type="doi">10.1038/gim.2017.37</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedregosa</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Varoquaux</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Gramfort</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Michel</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Thirion</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Grisel</surname>
<given-names>O.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Scikit-learn: Machine Learning in Python</article-title>. <source>J. Mach. Learn. Res.</source> <volume>12</volume>, <fpage>2825</fpage>&#x2013;<lpage>2830</lpage>. </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Quang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>DANN: a Deep Learning Approach for Annotating the Pathogenicity of Genetic Variants</article-title>. <source>Bioinformatics</source> <volume>31</volume> (<issue>5</issue>), <fpage>761</fpage>&#x2013;<lpage>763</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btu703</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reva</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Antipin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sander</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Determinants of Protein Function Revealed by Combinatorial Entropy Optimization</article-title>. <source>Genome Biol.</source> <volume>8</volume> (<issue>11</issue>), <fpage>R232</fpage>. <pub-id pub-id-type="doi">10.1186/gb-2007-8-11-r232</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Richards</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Aziz</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Bale</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Bick</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Das</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Standards and Guidelines for the Interpretation of Sequence Variants: a Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology</article-title>. <source>Genet. Med.</source> <volume>17</volume> (<issue>5</issue>), <fpage>405</fpage>&#x2013;<lpage>423</lpage>. <pub-id pub-id-type="doi">10.1038/gim.2015.30</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ritchie</surname>
<given-names>G. R. S.</given-names>
</name>
<name>
<surname>Dunham</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Zeggini</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Flicek</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Functional Annotation of Noncoding Sequence Variants</article-title>. <source>Nat. Methods</source> <volume>11</volume> (<issue>3</issue>), <fpage>294</fpage>&#x2013;<lpage>296</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.2832</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schwarz</surname>
<given-names>J.&#x20;M.</given-names>
</name>
<name>
<surname>R&#xf6;delsperger</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Schuelke</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Seelow</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>MutationTaster Evaluates Disease-Causing Potential of Sequence Alterations</article-title>. <source>Nat. Methods</source> <volume>7</volume> (<issue>8</issue>), <fpage>575</fpage>&#x2013;<lpage>576</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth0810-575</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scott</surname>
<given-names>H. S.</given-names>
</name>
<name>
<surname>Anson</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Orsborn</surname>
<given-names>A. M.</given-names>
</name>
<name>
<surname>Nelson</surname>
<given-names>P. V.</given-names>
</name>
<name>
<surname>Clements</surname>
<given-names>P. R.</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>C. P.</given-names>
</name>
<etal/>
</person-group> (<year>1991</year>). <article-title>Human Alpha-L-Iduronidase: cDNA Isolation and Expression</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>88</volume> (<issue>21</issue>), <fpage>9695</fpage>&#x2013;<lpage>9699</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.88.21.9695</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scott</surname>
<given-names>H. S.</given-names>
</name>
<name>
<surname>Litjens</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Hopwood</surname>
<given-names>J.&#x20;J.</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>C. P.</given-names>
</name>
</person-group> (<year>1992</year>). <article-title>PCR Detection of Two RFLPs in Exon I of the &#x3b1;-L-Iduronidase (IDUA) Gene</article-title>. <source>Hum. Genet.</source> <volume>90</volume> (<issue>3</issue>), <fpage>327</fpage>. <pub-id pub-id-type="doi">10.1007/BF00220095</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sherry</surname>
<given-names>S. T.</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>M. H.</given-names>
</name>
<name>
<surname>Kholodov</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Phan</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Smigielski</surname>
<given-names>E. M.</given-names>
</name>
<etal/>
</person-group> (<year>2001</year>). <article-title>dbSNP: the NCBI Database of Genetic Variation</article-title>. <source>Nucleic Acids Res.</source> <volume>29</volume> (<issue>1</issue>), <fpage>308</fpage>&#x2013;<lpage>311</lpage>. <pub-id pub-id-type="doi">10.1093/nar/29.1.308</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shihab</surname>
<given-names>H. A.</given-names>
</name>
<name>
<surname>Gough</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>D. N.</given-names>
</name>
<name>
<surname>Stenson</surname>
<given-names>P. D.</given-names>
</name>
<name>
<surname>Barker</surname>
<given-names>G. L. A.</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>K. J.</given-names>
</name>
<etal/>
</person-group> (<year>2013</year>). <article-title>Predicting the Functional, Molecular, and Phenotypic Consequences of Amino Acid Substitutions Using Hidden Markov Models</article-title>. <source>Hum. Mutat.</source> <volume>34</volume> (<issue>1</issue>), <fpage>57</fpage>&#x2013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.1002/humu.22225</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stenson</surname>
<given-names>P. D.</given-names>
</name>
<name>
<surname>Mort</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ball</surname>
<given-names>E. V.</given-names>
</name>
<name>
<surname>Shaw</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Phillips</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>D. N.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>The Human Gene Mutation Database: Building a Comprehensive Mutation Repository for Clinical and Molecular Genetics, Diagnostic Testing and Personalized Genomic Medicine</article-title>. <source>Hum. Genet.</source> <volume>133</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1007/s00439-013-1358-4</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sundaram</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Padigepati</surname>
<given-names>S. R.</given-names>
</name>
<name>
<surname>McRae</surname>
<given-names>J.&#x20;F.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kosmicki</surname>
<given-names>J.&#x20;A.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Author Correction: Predicting the Clinical Impact of Human Mutation with Deep Neural Networks</article-title>. <source>Nat. Genet.</source> <volume>51</volume> (<issue>2</issue>), <fpage>364</fpage>. <pub-id pub-id-type="doi">10.1038/s41588-018-0329-z</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thomas</surname>
<given-names>P. D.</given-names>
</name>
<name>
<surname>Campbell</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Kejariwal</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Karlak</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Daverman</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2003</year>). <article-title>PANTHER: a Library of Protein Families and Subfamilies Indexed by Function</article-title>. <source>Genome Res.</source> <volume>13</volume> (<issue>9</issue>), <fpage>2129</fpage>&#x2013;<lpage>2141</lpage>. <pub-id pub-id-type="doi">10.1101/gr.772403</pub-id> </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vaser</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Adusumalli</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Leng</surname>
<given-names>S. N.</given-names>
</name>
<name>
<surname>Sikic</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ng</surname>
<given-names>P. C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>SIFT Missense Predictions for Genomes</article-title>. <source>Nat. Protoc.</source> <volume>11</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1038/nprot.2015.123</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Viana</surname>
<given-names>G. M.</given-names>
</name>
<name>
<surname>Lima</surname>
<given-names>N. O. d.</given-names>
</name>
<name>
<surname>Cavaleiro</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Alves</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Souza</surname>
<given-names>I. C. N.</given-names>
</name>
<name>
<surname>Feio</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Mucopolysaccharidoses in Northern Brazil: Targeted Mutation Screening and Urinary Glycosaminoglycan Excretion in Patients Undergoing Enzyme Replacement Therapy</article-title>. <source>Genet. Mol. Biol.</source> <volume>34</volume> (<issue>3</issue>), <fpage>410</fpage>&#x2013;<lpage>415</lpage>. <pub-id pub-id-type="doi">10.1590/S1415-47572011005000025</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yates</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Filippis</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Kelley</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Sternberg</surname>
<given-names>M. J.&#x20;E.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>SuSPect: Enhanced Prediction of Single Amino Acid Variant (SAV) Phenotype Using Network Features</article-title>. <source>J.&#x20;Mol. Biol.</source> <volume>426</volume> (<issue>14</issue>), <fpage>2692</fpage>&#x2013;<lpage>2701</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2014.04.026</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="book">
<collab>Zenodo</collab> (<year>2020</year>). <source>The Pandas Development Team, Pandas-Dev/pandas</source>. <publisher-loc>Geneva</publisher-loc>: <publisher-name>CERN Data Centre</publisher-name>. <pub-id pub-id-type="doi">10.5281/zenodo.3509134</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>