<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="editorial" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">895796</article-id>
<article-id pub-id-type="doi">10.3389/fgene.2022.895796</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Editorial</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Editorial: Data Mining and Statistical Methods for Knowledge Discovery in Diseases Based on Multimodal Omics</article-title>
<alt-title alt-title-type="left-running-head">Wang et al.</alt-title>
<alt-title alt-title-type="right-running-head">Editorial: Disease Knowledge Discovery Using Multiomics</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Wang</surname>
<given-names>Tao</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/800132/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Renter&#xed;a</surname>
<given-names>Miguel E.</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/291138/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Peng</surname>
<given-names>Jiajie</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/563708/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>School of Computer Science</institution>, <institution>Northwestern Polytechnical University</institution>, <addr-line>Xi&#x2019;an</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Key Laboratory of Big Data Storage and Management</institution>, <institution>Ministry of Industry and Information Technology</institution>, <institution>Northwestern Polytechnical University</institution>, <addr-line>Xi&#x2019;an</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Department of Genetics and Computational Biology</institution>, <institution>QIMR Berghofer Medical Research Institute</institution>, <addr-line>Brisbane</addr-line>, <addr-line>QLD</addr-line>, <country>Australia</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited and reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/541321/overview">Simon Charles Heath</ext-link>, Center for Genomic Regulation (CRG), Spain</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Tao Wang, <email>twang@nwpu.edu.cn</email>; Miguel E. Renter&#xed;a, <email>Miguel.Renteria@qimrberghofer.edu.au</email>; Jiajie Peng, <email>jiajiepeng@nwpu.edu.cn</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>26</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>895796</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>03</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>03</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Wang, Renter&#xed;a and Peng.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Wang, Renter&#xed;a and Peng</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<related-article id="RA1" journal-id="Front. Chem." related-article-type="commentary-article" xlink:href="https://www.frontiersin.org/researchtopic/18824" ext-link-type="uri">Editorial on the Research Topic <article-title>Data Mining and Statistical Methods for Knowledge Discovery in Diseases Based on Multimodal Omics</article-title>
</related-article>
<kwd-group>
<kwd>multimodal</kwd>
<kwd>omics</kwd>
<kwd>disease biology</kwd>
<kwd>data mining</kwd>
<kwd>statistical methods</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<p>Over the last decade, advances in high-throughput omics technologies and methods have enabled researchers to measure multiple biological data modalities simultaneously and accurately, or to integrate multi-omics data from different sources and modalities. Datasets encompassing genomics, transcriptomics, proteomics, metabolomics, phenomics, radiomics, cutting-edge 3D spatial omics, and single-cell omics are being generated rapidly. This represents an unprecedented opportunity for knowledge discovery in disease biology, including the identification of biomarkers, functional modules, causal pathways, and regulatory networks implicated in disease, and thus has the potential to bolster current therapeutic pipelines.</p>
<p>In parallel, a wide array of statistical methods has been developed to leverage these data, from genome-wide association studies (GWAS) to transcriptome-wide association studies (TWAS), methylome-wide association studies (MWAS), molecular quantitative trait loci (molQTL) analysis, and summary-based two-sample Mendelian randomization. However, the ability to integrate different features of existing methods is still insufficient, limiting the power for knowledge discovery. Thus, advances in data mining, statistical, and machine learning techniques are urgently needed to perform cross-modal data integration and modeling. Here, we present a Research Topic on &#x201c;Data Mining and Statistical Methods for Knowledge Discovery in Diseases Based on Multimodal Omics&#x201d; to showcase studies that leverage these techniques to enable the discovery of disease-related knowledge and illuminate the molecular mechanisms of complex diseases. After rigorous peer review, a total of 14 outstanding articles were selected for this topic collection. Below, we highlight six of them.</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2021.763259/full">Huang et al.</ext-link> explored the causal effects of insomnia on bipolar disorder, major depression, and schizophrenia in the European population using a two-sample Mendelian randomization approach. They first collected GWAS summary datasets for each trait and conducted a meta-analysis for each trait to increase statistical power. The Mendelian randomization results were further evaluated using extensive complementary and sensitivity analyses. Among these psychiatric disorders, they found that insomnia is causally associated with an increased risk of major depression, with an estimated odds ratio of 1.408 (95% confidence interval (CI): 1.210&#x2013;1.640, <italic>p</italic> &#x3d; 1.03E-05) in the European population. No causal association was observed for the other traits. The study provides new evidence supporting the causal effect of insomnia on major depression and contributes to a better understanding of the relationship between sleep and psychiatric disorders.</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2021.724785/full">Hamidi et al.</ext-link> proposed a machine learning framework to identify miRNA biomarkers and predict ovarian cancer. miRNAs play an important role in cancer progression. In this study, the authors first used LASSO and Elastic Net for miRNA feature selection. They identified 10 miRNAs as potential biomarkers by comparing expression levels between ovarian cancer serum samples and normal samples. Furthermore, they used multiple machine learning classifiers, including logistic regression, random forest, artificial neural network, XGBoost, and decision trees, for ovarian cancer prediction. Experiments demonstrated the accuracy of the proposed models, whose performance was further evaluated on external datasets.</p>
<p>Cerebral ischemic stroke (IS) is a complex disease caused by multiple factors, including vascular risk, genetic, and environmental factors. Identifying the genes associated with IS is critical for understanding the biological mechanisms underlying the disease. <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2021.728333/full">Liu et al.</ext-link> proposed a network representation learning (NRL)-based method to identify the disease-related genes of cerebral IS. The proposed method includes three key components: capturing the topological information of the protein-protein interaction (PPI) network, denoising the gene features, and optimizing a support vector machine (SVM) classifier to identify IS-related genes. The evaluation showed that the proposed method performs better than existing methods on IS-related gene prediction. In addition, a case study shows that the proposed method can identify IS-related genes.</p>
<p>Recently, single-cell RNA sequencing (scRNA-seq) technology has been used to measure RNA levels at single-cell resolution to study biological functions. <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2021.739677/full">Xu et al.</ext-link> proposed AdImpute, an imputation method based on semi-supervised autoencoders. The method applies a cost function with imputation weights to learn the latent information in the data and thereby achieve a more accurate imputation. The evaluation indicates that AdImpute is more accurate than four other publicly available scRNA-seq imputation methods on both simulated and real datasets.</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2021.729326/full">Yang et al.</ext-link> tackled the issue of systematic selection bias in Mendelian randomization (MR). The authors proposed a new approach that uses control exposures, chosen on the basis of subject-matter knowledge, to triangulate estimated causal effects that are vulnerable to selection bias. The proposed approach can be used to assess the credibility of MR estimates in the presence of selection bias arising from the selection of survivors. The authors illustrate their method by validating MR estimates in a real example investigating the potential association of transferrin with stroke (including ischemic and cardioembolic stroke).</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2022.814412/abstract">Park et al.</ext-link> developed an innovative approach for integrative pathway analysis that leverages GWAS summary statistics to construct genetic metabolomic scores (GMSs). These scores are then used as components of pathways in a hierarchical model that considers the structural relationships among SNPs, metabolites, pathways, and phenotypes. The authors applied their method to identify pathways associated with type 2 diabetes in the Korean population.</p>
<p>All the contributions in this special issue have been peer-reviewed by at least two professional domain experts. We believe that the final compilation includes high-quality publications that represent significant scientific progress and will impact the relevant research communities. On this basis, we have launched a <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/research-topics/32114/data-mining-and-statistical-methods-for-knowledge-discovery-in-diseases-based-on-multimodal-omics-vo">second edition</ext-link> of this Research Topic, which is currently open for submissions.</p>
</body>
<back>
<sec id="s1">
<title>Author Contributions</title>
<p>TW, MR, and JP organized this Research Topic and wrote the manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="s2">
<title>Funding</title>
<p>This work was supported by the National Natural Science Foundation of China (Nos. 62102319 and 62072376) and the Fundamental Research Funds for the Central Universities of China (No. G2021KY05112).</p>
</sec>
<sec sec-type="COI-statement" id="s3">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s4">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>We would like to thank all authors for their contributions to our special issue and all reviewers for their time and effort. We would also like to thank the editor-in-chief and editorial department of Frontiers in Genetics for their support throughout the process.</p>
</ack>
</back>
</article>