<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioinform.</journal-id>
<journal-title>Frontiers in Bioinformatics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioinform.</abbrev-journal-title>
<issn pub-type="epub">2673-7647</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">780229</article-id>
<article-id pub-id-type="doi">10.3389/fbinf.2022.780229</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioinformatics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>NetRank Recovers Known Cancer Hallmark Genes as Universal Biomarker Signature for Cancer Outcome Prediction</article-title>
<alt-title alt-title-type="left-running-head">Al-Fatlawi et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Hallmark Biomarker Signature for Cancer Prediction</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Al-Fatlawi</surname>
<given-names>Ali</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/1486500/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Afrin</surname>
<given-names>Nazia</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/1688210/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ozen</surname>
<given-names>Cigdem</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/524634/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Malekian</surname>
<given-names>Negin</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/1701745/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Schroeder</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1516368/overview"/>
</contrib>
</contrib-group>
<aff>
<institution>Biotechnology Center (BIOTEC)</institution>, <institution>Center for Molecular and Cellular Bioengineering</institution>, <institution>Technische Universit&#xe4;t Dresden</institution>, <addr-line>Dresden</addr-line>, <country>Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/983334/overview">Ozlem Keskin</ext-link>, Ko&#xe7; University, Turkey</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/683489/overview">Paolo Martini</ext-link>, University of Brescia, Italy</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/111132/overview">Abel Gonzalez-Perez</ext-link>, Pompeu Fabra University, Spain</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Michael Schroeder, <email>michael.schroeder@tu-dresden.de</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Network Bioinformatics, a section of the journal Frontiers in Bioinformatics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>23</day>
<month>03</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>2</volume>
<elocation-id>780229</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>09</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>02</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Al-Fatlawi, Afrin, Ozen, Malekian and Schroeder.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Al-Fatlawi, Afrin, Ozen, Malekian and Schroeder</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Gene expression can serve as a powerful predictor for disease progression and other phenotypes. Consequently, microarrays, which capture gene expression genome-wide, have been used widely over the past two decades to derive biomarker signatures for tasks such as cancer grading, prognosticating the formation of metastases, survival, and others. Each of these signatures was selected and optimized for a very specific phenotype, tissue type, and experimental set-up. While all of these differences may naturally contribute to very heterogeneous biomarker signatures, all cancers share characteristics regardless of particular cell types or tissue, as summarized in the hallmarks of cancer. These commonalities could give rise to biomarker signatures that perform well across different phenotypes, cell types, and tissue types. Here, we explore this possibility by employing a network-based approach for pan-cancer biomarker discovery. We implement a random surfer model, which integrates interaction, expression, and phenotypic information to rank genes by their suitability for outcome prediction. To evaluate our approach, we assembled 105&#x20;high-quality microarray datasets sampled from around 13,000 patients and covering 13 cancer types. We applied our approach (NetRank) to each dataset and aggregated individual signatures into one compact signature of 50 genes. This signature stands out for two reasons. First, in contrast to other signatures of the 105 datasets, it performs well across nearly all cancer types and phenotypes. Second, it is interpretable, as the majority of genes are linked to the hallmarks of cancer in general and proliferation specifically. Many of the identified genes are cancer drivers with a known mutation burden linked to cancer. Overall, our work demonstrates the power of network-based approaches to compose robust, compact, and universal biomarker signatures for cancer outcome prediction.</p>
</abstract>
<kwd-group>
<kwd>cancer</kwd>
<kwd>network bioinformatics</kwd>
<kwd>hallmarks</kwd>
<kwd>biomarker</kwd>
<kwd>gene expression</kwd>
<kwd>microarray</kwd>
</kwd-group>
<contract-sponsor id="cn001">Technische Universit&#xe4;t Dresden<named-content content-type="fundref-id">10.13039/501100002957</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Cancer is an uncontrolled growth of cells that can occur in nearly any organ of the human body. Biomarkers help to improve cancer diagnosis and the monitoring of disease progression. A number of biomarkers are in clinical use today, such as the carbohydrate antigen 19-9 (CA19-9) for early detection of pancreatic cancer (<xref ref-type="bibr" rid="B22">Koprowski et&#x20;al., 1981</xref>), MYC for monitoring the prognosis of lymphoma and leukemia, and <italic>ALK</italic> for lung cancer (<xref ref-type="bibr" rid="B38">Targeted Cancer Therapies Fact Sheet&#x2014;National Cancer Institute, 2021</xref>). Identifying highly accurate biomarkers is a complex problem. CA19-9, for example, is well established in pancreatic cancer but has an accuracy of only 70&#x2013;80% (<xref ref-type="bibr" rid="B3">Al-Fatlawi et&#x20;al., 2021</xref>), which means that it is not suitable for diagnosis on its own, but only for monitoring relapse after surgery. One way to improve the accuracy and robustness of diagnoses is to employ biomarker signatures instead of single biomarkers. The key enabling technologies for discovering biomarker signatures are high-throughput screening techniques such as microarrays and deep sequencing. After microarrays were introduced in the late 1990s, an early high-impact study identified a biomarker signature of 70 genes to predict metastases after breast cancer surgery (<xref ref-type="bibr" rid="B41">van &#x2019;t Veer et&#x20;al., 2002</xref>). Today, this signature is commercially available and in wide international use as MammaPrint.</p>
<p>Defining such a signature is a complex undertaking, as there are three requirements:<list list-type="simple">
<list-item>
<p>1) Robustness: A signature should be robust to changes in the data, and it must not be over-optimized for a specific dataset. If the signature is applied independently to a dataset of a similar phenotype, it should perform similarly to the original dataset. If not, it could be overfitted and biased towards the original&#x20;data.</p>
</list-item>
<list-item>
<p>2) Compactness: A signature should be compact. If a signature consists of thousands of genes, it becomes complicated to understand how individual components of the signature contribute to the prediction result.</p>
</list-item>
<list-item>
<p>3) Interpretability: A signature should be meaningful and interpretable. The identified genes should be connected to cancer, so that first steps can be taken to extend the correlation between biomarker and phenotype towards a causal model that explains how the biomarker links to the observed phenotype.</p>
</list-item>
</list>
</p>
<p>In general, discovering a biomarker signature for a specific cancer and prediction task is daunting due to combinatorial explosion. If a genome screen obtains data for 20,000 genes and a signature consists of 50 genes, then there are around 3.5 &#x2a; 10<sup>150</sup> possible signatures [C (20,000, 50) &#x3d; 3.5 &#x2a; 10<sup>150</sup>]. The vast majority of these signatures will not be suitable for any outcome prediction task. However, even if only a small percentage of signatures are suitable, this is still a large number. Consequently, many good signatures may exist. The breast cancer signature introduced by van &#x2019;t Veer et&#x20;al. was complemented by a completely different signature for the same task with similar performance (<xref ref-type="bibr" rid="B29">Paik et&#x20;al., 2004</xref>; <xref ref-type="bibr" rid="B13">Ein-Dor et&#x20;al., 2006</xref>). This raises the question of how arbitrary the choice of a good signature could&#x20;be.</p>
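<p>The size of this search space can be checked directly. The following minimal sketch uses only the Python standard library; the function name is illustrative and not part of the study's code.</p>
<preformat><![CDATA[
```python
import math


def count_signatures(n_genes: int, signature_size: int) -> int:
    """Number of distinct unordered gene sets of the given size."""
    return math.comb(n_genes, signature_size)


n_signatures = count_signatures(20_000, 50)
# The exact value is a very long integer, so print it in scientific notation.
print(f"C(20000, 50) is roughly {n_signatures:.1e}")
```
]]></preformat>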
<p>Should not one expect that biomarker signatures for similar cancer types and outcome prediction tasks share some similarities? This should be especially true as different cancers share basic mechanisms such as survival, tumor growth, invasion, and others (<xref ref-type="bibr" rid="B17">Hanahan and Weinberg, 2011</xref>). These principles were summarized by Hanahan and Weinberg as hallmarks of cancer (<xref ref-type="bibr" rid="B18">Hanahan and Weinberg, 2000</xref>; <xref ref-type="bibr" rid="B17">Hanahan and Weinberg, 2011</xref>). They represent the biological properties acquired during the multistage development of cancer, including sustaining proliferative signaling, evading growth suppressors, resisting cell death, and seven other principles. Linking biomarkers to the hallmarks of cancer is one possibility for an interpretable signature. This paper defines a universal biomarker signature for cancer outcome prediction, which is robust, compact, and interpretable by pursuing a network-based approach.</p>
<p>There is a long-standing tradition of using interaction networks in biomarker discovery. Shi et&#x20;al. developed a network-based signature for colorectal cancer recurrence by integrating several signatures and interaction networks (<xref ref-type="bibr" rid="B34">Shi et&#x20;al., 2012</xref>). They highlighted the dysregulated biological processes in colorectal cancer recurrence. Dutkowski and coworkers combined gene expression profiles and physical protein interaction maps of embryonic tissue, metastatic breast cancer, and brain tumors to provide global network modules pointing out representative development and cancer programs (<xref ref-type="bibr" rid="B11">Dutkowski and Ideker, 2011</xref>). Winter et&#x20;al. developed a network-based outcome prediction approach, NetRank, and successfully predicted patient survival using gene expression data (<xref ref-type="bibr" rid="B43">Winter et&#x20;al., 2012</xref>). NetRank ranks genes according to their network connectivity and statistical relevance using a modified formula for Google&#x2019;s PageRank algorithm. It was also applied to several cancer microarray datasets using transcription factor and protein-protein interaction networks (<xref ref-type="bibr" rid="B33">Roy et&#x20;al., 2014</xref>). That study showed that integrating network information and gene expression data provides more accurate outcome predictions than classical methods and is on a par with signatures published by the authors of the studies. Barter et&#x20;al. used gene expression microarray data from melanoma and ovarian cancer to predict patient clinical outcomes. They compared three feature selection approaches: the most commonly used single-gene approach (based on differential gene expression), gene-set approaches (based on biological pathway or function), and network-based approaches (based on protein-protein interactions). The study also evaluated two network-based feature-selection algorithms, NetRank and GeneRank, and reported that NetRank was the most accurate in identifying stable gene expression signatures (<xref ref-type="bibr" rid="B4">Barter et&#x20;al., 2014</xref>).</p>
<p>We set out in this paper to collect 105 datasets covering 13 cancer types with different phenotypes. We proceed as depicted in the graphical abstract of <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>. In the first step, we show that biomarker signatures proposed by authors of the datasets do not overlap, and hence they follow the pattern that was already observed two decades ago when the two main breast cancer signatures turned out to be entirely different. Next, we develop our network-based approach by applying NetRank to the gene expression and phenotype data using the STRING database network (<xref ref-type="bibr" rid="B37">Szklarczyk et&#x20;al., 2019</xref>), which covers over four million interactions between more than 20,000 proteins. We evaluate the performance of these signatures and compare their composition against each other and against the signatures originally proposed by the authors of the datasets. In the last step, we combine the NetRank biomarker signatures of each dataset into a global NetRank signature using majority voting. We evaluate this signature on an evaluation set, both in terms of area under the curve for the cancer outcome prediction tasks and in terms of its relation to the hallmarks of cancer. Overall, we show that the NetRank signature is robust, compact, and interpretable.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Overview of the biomarker discovery pipeline. NetRank identifies biomarker signatures by combining protein interactions from the STRING database with gene expression data. NetRank is applied to each dataset individually. Every dataset was split into a feature selection set (70%) and an evaluation set (30%). NetRank was applied to the first set. Principal component analysis was performed on the latter set using the selected features to evaluate how well the signatures predict the phenotype in an independent set.</p>
</caption>
<graphic xlink:href="fbinf-02-780229-g001.tif"/>
</fig>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec id="s2-1">
<title>Datasets</title>
<p>Microarray datasets were obtained as follows. PubMed was queried in January 2021 for the keywords cancer and gene expression. Dates were limited to 2000 to 2020. To obtain only high-quality datasets, we restricted the results to articles in journals with an impact factor greater than 15 and obtained ca. 3,700 papers. These were scanned manually for relevance to differential gene expression, leaving 1,288 articles. For these, we found 225 datasets in the Gene Expression Omnibus database (<xref ref-type="bibr" rid="B12">Edgar et&#x20;al., 2002</xref>). We filtered out 120 datasets because of missing phenotype data (48), missing probe signals or few probes (35), missing gene symbols (11), or a small size of fewer than six samples or low quality indicated by many missing or NaN values (26). As a result, we kept 105 datasets. As demonstrated in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>, the selected datasets comprise around 13,000 individuals for 13 cancer types with different phenotypes (see <xref ref-type="sec" rid="s10">Supplementary Table S1</xref> and <xref ref-type="sec" rid="s10">Supplementary Sheets S1, S2</xref>). Each dataset was normalized, evaluated, and studied individually, and then the outcomes were compared. The individuals in 11 of the 105 datasets were mice, so we humanized their gene symbols using the R package biomaRt (<xref ref-type="bibr" rid="B10">Durinck et&#x20;al., 2005</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Microarray datasets. The 105 datasets are comprehensive, with around 13,000 samples from 13 cancer types, which each comprise a substantial number of probe&#x20;sets.</p>
</caption>
<graphic xlink:href="fbinf-02-780229-g002.tif"/>
</fig>
</sec>
<sec id="s2-2">
<title>Microarray Data Processing</title>
<p>The robust multi-array averaging (RMA) method was used for background correction and normalization of the unnormalized datasets using the affy package in R 3.6 (<xref ref-type="bibr" rid="B15">Gautier et&#x20;al., 2004</xref>). Affymetrix probes were mapped to gene symbols using the functional annotation provided with each dataset. We excluded genes and samples with NaN values using the function &#x201c;goodSamplesGenes&#x201d; provided by the WGCNA R package 1.6.9 (<xref ref-type="bibr" rid="B24">Langfelder and Horvath, 2008</xref>), which kept only records with a minimum fraction of 50% non-missing values. We performed hierarchical cluster analysis and principal components analysis to evaluate the distance between individuals and to remove detected outliers using the R package stats 3.6.2 (<xref ref-type="bibr" rid="B32">R Core Team., 2020</xref>). Finally, the Pearson correlation coefficient and Fisher&#x2019;s asymptotic <italic>p</italic>-value were determined using a robust correlation measure implementation (<xref ref-type="bibr" rid="B24">Langfelder and Horvath, 2008</xref>) in the R package WGCNA&#x20;1.6.9.</p>
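<p>For illustration, the correlation step can be sketched in plain Python. This is not the WGCNA code used in the study; it assumes that Fisher&#x2019;s asymptotic p-value denotes the usual two-sided normal approximation based on the Fisher z-transform, and the function names are illustrative.</p>
<preformat><![CDATA[
```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_asymptotic_p(r, n_samples):
    """Two-sided p-value from the Fisher z-transform (needs n_samples > 3, |r| < 1)."""
    z = abs(math.atanh(r)) * math.sqrt(n_samples - 3)
    # Two-sided tail of the standard normal: 2 * Phi(-z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


# Hypothetical expression values of one gene and a numeric phenotype.
expression = [1.0, 2.0, 3.0]
phenotype = [1.0, 2.0, 4.0]
r = pearson_r(expression, phenotype)
```
]]></preformat>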
</sec>
<sec id="s2-3">
<title>Protein-Protein Interaction Network</title>
<p>To calculate the connectivity of each protein, we used the STRING protein-protein interaction (PPI) network (<xref ref-type="bibr" rid="B37">Szklarczyk et&#x20;al., 2019</xref>). The analysis was carried out using the R package STRINGdb_1.26-0 with database version 10. The STRING database contains more than four million interactions for more than 20,000 human proteins and more than five million interactions for more than 22,000 mouse proteins. We did not apply any filtering to the connections. The nodes&#x2019; connectivity scores were normalized by dividing them by the maximum possible connectivity score in the network.</p>
</sec>
<sec id="s2-4">
<title>NetRank</title>
<p>Our method is derived from the PageRank algorithm, which Google uses to rank web pages in its search engine. PageRank assumes a random surfer who navigates through a network of web pages by following links with probability <italic>d</italic> or starting a new session on a random page with probability <italic>1&#x2212;d</italic>. The random surfer visits a web page and randomly clicks on a link to visit the next page. Consequently, pages that are central and well connected are visited more frequently by the random surfer than pages on the periphery of the network.</p>
<p>While PageRank takes only the node connectivity into account to designate ranking (<xref ref-type="disp-formula" rid="e1">Eq. 1</xref>), NetRank takes into account both connectivity and statistical association of the genes with a specific phenotype (<xref ref-type="disp-formula" rid="e2">Eq. 2</xref>).<disp-formula id="e1">
<mml:math id="m1">
<mml:mrow>
<mml:msubsup>
<mml:mi>r</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>d</mml:mi>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>degree</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
<disp-formula id="e2">
<mml:math id="m2">
<mml:mrow>
<mml:msubsup>
<mml:mi>r</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>d</mml:mi>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xa0;</mml:mo>
<mml:msubsup>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>degree</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>N</mml:mi>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
</p>
<p>r: the ranking score of a node (gene)</p>
<p>n: iteration number</p>
<p>j: index of the current node; i: index of a node contributing to the score of j</p>
<p>d: damping factor (ranging between 0 and 1)</p>
<p>s: Pearson correlation coefficient of the gene with the phenotype</p>
<p>degree: the sum of the outgoing connectivities of a node</p>
<p>N: total number of nodes</p>
<p>m: connectivity of connected nodes, <inline-formula id="inf1">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> if <italic>i</italic> and <italic>j</italic> are connected and 0 otherwise.</p>
<p>In <xref ref-type="disp-formula" rid="e2">Eq. 2</xref>, the damping factor d balances the impact of the network links (connectivity of the protein) with its statistical significance. In our implementation, we kept the value of d fixed (0.5) in all datasets to avoid any bias of having customized parameters in each analysis.</p>
<p>Here, the random surfer model is applied to an interaction network [such as the protein-protein interaction (PPI) network of STRINGdb (<xref ref-type="bibr" rid="B37">Szklarczyk et&#x20;al., 2019</xref>)]. It combines the connectivity score with another score representing the correlation to the phenotype. Instead of counting page visits as in PageRank, NetRank initializes scores as the gene&#x2019;s correlation to the phenotype. When the surfer visits a node, its correlation is distributed in equal parts to its neighbors. Then, its score is updated with one contribution from the correlation and the other from the neighbors&#x2019; scoring.</p>
<p>NetRank can be seen as a method that averages scores across a network. Instead of considering a value in isolation, it combines the value with the scores of its neighbors. In other words, two pieces of information are required to rank the coding genes: a network of the interactions between the genes and the statistical significance of their association with the phenotype. To evaluate the significance of association between a phenotype and gene expression, we determined their correlation. Fisher&#x2019;s asymptotic <italic>p</italic>-value was determined using an approximation to the true distribution. The advantage of Fisher&#x2019;s asymptotic <italic>p</italic>-value is that it is valid in small and large sample sizes (<xref ref-type="bibr" rid="B2">Agresti, 1992</xref>). A <italic>p</italic>-value of 0.01 or below was considered significant in the analysis. <xref ref-type="sec" rid="s10">Supplementary Figure S1</xref> shows the pseudo-code for this procedure.</p>
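<p>The update rule of Eq. 2 can be sketched as a simple fixed-point iteration. The following plain-Python sketch is not the authors&#x2019; implementation: the stopping tolerance, the iteration cap, the handling of isolated nodes, and the three-gene toy network are assumptions made for illustration only.</p>
<preformat><![CDATA[
```python
def netrank(adjacency, s, d=0.5, tol=1e-9, max_iter=1000):
    """Iterate Eq. 2 on a symmetric 0/1 adjacency matrix until scores stabilise."""
    n = len(s)
    degree = [sum(row) for row in adjacency]
    r = list(s)  # scores start at the correlation with the phenotype
    for _ in range(max_iter):
        new_r = []
        for j in range(n):
            # Each neighbour i passes on an equal share r_i / degree_i.
            incoming = sum(
                adjacency[i][j] * r[i] / degree[i]
                for i in range(n)
                if degree[i] > 0  # assumption: isolated nodes pass nothing on
            )
            new_r.append((1 - d) * s[j] + d * incoming)
        converged = max(abs(a - b) for a, b in zip(new_r, r)) < tol
        r = new_r
        if converged:
            break
    return r


# Hypothetical toy network: gene B is a hub linking genes A and C.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
correlation = [0.9, 0.1, 0.2]  # hypothetical correlations with the phenotype
scores = netrank(adjacency, correlation, d=0.5)
```
]]></preformat>
<p>In this toy example, the weakly correlated hub gene B ends up with a higher score than gene C because it receives contributions from both of its neighbors.</p>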
</sec>
<sec id="s2-5">
<title>Data Splitting and Feature Selection</title>
<p>In our analysis, the NetRank algorithm was applied to only 70% of each dataset (feature selection set), and we kept the remaining 30% unseen for evaluating the approach (<xref ref-type="fig" rid="F1">Figure&#x20;1</xref>). To keep the signatures compact and avoid bias toward specific datasets, we specified a maximum of 50 genes for all datasets and selected the highest-ranking genes that met the <italic>p</italic>-value requirement below&#x20;0.01.</p>
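<p>A minimal sketch of this step is given below. It assumes a simple random split and that the significance filter is applied before ranking; neither detail is specified above, and the function names and example values are hypothetical.</p>
<preformat><![CDATA[
```python
import random


def split_samples(sample_ids, selection_fraction=0.7, seed=0):
    """Shuffle sample IDs and split them into feature-selection and evaluation sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(selection_fraction * len(ids))
    return ids[:cut], ids[cut:]


def select_signature(rank_scores, p_values, max_genes=50, alpha=0.01):
    """Keep the top-ranked genes whose p-value passes the threshold."""
    significant = [g for g in rank_scores if p_values[g] <= alpha]
    significant.sort(key=lambda g: rank_scores[g], reverse=True)
    return significant[:max_genes]


# Hypothetical NetRank scores and p-values for three genes.
rank_scores = {"GENE_A": 0.9, "GENE_B": 0.8, "GENE_C": 0.7}
p_values = {"GENE_A": 0.001, "GENE_B": 0.2, "GENE_C": 0.01}
signature = select_signature(rank_scores, p_values, max_genes=2)
selection_set, evaluation_set = split_samples(range(10))
```
]]></preformat>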
</sec>
<sec id="s2-6">
<title>Outcome Prediction Using PCA</title>
<p>Principal component analysis was performed using Python 3.7.6 with core functions provided by scikit-learn (sklearn) 0.20.3 (<xref ref-type="bibr" rid="B31">Pedregosa et&#x20;al., 2011</xref>). It was applied only to the datasets with an adequate number of samples in each class in the test set (i.e.,&#x20;at least four samples per class). In our analysis, 60 of the 105 datasets had enough samples in the test set for clustering (i.e.,&#x20;six samples). The area under the ROC curve (AUC) was calculated using scikit-learn for the best component in the PCA analysis.</p>
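<p>The evaluation can be sketched as follows. To stay self-contained, this sketch replaces the scikit-learn calls used in the study with NumPy: PCA is computed via singular value decomposition and the AUC via pairwise comparison of scores. It also assumes that the best component is the one with the highest AUC after allowing for the arbitrary sign of each principal axis, which is one possible reading of the procedure; all names and data are illustrative.</p>
<preformat><![CDATA[
```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the leading principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def auc(scores, labels):
    """AUC as the probability that a positive sample outranks a negative one."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))


def best_component_auc(X, labels, n_components=2):
    """AUC of the most discriminative component, ignoring the sign of each axis."""
    projected = pca_scores(X, n_components)
    aucs = [auc(projected[:, k], labels) for k in range(projected.shape[1])]
    return max(max(a, 1 - a) for a in aucs)


# Hypothetical evaluation set: six samples, two signature genes, binary phenotype.
X = np.array([[0.0, 0], [0.1, 1], [0.2, 0], [5.0, 1], [5.1, 0], [5.2, 1]])
labels = np.array([0, 0, 0, 1, 1, 1])
result = best_component_auc(X, labels)
```
]]></preformat>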
</sec>
<sec id="s2-7">
<title>Cancer Hallmarks Genes</title>
<p>The selected genes in our signature were searched manually in the Cancer Hallmarks Genes dataset (<xref ref-type="bibr" rid="B46">Zhang et&#x20;al., 2020</xref>). This dataset contains a collection of 2,940 genes that are categorized into ten hallmarks. We searched for our genes in each hallmark and provided the degree and betweenness centrality of each gene in the different hallmark networks.</p>
</sec>
<sec id="s2-8">
<title>Biological Interpretation</title>
<p>We searched for existing drugs targeting the proteins in our list in ChEMBL (<xref ref-type="bibr" rid="B14">Gaulton et&#x20;al., 2017</xref>) using the Open Targets project (<xref ref-type="bibr" rid="B23">Koscielny et&#x20;al., 2017</xref>) and provided the results. Moreover, we checked these genes in the Cancer Genome Atlas Data Portal (<xref ref-type="bibr" rid="B16">GDC, 2021</xref>).</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p>This study explores the possibility of a universal cancer signature arising from processes common to most cancers. Inspired by the hallmarks of cancer, it investigates whether mechanisms such as tumor growth or cancer survival and progression, which are present in all cancers, can give rise to biomarker signatures, which perform well in cancer outcome prediction&#x20;tasks.</p>
<p>To this end, we devised a universal signature that shows good performance across many types of cancer in the cancer outcome prediction tasks. The defined signature is compact and interpretable in that its genes have confirmed links to cancer, and to the hallmarks of cancer in particular. The basis for this goal is a large collection of gene expression data for cancer outcome prediction&#x20;tasks.</p>
<p>
<bold>105 datasets cover 13,000 samples, 13 cancer types, and over ten phenotypes.</bold> When collecting datasets for our study, we had two goals: The collection had to be comprehensive, covering many types of cancer and various outcome prediction tasks, and the data had to be of high quality. We addressed both aims by screening scientific literature and focusing on high-impact publications. After rigorous filtering as described in the methods section, we obtained 105 microarray datasets. These datasets varied substantially in size, from specialized, small-scale studies with as few as six samples (such as GSE73396 in liver cancer and GSE43444, GSE17538 in colon cancer) to a large-scale multi-center study evaluating the use of microarrays in leukemia diagnosis with 2,096 samples (GSE13204). The average size per dataset is 73 samples. In total, there were 12,900 samples.</p>
<p>Overall, the samples were very diverse in terms of cancer types and phenotypes. The largest number of studies dealt with breast cancer (25), followed by liver (17) and prostate, leukemia, lung, and lymphoma with around ten each. Overall, 13 different types of cancer are present (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). The overwhelming majority of datasets consisted of human samples; however, 206 of the 12,900 samples were from mice. The phenotypes investigated in the 105 studies also covered a broad range, including grading (18), distinctive cancer-specific phenotypes such as epithelial cell adhesion molecules in the liver and lymph node status in breast cancer (16), cancer vs. non-cancer (12), metastasis status or localization (12), subtypes (9), survival status or time (7), mutation of genes or receptor (6), treatment effect (4), tumor localization (4), remission or relapse (3), progression (1), and 13 others (<xref ref-type="sec" rid="s10">Supplementary Sheet S2</xref>). This comprehensive mixture ensures that easier tasks, such as distinguishing healthy from cancerous tissue, as well as more complex tasks, such as survival prediction, are present.</p>
<p>Nearly all studies used standard microarrays, and each of the 13 cancer types has at least two datasets with over 40,000 probes. Only three of the 105 datasets have fewer than 10,000 probes. The dataset with the smallest number of probes (1,756) is also the largest dataset, with 2,096 samples.</p>
<p>The datasets span a period of 13&#xa0;years from 2005 to 2018, with peaks between 2009 and 2012, which is in line with the introduction of microarrays in the late 1990s to early 2000s and the recent advent of low-cost deep sequencing as a new technology superseding microarrays.</p>
<p>We collected gene expression signatures that were proposed by the authors of the datasets. Taken together, the 105 author signatures comprise 4,343 genes. The signatures vary immensely in size, with the smallest consisting of only one gene and the largest of 3,232 genes. The average number of genes put forward by the authors of the datasets is therefore 41 (4,343/105). This is of a similar order of magnitude to the highly successful 70-gene signature underlying the MammaPrint breast cancer test (<xref ref-type="bibr" rid="B41">van &#x2019;t Veer et&#x20;al., 2002</xref>). Therefore, we fixed the size of the signatures proposed by our method at&#x20;50.</p>
<p>
<bold>Author signatures are dissimilar.</bold> The starting point of our analysis is the question of how similar or dissimilar biomarker signatures are across datasets and tasks. Given that we have 25 breast cancer datasets, one could expect the signatures for these datasets to overlap, with the degree of overlap reflecting how similar the investigated phenotypes are. As shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> and <xref ref-type="sec" rid="s10">Supplementary Figure S2</xref>, the author signatures hardly overlap. The only significant overlap exists between datasets from the same study (GSE25066, GSE21653, GSE11121, GSE20685, GSE21653, GSE3494) (<xref ref-type="bibr" rid="B21">Ko et&#x20;al., 2013</xref>), which was a meta-analysis on the role of ion channels as predictors. We expanded these comparisons to all datasets (<xref ref-type="sec" rid="s10">Supplementary Figure S3</xref>) and found the same: Author signatures hardly overlap, and this holds in particular within each of the 13 cancer&#x20;types.</p>
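<p>The pairwise comparison described above can be illustrated with a minimal sketch. It assumes that overlap is measured as the number of genes shared by two signatures; the overlap measure actually used for the figures is not stated in this section, and the function name, dataset names, and gene sets below are hypothetical.</p>
<preformat><![CDATA[
```python
from itertools import combinations


def pairwise_overlap(signatures):
    """Return {(a, b): number of shared genes} for every pair of datasets."""
    overlaps = {}
    for a, b in combinations(sorted(signatures), 2):
        # Set intersection gives the genes present in both signatures.
        overlaps[(a, b)] = len(signatures[a] & signatures[b])
    return overlaps


# Hypothetical toy signatures; names and gene sets are illustrative only.
toy_signatures = {
    "dataset_A": {"TP53", "BCL2", "CDK1"},
    "dataset_B": {"TP53", "PTEN", "JUN"},
    "dataset_C": {"FOS", "SRC", "IL6"},
}
overlap = pairwise_overlap(toy_signatures)
```
]]></preformat>
<p>A normalized measure such as the Jaccard index would be an equally plausible choice when signatures differ strongly in size, as the author signatures do.</p>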
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Overlap of the author (green) and the NetRank (blue) signatures for 25 breast cancer datasets. Author signatures generally do not overlap (green vs. green triangle in the bottom left). Author signatures hardly overlap with NetRank signatures (blue vs. green rectangle in the bottom right). NetRank signatures strongly overlap (blue vs. blue triangle in the top right).</p>
</caption>
<graphic xlink:href="fbinf-02-780229-g003.tif"/>
</fig>
<p>This finding agrees with Ein-Dor et&#x20;al., who observed a lack of similarity between signatures of different studies for the same prediction task (<xref ref-type="bibr" rid="B13">Ein-Dor et&#x20;al., 2006</xref>). While pathways in different tissues are formed from specific genes and proteins and while author signatures were introduced in this specific context, we aim to highlight the commonalities of datasets. Beyond the significant contribution of these signatures to capturing the genetic information underlying each cancer type and phenotype individually, there is a need to study the genomic processes and biological phenomena shared by cancers in general, regardless of particular cell types or tissues. This generalization of principles of cancer biology is inspired by the notion of cancer hallmarks, which helps in understanding the common mechanisms of tumor growth, cancer survival, and progression.</p>
<p>
<bold>Standard correlation signatures are dissimilar.</bold> One simple explanation for the small overlap between the signatures proposed by the original authors of the datasets is that they may have differed in pre-processing and normalizing the data and in the selection procedure for biomarkers. Therefore, we processed all datasets in the same manner (<xref ref-type="sec" rid="s2">Section 2</xref>) and devised a simple method to generate a biomarker signature per dataset. We correlated each gene with the desired phenotype and combined the 50 genes with the best correlation into a signature. <xref ref-type="sec" rid="s10">Supplementary Figure S4</xref> shows the pairwise overlap between these signatures. Again, there is hardly any overlap.</p>
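<p>The following sketch illustrates one way such a correlation signature could be computed. It assumes Pearson correlation and ranks genes by the absolute value of the coefficient; both choices, as well as the function names and toy data, are assumptions made for illustration, and the procedure actually used is the one described in Section 2.</p>
<preformat><![CDATA[
```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_signature(expression, phenotype, size=50):
    """expression: {gene: [value per sample]}; phenotype: [value per sample]."""
    # Assumption: "best correlation" means largest absolute coefficient.
    scores = {g: abs(pearson(v, phenotype)) for g, v in expression.items()}
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked[:size]


# Hypothetical toy data: four samples, binary phenotype.
toy_phenotype = [0, 0, 1, 1]
toy_expression = {
    "GENE_UP": [1, 2, 8, 9],
    "GENE_DOWN": [9, 8, 2, 1],
    "GENE_FLAT": [5, 5, 5, 5],
    "GENE_NOISE": [1, 9, 2, 8],
}
signature = correlation_signature(toy_expression, toy_phenotype, size=2)
```
]]></preformat>
<p>In the toy data, the two genes that track the phenotype (one positively, one negatively) are selected, while the constant and the uncorrelated gene are not.</p>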
<p>
<bold>NetRank signatures are similar.</bold> To focus on the common cancer characteristics, we employed a network-based approach that adds a new aspect to the biomarker selection process. It combines two forms of information in ranking biomarkers: first, the gene&#x2019;s correlation with the target phenotype as introduced above; second, the interactions between these genes (<xref ref-type="sec" rid="s2">Section&#x20;2</xref>).</p>
<p>After running NetRank on a dataset, we define the 50 genes with the highest NetRank score and a <italic>p</italic>-value below 0.01 as the NetRank signature for this dataset. Strong overlap was found between signatures of the same cancer type (see the overlap of breast cancer signatures in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> and <xref ref-type="sec" rid="s10">Supplementary Figure S5</xref>). Considerable overlap was noted even between different cancer types (<xref ref-type="sec" rid="s10">Supplementary Figure S6</xref>). We define the 50 biomarkers that occur most often across the NetRank signatures of a cancer type as the signature for that cancer type. <xref ref-type="sec" rid="s10">Supplementary Sheets S3</xref> present the signatures of the 13 cancer types used in further analysis to propose a universal cancer signature.</p>
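<p>To make the idea of combining correlation with interactions concrete, the sketch below uses a generic PageRank-style propagation: each gene keeps a share of its own correlation score and receives rank from its interaction partners. This is an assumption made for illustration and not a reproduction of the NetRank formulation in Section 2; the function name, the damping value of 0.5, the fixed iteration count, and the toy network are all hypothetical, and the <italic>p</italic>-value filter is not covered.</p>
<preformat><![CDATA[
```python
def netrank_sketch(scores, edges, damping=0.5, iterations=100):
    """PageRank-style propagation of per-gene scores over an undirected network.

    scores: {gene: non-negative correlation with the phenotype}
    edges:  iterable of (gene_a, gene_b) interaction pairs
    """
    neighbours = {g: set() for g in scores}
    for a, b in edges:
        if a in neighbours and b in neighbours and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    rank = dict(scores)
    for _ in range(iterations):
        updated = {}
        for gene in scores:
            # Each neighbour passes on its rank divided by its own degree.
            received = sum(rank[n] / len(neighbours[n]) for n in neighbours[gene])
            updated[gene] = (1 - damping) * scores[gene] + damping * received
        rank = updated
    return rank


# Hypothetical toy network: a weakly correlated hub with three partners.
toy_scores = {"HUB": 0.2, "A": 0.9, "B": 0.8, "C": 0.1, "LONE": 0.5}
toy_edges = [("HUB", "A"), ("HUB", "B"), ("HUB", "C")]
toy_rank = netrank_sketch(toy_scores, toy_edges)
top_gene = max(toy_rank, key=toy_rank.get)
```
]]></preformat>
<p>In this toy example the hub gene ends up ranked first although its own correlation is low, because it is connected to well-correlated partners. This is the kind of behaviour that favours central, well-connected genes.</p>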
<p>
<bold>NetRank is an outstanding feature selection technique.</bold> For each dataset, we created a feature selection set (70% of the samples), which NetRank uses, while the remaining 30% were kept unseen to serve as an evaluation set (<xref ref-type="fig" rid="F1">Figure&#x20;1</xref>). In the evaluation process, to avoid over-optimization of the outcome prediction with the signatures, we used a linear dimension reduction technique (PCA) instead of more complex non-linear methods such as machine learning with neural networks. Using the independent evaluation set, we evaluated the features by calculating the area under the ROC curve (AUC) of each dataset&#x2019;s best principal component in the PCA analysis. The closer the AUC is to 1, the better the predictive&#x20;model.</p>
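<p>A minimal sketch of this evaluation step follows, assuming a binary phenotype and NumPy. PCA is computed with a singular value decomposition of the centered evaluation matrix, and the AUC is obtained from the rank-sum identity. The number of components inspected, the handling of the arbitrary sign of a component, the function names, and the simulated data are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
import numpy as np


def auc_from_scores(scores, labels):
    """AUC via the rank-sum (Mann-Whitney) identity; labels are 0/1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def best_component_auc(X_eval, y_eval, n_components=3):
    """PCA on the evaluation matrix (samples x signature genes), then best AUC."""
    X = np.asarray(X_eval, dtype=float)
    X = X - X.mean(axis=0)
    # Rows of vt are the principal axes; project samples onto the leading ones.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    k = min(n_components, vt.shape[0])
    projected = X @ vt[:k].T
    best = 0.0
    for j in range(k):
        auc = auc_from_scores(projected[:, j], y_eval)
        # The sign of a principal component is arbitrary, so orient it.
        best = max(best, auc, 1.0 - auc)
    return best


# Simulated evaluation set: 40 samples, 5 signature genes, shifted second class.
rng = np.random.default_rng(0)
y_demo = np.array([0] * 20 + [1] * 20)
X_demo = rng.normal(size=(40, 5))
X_demo[20:] += 3.0
demo_auc = best_component_auc(X_demo, y_demo)
```
]]></preformat>
<p>Because PCA is unsupervised, the phenotype labels enter only when the AUC is computed, which matches the motivation given above for avoiding over-optimization.</p>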
<p>
<xref ref-type="fig" rid="F5">Figure&#x20;5</xref> and <xref ref-type="sec" rid="s10">Supplementary Sheets S2</xref> show that 74% of the datasets were classified with AUC better than 0.80. Thus, NetRank serves as an outstanding feature selection method in bringing biologically meaningful features without causing a considerable drop in performance.</p>
<p>We compared the performance of the NetRank&#x2019;s features with those chosen by the standard correction method. Statistically, the standard correlation features performed slightly better (78% of the datasets having AUC better than 0.80 compared to 74% for the NetRank features). Importantly, NetRank features were highly overlapped and biologically relevant.</p>
<p>
<bold>Compact and robust universal biomarker signature.</bold> Given the strong overlap between the 13 NetRank signatures illustrated in <xref ref-type="sec" rid="s10">Supplementary Figure S6</xref>, we asked whether it is possible to combine the individual NetRank signatures into one universal NetRank signature for all datasets. We took a consensus approach: we counted how often each gene was selected in any of the 13 NetRank signatures and defined the universal biomarker signature as the 50 genes that appear most frequently. These 50 biomarkers are illustrated in <xref ref-type="fig" rid="F4">Figure&#x20;4</xref>. Except for pancreatic and ovarian cancer, they were associated with all types of cancer.</p>
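<p>The consensus step can be sketched as a simple frequency count. The function name, the alphabetical tie-breaking, and the toy signatures are assumptions made for illustration; how ties at the cut-off were resolved is not stated here.</p>
<preformat><![CDATA[
```python
from collections import Counter


def consensus_signature(signatures, size=50):
    """Return the genes that occur in the most signatures, most frequent first."""
    # set(sig) ensures a gene is counted at most once per signature.
    counts = Counter(gene for sig in signatures.values() for gene in set(sig))
    # Assumption: ties are broken alphabetically for a deterministic result.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:size], counts


# Hypothetical per-cancer signatures, shortened to three genes each.
toy_cancer_signatures = {
    "breast": ["TP53", "BCL2", "CDK1"],
    "liver": ["TP53", "BCL2", "IL6"],
    "lung": ["TP53", "JUN", "SRC"],
}
universal, gene_counts = consensus_signature(toy_cancer_signatures, size=2)
```
]]></preformat>
<p>The returned counts correspond to the number of cancer types in which a gene was selected, which is the kind of value reported in the &#x201c;&#x23; Cancer&#x201d; column of Table&#x20;1.</p>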
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Universal NetRank signature across cancer types and hallmarks. Top 50 most frequent genes in the 105 NetRank signatures. <bold>(A)</bold>: Gene vs. cancer type. The color indicates how frequently the gene was part of a NetRank signature. All genes were selected as biomarkers in several cancer types. Pancreatic cancer stands out with hardly any genes present. <bold>(B)</bold>: Breakdown of the 50 genes of the universal NetRank signature by the ten hallmarks of cancer. All hallmarks are captured by the signature.</p>
</caption>
<graphic xlink:href="fbinf-02-780229-g004.tif"/>
</fig>
<p>With 50 genes, the universal signature is compact, which leaves the question of whether it is robust. The biomarkers in the universal signature are special. Due to the network-based approach, they comprise central and well-connected genes in the protein interaction network. They emerged from a consensus method and should therefore be widely applicable. To assess robustness, we had to evaluate the predictive power of the universal signature and define baselines for comparison. As baselines, we selected the standard correlation signature and the NetRank signature. Since both are optimized for each individual dataset, we expect the universal signature to perform less well than these two signatures.</p>
<p>We applied dimension reduction to the evaluation sets (30%) using the 50 features of the universal biomarker signature. Then we evaluated again by calculating the area under the ROC curve (AUC). Given that biomarkers such as CA19-9, which is widely used in pancreatic cancer diagnosis, achieve 70&#x2013;80%, we consider an AUC of 0.80 a success.</p>
<p>Overall, we found that the correlation signature reaches this level of performance for 78% of the datasets, the NetRank signatures for 74%, and the universal signature for 66% (<xref ref-type="fig" rid="F5">Figure&#x20;5</xref> and <xref ref-type="sec" rid="s10">Supplementary Sheets S2</xref>). A closer inspection reveals that predicting survival time, disease grades, and progression was more difficult than distinguishing cancer from control. Most of the datasets with an AUC below 0.80 involved these phenotypes (<xref ref-type="fig" rid="F5">Figure&#x20;5</xref>). In contrast, cancer and non-cancer samples can be separated very well.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Comparison of the prediction performance of the correlation (blue), NetRank (red), and universal NetRank (yellow) signatures. 105 datasets vs. performance measured as AUC. All signatures achieve a good AUC across all datasets. The universal signature has a performance comparable to the correlation and NetRank signatures, which are optimized per dataset. All results can be found in <xref ref-type="sec" rid="s10">Supplementary Sheets S2</xref>.</p>
</caption>
<graphic xlink:href="fbinf-02-780229-g005.tif"/>
</fig>
<p>Overall, all three approaches produce satisfactory statistical results for the majority of datasets. The key difference lies in the number of different genes that are necessary. Across all 105 datasets, the correlation signatures consist of 3,812 different genes, which is close to the 4,343 genes proposed in total by the authors of the datasets. For each test, we used the 50 of them that are optimized for that dataset. In contrast, the union of all 105 NetRank signatures already has a reduced size of 1,770 genes, and by definition, the universal signature comprises only 50 genes. Therefore, the universal signature is a compact condensation of the key genes that perform well across all&#x20;data.</p>
<p>
<bold>The universal biomarker signature relates to commercially available signatures.</bold> The protein-protein interactions between these biomarkers indicate their high connectivity and possible functional interaction in biological processes (<xref ref-type="sec" rid="s10">Supplementary Figure S7</xref>). We also compared our universal signature with tumor signatures currently in use. We found that three genes (PLK1, TOP2A, RAD51) are shared with the Prolaris prostate cancer signature (<xref ref-type="bibr" rid="B9">Crawford et&#x20;al., 2014</xref>), two genes (BCL2 and GAPDH) with the breast cancer signature Oncotype Dx (<xref ref-type="bibr" rid="B29">Paik et&#x20;al., 2004</xref>), two further genes (TP53, PTEN) with ColoNext (<xref ref-type="bibr" rid="B7">Colon Cancer Genetic Testing, 2021</xref>), one gene (GMPS) with the breast cancer signature Mammaprint (<xref ref-type="bibr" rid="B41">van &#x2019;t Veer et&#x20;al., 2002</xref>), and one gene (BCL2) with the Prosigna breast cancer test PAM50 (<xref ref-type="bibr" rid="B30">Parker et&#x20;al., 2009</xref>).</p>
<p>
<bold>The universal biomarker signature recovers known cancer hallmark genes</bold>. The universal signature is interpretable in the sense that it connects well to the hallmarks of cancer, although this information was not used to generate it. To assess this connection, we used the Cancer Hallmarks Genes (CHG) database designed by Zhang et&#x20;al. (<xref ref-type="bibr" rid="B46">Zhang et&#x20;al., 2020</xref>), which comprises 2,940 genes and their association with one or more hallmarks. We searched for the 50 genes of the universal signature and found that 31 are listed in the hallmarks database (<xref ref-type="table" rid="T1">Table&#x20;1</xref>). From the perspective of hallmarks, each of the ten hallmarks was represented by at least six genes in the universal signature. The most strongly represented hallmarks were &#x201c;sustaining proliferative signaling&#x201d; (23 genes), &#x201c;resisting cell death&#x201d; (22 genes), and &#x201c;activating invasion and metastasis&#x201d; (16 genes). At the level of individual biomarkers, we found the five biomarkers TGF&#x3b2;1, MAPK13, TP53, MAPK3, and NRAS in at least seven hallmarks and at least eight cancer types. They play well-defined roles in particular cancers such as breast, liver, and lung cancer and melanoma (<xref ref-type="bibr" rid="B45">Zarzynska, 2014</xref>; <xref ref-type="bibr" rid="B6">Cicenas et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B35">Silwal-Pandit et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B1">Afr&#x103;s&#xe2;nie et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B25">Lee et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B36">Stolfi et&#x20;al., 2020</xref>). BCL2 is another example: its gene rearrangements are used for diagnosis and treatment planning in lymphomas and leukemias (<xref ref-type="bibr" rid="B38">Targeted Cancer Therapies Fact Sheet&#x2014;National Cancer Institute, 2021</xref>). In our analysis, BCL2 was found in nine cancers and involved in five hallmarks. Furthermore, considering hallmark types and numbers, MAPK3 and NRAS showed the same profile, and both were involved in 9 of the 10 cancer hallmarks. We provide degree and betweenness centrality information for the genes of the universal signature in the different hallmark networks in <xref ref-type="sec" rid="s10">Supplementary Table&#x20;S2</xref>.</p>
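<p>The hallmark counts quoted above can be checked directly against the check marks in Table&#x20;1. The short sketch below transcribes the marked hallmark columns for the six genes discussed in the text; the variable names are hypothetical, and the hallmark numbers follow the numbering in the caption of Table&#x20;1.</p>
<preformat><![CDATA[
```python
# Hallmark columns (1-10) marked in Table 1 for the genes discussed in the text.
table1_hallmarks = {
    "TGFB1": {1, 2, 4, 6, 7, 9, 10},
    "MAPK13": {1, 3, 4, 5, 6, 7, 9, 10},
    "TP53": {1, 2, 4, 6, 7, 8, 9, 10},
    "MAPK3": {1, 2, 3, 4, 5, 6, 7, 9, 10},
    "NRAS": {1, 2, 3, 4, 5, 6, 7, 9, 10},
    "BCL2": {1, 2, 5, 6, 9},
}

# Number of hallmarks per gene.
hallmark_counts = {gene: len(marks) for gene, marks in table1_hallmarks.items()}

# MAPK3 and NRAS are described as sharing the same hallmark profile.
same_profile = table1_hallmarks["MAPK3"] == table1_hallmarks["NRAS"]

# Genes involved in at least seven hallmarks.
broad_genes = sorted(g for g, n in hallmark_counts.items() if n >= 7)
```
]]></preformat>
<p>With these rows, the five genes named in the text each reach seven or more hallmarks, BCL2 reaches five, and MAPK3 and NRAS share an identical profile of nine hallmarks.</p>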
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Universal NetRank signature genes and the hallmarks of cancer. Cancer Hallmarks: 1: Sustaining proliferative signaling; 2: Evading growth suppressors; 3: Evading immune destruction; 4: Enabling replicative immortality; 5: Tumor-promoting inflammation; 6: Activating invasion and metastasis; 7: Inducing angiogenesis; 8: Genome instability and mutation; 9: Resisting cell death; 10: Reprogramming energy metabolism. &#x201c;&#x23; Cancer&#x201d; indicates in how many cancer types a particular gene was found in the analysis. &#x201c;&#x221a;&#x201d; means that the gene (row) is involved in a pathway of the specific hallmark of cancer (column). SUM shows how many genes are involved in a particular cancer hallmark.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Gene symbol</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx1.tif"/>1</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx2.tif"/>2</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx3.tif"/>3</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx4.tif"/>4</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx5.tif"/>5</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx6.tif"/>6</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx7.tif"/>7</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx8.tif"/>8</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx9.tif"/>9</th>
<th align="center">
<inline-graphic xlink:href="fbinf-02-780229-fx10.tif"/>10</th>
<th align="center">&#x23; Cancer</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">LRRK2</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">11</td>
</tr>
<tr>
<td align="left">TGFB1</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">11</td>
</tr>
<tr>
<td align="left">TOP2A</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">10</td>
</tr>
<tr>
<td align="left">GART</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">10</td>
</tr>
<tr>
<td align="left">IL6</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">DECR1</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">CAT</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">EGR1</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">PDGFRB</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">PPARG</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">MAPK13</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">BCL2</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">RAD51</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">HSPA5</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">XPO1</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">APP</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">CDK5</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">9</td>
</tr>
<tr>
<td align="left">TSPO</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">TP53</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">SRC</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">JUN</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">ITGA2</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">FYN</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">MAPK3</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">ACTA2</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">PLK1</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">GMPS</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">CDK6</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">GAPDH</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">CDK1</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">FOS</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">CDK2</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">ISG15</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">NRAS</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">OASL</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">CDK4</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">RHOB</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">SMARCA2</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">HSP90AB1</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">PTEN</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">ACLY</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">ACACA</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">UMPS</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">HDAC1</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">NOTCH1</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">8</td>
</tr>
<tr>
<td align="left">UBC</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">CAD</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">ABL1</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">ACTC1</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">TLR4</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x221a;</td>
<td align="center">&#x2014;</td>
<td align="center">7</td>
</tr>
<tr>
<td align="left">SUM</td>
<td align="center">23</td>
<td align="center">11</td>
<td align="center">9</td>
<td align="center">6</td>
<td align="center">9</td>
<td align="center">16</td>
<td align="center">9</td>
<td align="center">7</td>
<td align="center">22</td>
<td align="center">15</td>
<td align="center">&#x2014;</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>While 31 genes in the universal signature are linked to the hallmarks, 19 are not. We investigated these 19 genes further and found that four of them are cancer drivers. LRRK2 (leucine-rich repeat kinase 2) has been reported to play an adaptive role between cancer and Parkinson&#x2019;s disease and has been proposed as a new target molecule for cancer therapy because of its increased kinase activity (<xref ref-type="bibr" rid="B42">War&#xf8; and Aasly, 2018</xref>; <xref ref-type="bibr" rid="B19">Kim and Jeong, 2020</xref>). GMPS (guanine monophosphate synthetase) is part of the MammaPrint gene list, which comprises known biomarkers for breast cancer described by Tian et&#x20;al. (<xref ref-type="bibr" rid="B39">Tian et&#x20;al., 2010</xref>). The other two examples are TOP2A and GART. TOP2A, which changes DNA topology in various DNA-associated processes (i.e.,&#x20;replication, chromosome segregation, recombination), is a well-known anti-cancer drug target. Almost 50% of chemotherapies include at least one TOP2A inhibitor, such as etoposide or doxorubicin (<xref ref-type="bibr" rid="B27">Nitiss, 2009</xref>). GART encodes the trifunctional purine biosynthetic protein adenosine-3, which is part of nucleotide metabolism, specifically purine metabolism. Cong et&#x20;al. associated GART with poor prognosis in hepatocellular carcinoma and reported that it promotes liver cancer cell proliferation (<xref ref-type="bibr" rid="B8">Cong et&#x20;al., 2014</xref>).</p>
<p>The most immediate and exciting result of this study is that network information helps select biomarkers that represent the hallmarks of cancer, although the hallmarks were not explicitly used in generating the biomarkers. Overall, we have demonstrated that NetRank, which combines interaction, expression, and phenotype data, can generate robust, compact, and interpretable biomarker signatures for cancer outcome prediction.</p>
<p>
<bold>The universal biomarker signature picks cancer drivers and drug targets.</bold> Evaluating the genes in the universal signature using The Cancer Genome Atlas (TCGA) reveals that most of them show some degree of somatic mutation in different cancers. We found simple somatic mutation frequencies ranging from 0.3% (TSPO) to 49.86% (TP53). In addition, we found 14 genes (TP53, PTEN, NOTCH1, NRAS, PDGFRB, ABL1, XPO1, HSP90AB1, PPARG, GMPS, JUN, CDK6, BCL2, CDK4) that are also listed as cancer driver genes in the Cancer Gene Census database. Driver genes are defined as genes containing mutations that have been causally implicated in cancer and that explain how dysfunction of these genes drives cancer. These results can be viewed in <xref ref-type="sec" rid="s10">Supplementary Sheets S2</xref>. Finally, some of our genes are already defined as drug targets for some&#x20;types of cancer. We found that most clinical trials, completed or&#x20;incomplete, targeted ABL1, BCL2, CDK1, CDK2, CDK5, FYN, PDGFRB, PLK1, TOP2 for various types of leukemia.</p>
</sec>
<sec id="s4">
<title>Discussion and Conclusion</title>
<p>Biomarkers play a vital role in cancer diagnosis and treatment. Composing suitable biomarker signatures is a complex problem, as it requires selecting a limited number of markers from a genome-wide screen. Consequently, many biomarker signatures reported in the literature are context-specific and do not overlap. This is not surprising, as the pathways in different tissues are formed from specific genes and proteins, and the published signatures were derived accordingly. In this work, we aimed to study shared characteristics of different cancers, taking into account the shared core functions of cancer in different organisms, which have been defined as the hallmarks of cancer. The hallmarks summarize and group these characteristics into ten principles, namely: sustaining proliferative signaling, evading growth suppressors, evading immune destruction, enabling replicative immortality, tumor-promoting inflammation, activating invasion and metastasis, inducing angiogenesis, genome instability and mutation, resisting cell death, and reprogramming energy metabolism (<xref ref-type="bibr" rid="B17">Hanahan and Weinberg, 2011</xref>).</p>
<p>In this study, we addressed this gap and employed a network-based method, NetRank, to identify robust biomarkers that perform well across many cancer types and phenotypes. We adapted a random surfer model, which incorporates gene expression, large-scale interaction data, and phenotypic data, into a feature selection model and applied it to the 105 datasets. We then aggregated the resulting biomarkers and focused on the most frequently selected ones. The result is a universal biomarker signature of 50 genes, which is very compact in comparison to the total of 4,343 distinct genes proposed in signatures of the original data. Using PCA, the universal NetRank signature showed very strong prediction performance across nearly all cancer types, except pancreatic cancer, and across all phenotypes. Thus, this signature is compact, robust, and performant, and it is linked to the hallmarks of cancer genes, although this information was not incorporated in the model. Over half of the genes in the NetRank signature are hallmark genes. Furthermore, a large number are cancer driver genes with a known mutation burden, and others are cancer drug targets. Thus, the use of networks in phenotype prediction leads to reliable, transferable, and interpretable biomarker signatures.</p>
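<p>The random surfer idea described above can be illustrated with a minimal sketch, assuming a PageRank-style update in which each gene&#x2019;s score mixes its own phenotype correlation with score propagated from its network neighbors. The function name <italic>random_surfer_rank</italic>, the damping value of 0.5, the toy network, and the correlation values are illustrative assumptions, not the implementation, parameters, or data used in this study.</p>
<preformat><![CDATA[
```python
import numpy as np


def random_surfer_rank(adjacency, correlation, damping=0.5, tol=1e-10, max_iter=1000):
    """Rank genes by iterating r = (1 - d) * s + d * T @ r to convergence.

    adjacency   : symmetric 0/1 matrix of an undirected interaction network
    correlation : per-gene correlation of expression with the phenotype
    damping     : weight d given to network propagation (assumed value)
    """
    adjacency = np.asarray(adjacency, dtype=float)
    # Personalization vector s: absolute correlations, normalized to sum to 1.
    s = np.abs(np.asarray(correlation, dtype=float))
    s = s / s.sum()
    # Column-normalize so each gene spreads its score evenly over its neighbors.
    degree = adjacency.sum(axis=0)
    degree[degree == 0] = 1.0  # isolated genes keep an all-zero column
    transition = adjacency / degree
    r = s.copy()
    for _ in range(max_iter):
        r_next = (1.0 - damping) * s + damping * (transition @ r)
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


# Toy network (hypothetical): gene 1 interacts with genes 0, 2, and 3.
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
toy_correlation = [0.9, 0.1, 0.2, 0.3]  # made-up phenotype correlations

scores = random_surfer_rank(toy_adjacency, toy_correlation)
ranking = np.argsort(-scores)  # gene indices, best first
print(ranking, np.round(scores, 3))
```
]]></preformat>
<p>In this toy example, the hub gene 1 has the weakest phenotype correlation yet receives the second-highest score because it inherits score from its three neighbors. This is the intuition behind favoring well-connected genes when selecting features for a signature.</p>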
<p>Pancreatic cancer and, to some extent, ovarian cancer are exceptions, as they share only a few biomarkers with the other cancers. This is probably due to the complexity of the genetic component of pancreatic cancer, which makes it difficult to explain. It is widely accepted that other low-penetrance genes play a role in pancreatic cancer (<xref ref-type="bibr" rid="B26">Milne et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B20">Klein and Westenberger, 2012</xref>; <xref ref-type="bibr" rid="B3">Al-Fatlawi et&#x20;al., 2021</xref>). When we looked more closely at biomarker studies covering both pancreatic and ovarian cancer, we noted that the two organs are similar in terms of tissue structure and that both are located in the endocrine system. It has been reported that the biomarkers CA19-9 and CA125 are used to detect both cancer types and that many studies have reported complexity in their genetic components (<xref ref-type="bibr" rid="B28">Nolen and Lokshin, 2014</xref>). Unfortunately, a satisfactory explanation for what makes these two organs so special is still lacking (<xref ref-type="bibr" rid="B40">Ueland, 2017</xref>; <xref ref-type="bibr" rid="B5">Brezgyte et&#x20;al., 2021</xref>). One interesting study on this subject is that of Yeung and colleagues, who reported in 2019 that both ovarian and pancreatic cancer are surrounded by cancer-associated fibroblasts (CAFs), which increase angiogenesis and metastasis in these cancers by releasing the microfibril-associated protein 5 (MFAP5) (<xref ref-type="bibr" rid="B44">Yeung et&#x20;al., 2019</xref>). However, this protein is not directly related to these two cancers, at least at the level of the cellular transcriptome. This suggests that both ovarian and pancreatic cancer are affected by microenvironmental factors rather than intracellular factors. Biomarker studies of these two cancer types should therefore also consider microenvironmental factors.</p>
<p>Regarding the data, our study includes mostly human datasets and a few mouse datasets. While including mouse datasets does not significantly influence the study, it adds valuable information and an indication of the study&#x2019;s replicability. <xref ref-type="sec" rid="s10">Supplementary Sheets S2</xref> shows that removing the mouse datasets has no notable influence on the results, as the genes included in our universal signature were identified mainly from human datasets. For example, our top five genes (LRRK2, TGFB1, TOP2A, GART, and IL6) were among the best 50 genes and were significant in 16, 18, 31, 17, and 17 human datasets, respectively, compared with only 4, 1, 3, 3, and 1 mouse datasets.</p>
<p>This study builds on biomarker signatures discovered over the last two decades from microarray data. This time period was necessary to turn these signatures into commercial products used in clinical practice. In the meantime, microarrays have been superseded by deep sequencing techniques. It would be interesting to explore our approach on RNA-Seq data. However, to date, microarray data is still much more abundant than RNA-Seq data. As an estimate of the ratio of microarray to RNA-Seq data, we queried PubMed for &#x201c;microarray cancer outcome prediction&#x201d; and for &#x201c;deep sequencing cancer outcome prediction&#x201d;. The former returned 19,000 papers, the latter 3,000. The former are spread over the two decades with 1,500 papers per year and a recent decrease due to the advent of RNA-Seq. The latter rise steeply, peaking at 750 papers. In a few years&#x2019; time, there will be sufficient RNA-Seq data to perform a similar analysis on this type of&#x20;data.</p>
<p>In summary, we have demonstrated that it is possible to compose biomarker signatures that build on common principles of cancer and consequently perform well on many cancer types and prediction tasks. This universal signature may serve as a starting point and as one building block for developing highly optimized and precise signatures for specific cancer types and outcome prediction&#x20;tasks.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/">https://www.ncbi.nlm.nih.gov/geo/</ext-link>.</p>
</sec>
<sec id="s6">
<title>Ethics Statement</title>
<p>Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Written informed consent was not obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>AA-F and MS conceived the study. NA collected data. AA-F evaluated data. AA-F, NA, CO, and NM analyzed data. CO provided the biological interpretation. AA-F, NA, CO, and MS wrote the paper.</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>We thank Alexander Mestiashvili for IT support and Christof Winter, Janine Roy, and Katja Lisa Linnemann for fruitful discussions.</p>
</ack>
<sec id="s10">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fbinf.2022.780229/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fbinf.2022.780229/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet4.docx" id="SM1" mimetype="application/docx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet3.xlsx" id="SM2" mimetype="application/xlsx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet1.xlsx" id="SM3" mimetype="application/xlsx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet2.xlsx" id="SM4" mimetype="application/xlsx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Afr&#x103;s&#xe2;nie</surname>
<given-names>V. A.</given-names>
</name>
<name>
<surname>Marinca</surname>
<given-names>M. V.</given-names>
</name>
<name>
<surname>Alexa-Stratulat</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Gafton</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>P&#x103;duraru</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Adavidoaiei</surname>
<given-names>A. M.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>KRAS, NRAS, BRAF, HER2 and Microsatellite Instability in Metastatic Colorectal Cancer - Practical Implications for the Clinician</article-title>. <source>Radiol. Oncol.</source> <volume>53</volume> (<issue>3</issue>), <fpage>265</fpage>&#x2013;<lpage>274</lpage>. <pub-id pub-id-type="doi">10.2478/raon-2019-0033</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Agresti</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>1992</year>). <article-title>A Survey of Exact Inference for Contingency Tables</article-title>. <source>Stat. Sci.</source> <volume>7</volume> (<issue>1</issue>), <fpage>131</fpage>&#x2013;<lpage>153</lpage>. <pub-id pub-id-type="doi">10.1214/ss/1177011454</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Al-Fatlawi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Malekian</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Garc&#xed;a</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Henschel</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Dahl</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Deep Learning Improves Pancreatic Cancer Diagnosis Using RNA-Based Variants</article-title>. <source>Cancers (Basel)</source> <volume>13</volume> (<issue>11</issue>), <fpage>2654</fpage>. <pub-id pub-id-type="doi">10.3390/cancers13112654</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barter</surname>
<given-names>R. L.</given-names>
</name>
<name>
<surname>Schramm</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Mann</surname>
<given-names>G. J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y. H.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Network-based Biomarkers Enhance Classical Approaches to Prognostic Gene Expression Signatures</article-title>. <source>BMC Syst. Biol.</source> <volume>8</volume>, <fpage>S5</fpage>. <pub-id pub-id-type="doi">10.1186/1752-0509-8-S4-S5</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brezgyte</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Jach</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Crnogorac-Jurcevic</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Non-Invasive Biomarkers for Earlier Detection of Pancreatic Cancer-A Comprehensive Review</article-title>. <source>Cancers (Basel)</source> <volume>13</volume> (<issue>11</issue>), <fpage>2722</fpage>. <pub-id pub-id-type="doi">10.3390/cancers13112722</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cicenas</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Tamosaitis</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Kvederaviciute</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Tarvydas</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Staniute</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kalyan</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>KRAS, NRAS and BRAF Mutations in Colorectal Cancer and Melanoma</article-title>. <source>Med. Oncol.</source> <volume>34</volume> (<issue>2</issue>), <fpage>26</fpage>. <pub-id pub-id-type="doi">10.1007/s12032-016-0879-9</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="web">
<collab>Colon Cancer Genetic Testing</collab> (<year>2021</year>). <article-title>ColoNext Ambry Genetics</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://www.ambrygen.com/providers/genetic-testing/6/oncology/colonext">https://www.ambrygen.com/providers/genetic-testing/6/oncology/colonext</ext-link>.</comment> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cong</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Cai</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>Increased Expression of Glycinamide Ribonucleotide Transformylase Is Associated with a Poor Prognosis in Hepatocellular Carcinoma, and it Promotes Liver Cancer Cell Proliferation</article-title>. <source>Hum. Pathol.</source> <volume>45</volume> (<issue>7</issue>), <fpage>1370</fpage>&#x2013;<lpage>1378</lpage>. <pub-id pub-id-type="doi">10.1016/j.humpath.2013.11.021</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crawford</surname>
<given-names>E. D.</given-names>
</name>
<name>
<surname>Scholz</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Kar</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Fegan</surname>
<given-names>J.&#x20;E.</given-names>
</name>
<name>
<surname>Haregewoin</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kaldate</surname>
<given-names>R. R.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>Cell Cycle Progression Score and Treatment Decisions in Prostate Cancer: Results from an Ongoing Registry</article-title>. <source>Curr. Med. Res. Opin.</source> <volume>30</volume> (<issue>6</issue>), <fpage>1025</fpage>&#x2013;<lpage>1031</lpage>. <pub-id pub-id-type="doi">10.1185/03007995.2014.899208</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Durinck</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Moreau</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kasprzyk</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>De Moor</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Brazma</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2005</year>). <article-title>BioMart and Bioconductor: A Powerful Link between Biological Databases and Microarray Data Analysis</article-title>. <source>Bioinformatics</source> <volume>21</volume> (<issue>16</issue>), <fpage>3439</fpage>&#x2013;<lpage>3440</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bti525</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dutkowski</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ideker</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Protein Networks as Logic Functions in Development and Cancer</article-title>. <source>Plos Comput. Biol.</source> <volume>7</volume> (<issue>9</issue>), <fpage>e1002180</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002180</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Domrachev</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Lash</surname>
<given-names>A. E.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>Gene Expression Omnibus: NCBI Gene Expression and Hybridization Array Data Repository</article-title>. <source>Nucleic Acids Res.</source> <volume>30</volume> (<issue>1</issue>), <fpage>207</fpage>&#x2013;<lpage>210</lpage>. <pub-id pub-id-type="doi">10.1093/nar/30.1.207</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ein-Dor</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zuk</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Domany</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Thousands of Samples Are Needed to Generate a Robust Gene List for Predicting Outcome in Cancer</article-title>. <source>Proc. Natl. Acad. Sci. U S A.</source> <volume>103</volume> (<issue>15</issue>), <fpage>5923</fpage>&#x2013;<lpage>5928</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0601231103</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gaulton</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Hersey</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Nowotka</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bento</surname>
<given-names>A. P.</given-names>
</name>
<name>
<surname>Chambers</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Mendez</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>The ChEMBL Database in 2017</article-title>. <source>Nucleic Acids Res.</source> <volume>45</volume> (<issue>D1</issue>), <fpage>D945</fpage>&#x2013;<lpage>D954</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkw1074</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gautier</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Cope</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Bolstad</surname>
<given-names>B. M.</given-names>
</name>
<name>
<surname>Irizarry</surname>
<given-names>R. A.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>affy--analysis of Affymetrix GeneChip Data at the Probe Level</article-title>. <source>Bioinformatics</source> <volume>20</volume> (<issue>3</issue>), <fpage>307</fpage>&#x2013;<lpage>315</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btg405</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="web">
<collab>GDC</collab> (<year>2021</year>). <article-title>Harmonized Cancer Datasets Genomic Data Commons Data Portal</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://portal.gdc.cancer.gov/">https://portal.gdc.cancer.gov/</ext-link>.</comment> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanahan</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Weinberg</surname>
<given-names>R. A.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Hallmarks of Cancer: The Next Generation</article-title>. <source>Cell</source> <volume>144</volume> (<issue>5</issue>), <fpage>646</fpage>&#x2013;<lpage>674</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2011.02.013</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanahan</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Weinberg</surname>
<given-names>R. A.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>The Hallmarks of Cancer</article-title>. <source>Cell</source> <volume>100</volume> (<issue>1</issue>), <fpage>57</fpage>&#x2013;<lpage>70</lpage>. <pub-id pub-id-type="doi">10.1016/s0092-8674(00)81683-9</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>Y. C.</given-names>
</name>
<name>
<surname>Jeong</surname>
<given-names>B. H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Identification of Somatic Mutations in Dementia-Related Genes in Cancer Patients</article-title>. <source>Curr. Alzheimer Res.</source> <volume>17</volume> (<issue>9</issue>), <fpage>835</fpage>&#x2013;<lpage>844</lpage>. <pub-id pub-id-type="doi">10.2174/1567205017666201203124341</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klein</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Westenberger</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Genetics of Parkinson&#x2019;s Disease</article-title>. <source>Cold Spring Harbor Perspect. Med.</source> <volume>2</volume>, <fpage>a008888</fpage>. <pub-id pub-id-type="doi">10.1101/cshperspect.a008888</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ko</surname>
<given-names>J.&#x20;H.</given-names>
</name>
<name>
<surname>Ko</surname>
<given-names>E. A.</given-names>
</name>
<name>
<surname>Gu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Lim</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Bang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Expression Profiling of Ion Channel Genes Predicts Clinical Outcome in Breast Cancer</article-title>. <source>Mol. Cancer</source> <volume>12</volume> (<issue>1</issue>), <fpage>106</fpage>. <pub-id pub-id-type="doi">10.1186/1476-4598-12-106</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koprowski</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Herlyn</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Steplewski</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Sears</surname>
<given-names>H. F.</given-names>
</name>
</person-group> (<year>1981</year>). <article-title>Specific Antigen in Serum of Patients with colon Carcinoma</article-title>. <source>Science</source> <volume>212</volume> (<issue>4490</issue>), <fpage>53</fpage>&#x2013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1126/science.6163212</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koscielny</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>An</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Carvalho-Silva</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Cham</surname>
<given-names>J.&#x20;A.</given-names>
</name>
<name>
<surname>Fumis</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gasparyan</surname>
<given-names>R.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>Open Targets: A Platform for Therapeutic Target Identification and Validation</article-title>. <source>Nucleic Acids Res.</source> <volume>45</volume> (<issue>D1</issue>), <fpage>D985</fpage>&#x2013;<lpage>D994</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkw1055</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Langfelder</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Horvath</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>WGCNA: An R Package for Weighted Correlation Network Analysis</article-title>. <source>BMC Bioinformatics</source> <volume>9</volume>, <fpage>559</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-9-559</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Rauch</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kolch</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Targeting MAPK Signaling in Cancer: Mechanisms of Drug Resistance and Sensitivity</article-title>. <source>Int. J.&#x20;Mol. Sci.</source> <volume>21</volume> (<issue>3</issue>), <fpage>E1102</fpage>. <pub-id pub-id-type="doi">10.3390/ijms21031102</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Milne</surname>
<given-names>R. L.</given-names>
</name>
<name>
<surname>Greenhalf</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Murta-Nascimento</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Real</surname>
<given-names>F. X.</given-names>
</name>
<name>
<surname>Malats</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>The Inherited Genetic Component of Sporadic Pancreatic Adenocarcinoma</article-title>. <source>Pancreatology</source> <volume>9</volume> (<issue>3</issue>), <fpage>206</fpage>&#x2013;<lpage>214</lpage>. <pub-id pub-id-type="doi">10.1159/000210261</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nitiss</surname>
<given-names>J.&#x20;L.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Targeting DNA Topoisomerase II in Cancer Chemotherapy</article-title>. <source>Nat. Rev. Cancer</source> <volume>9</volume> (<issue>5</issue>), <fpage>338</fpage>&#x2013;<lpage>350</lpage>. <pub-id pub-id-type="doi">10.1038/nrc2607</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nolen</surname>
<given-names>B. M.</given-names>
</name>
<name>
<surname>Lokshin</surname>
<given-names>A. E.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Chapter 45&#x2014;Pancreatic and Ovarian Cancer Biomarkers</article-title>. <source>Biomarkers Toxicol.</source>, <fpage>759</fpage>. <pub-id pub-id-type="doi">10.1016/B978-0-12-404630-6.00045-2</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paik</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Shak</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cronin</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2004</year>). <article-title>A Multigene Assay to Predict Recurrence of Tamoxifen-Treated, Node-Negative Breast Cancer</article-title>. <source>N. Engl. J.&#x20;Med.</source> <volume>351</volume> (<issue>27</issue>), <fpage>2817</fpage>&#x2013;<lpage>2826</lpage>. <pub-id pub-id-type="doi">10.1056/NEJMoa041588</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Parker</surname>
<given-names>J.&#x20;S.</given-names>
</name>
<name>
<surname>Mullins</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Cheang</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Leung</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Voduc</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Vickery</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2009</year>). <article-title>Supervised Risk Predictor of Breast Cancer Based on Intrinsic Subtypes</article-title>. <source>J.&#x20;Clin. Oncol.</source> <volume>27</volume> (<issue>8</issue>), <fpage>1160</fpage>&#x2013;<lpage>1167</lpage>. <pub-id pub-id-type="doi">10.1200/JCO.2008.18.1370</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedregosa</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Varoquaux</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Gramfort</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Michel</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Thirion</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Grisel</surname>
<given-names>O.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Scikit-learn: Machine Learning in Python</article-title>. <source>J.&#x20;Machine Learn. Res.</source> <volume>12</volume> (<issue>85</issue>), <fpage>2825</fpage>&#x2013;<lpage>2830</lpage>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="http://jmlr.org/papers/v12/pedregosa11a.html">http://jmlr.org/papers/v12/pedregosa11a.html</ext-link>.</comment> </citation>
</ref>
<ref id="B32">
<citation citation-type="book">
<collab>R Core Team.</collab> (<year>2020</year>). <source>R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing</source>. <publisher-loc>Vienna, Austria</publisher-loc>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://www.r-bloggers.com/2018/06/its-easy-to-cite-and-reference-r/">https://www.r-bloggers.com/2018/06/its-easy-to-cite-and-reference-r/</ext-link>.</comment> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roy</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Winter</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Isik</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Schroeder</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Network Information Improves Cancer Outcome Prediction</article-title>. <source>Brief. Bioinform.</source> <volume>15</volume> (<issue>4</issue>), <fpage>612</fpage>&#x2013;<lpage>625</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbs083</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Beauchamp</surname>
<given-names>R. D.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>A Network-Based Gene Expression Signature Informs Prognosis and Treatment for Colorectal Cancer Patients</article-title>. <source>PLoS ONE</source> <volume>7</volume> (<issue>7</issue>), <fpage>e41292</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0041292</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Silwal-Pandit</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Langer&#xf8;d</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>B&#xf8;rresen-Dale</surname>
<given-names>A.-L.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>TP53 Mutations in Breast and Ovarian Cancer</article-title>. <source>Cold Spring Harbor Perspect. Med.</source> <volume>7</volume>, <fpage>a026252</fpage>. <pub-id pub-id-type="doi">10.1101/cshperspect.a026252</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stolfi</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Troncone</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Marafini</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Monteleone</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Role of TGF-Beta and Smad7 in Gut Inflammation, Fibrosis and Cancer</article-title>. <source>Biomolecules</source> <volume>11</volume> (<issue>1</issue>), <fpage>E17</fpage>. <pub-id pub-id-type="doi">10.3390/biom11010017</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Szklarczyk</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Gable</surname>
<given-names>A. L.</given-names>
</name>
<name>
<surname>Lyon</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Junge</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Wyder</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Huerta-Cepas</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>STRING V11: Protein-Protein Association Networks with Increased Coverage, Supporting Functional Discovery in Genome-wide Experimental Datasets</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume> (<issue>D1</issue>), <fpage>D607</fpage>&#x2013;<lpage>D613</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gky1131</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="web">
<collab>Targeted Cancer Therapies Fact Sheet&#x2014;National Cancer Institute</collab> (<year>2021</year>). <article-title>Targeted Cancer Therapies Fact Sheet&#x2014;National Cancer Institute</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://www.cancer.gov/about-cancer/treatment/types/targeted-therapies/targeted-therapies-fact-sheet">https://www.cancer.gov/about-cancer/treatment/types/targeted-therapies/targeted-therapies-fact-sheet</ext-link>.</comment> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Roepman</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>van &#x27;t Veer</surname>
<given-names>L. J.</given-names>
</name>
<name>
<surname>de Snoo</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Glas</surname>
<given-names>A. M.</given-names>
</name>
<etal/>
</person-group> (<year>2010</year>). <article-title>Biological Functions of the Genes in the Mammaprint Breast Cancer Profile Reflect the Hallmarks of Cancer</article-title>. <source>Biomark. Insights</source> <volume>5</volume>, <fpage>129</fpage>&#x2013;<lpage>138</lpage>. <pub-id pub-id-type="doi">10.4137/BMI.S6184</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ueland</surname>
<given-names>F. R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>A Perspective on Ovarian Cancer Biomarkers: Past, Present and Yet-To-Come</article-title>. <source>Diagnostics (Basel)</source> <volume>7</volume> (<issue>1</issue>), <fpage>E14</fpage>. <pub-id pub-id-type="doi">10.3390/diagnostics7010014</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>van &#x27;t Veer</surname>
<given-names>L. J.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>van de Vijver</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>Y. D.</given-names>
</name>
<name>
<surname>Hart</surname>
<given-names>A. A.</given-names>
</name>
<name>
<surname>Mao</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2002</year>). <article-title>Gene Expression Profiling Predicts Clinical Outcome of Breast Cancer</article-title>. <source>Nature</source> <volume>415</volume> (<issue>6871</issue>), <fpage>530</fpage>&#x2013;<lpage>536</lpage>. <pub-id pub-id-type="doi">10.1038/415530a</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>War&#xf8;</surname>
<given-names>B. J.</given-names>
</name>
<name>
<surname>Aasly</surname>
<given-names>J.&#x20;O.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Exploring Cancer in LRRK2 Mutation Carriers and Idiopathic Parkinson&#x2019;s Disease</article-title>. <source>Brain Behav.</source> <volume>8</volume>, <fpage>e00858</fpage>. <pub-id pub-id-type="doi">10.1002/brb3.858</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Winter</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Kristiansen</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kersting</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Roy</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Aust</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kn&#xf6;sel</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2012</year>). <article-title>Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes</article-title>. <source>PLoS Comput. Biol.</source> <volume>8</volume> (<issue>5</issue>), <fpage>e1002511</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002511</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yeung</surname>
<given-names>T. L.</given-names>
</name>
<name>
<surname>Leung</surname>
<given-names>C. S.</given-names>
</name>
<name>
<surname>Yip</surname>
<given-names>K. P.</given-names>
</name>
<name>
<surname>Sheng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Vien</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Bover</surname>
<given-names>L. C.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Anticancer Immunotherapy by MFAP5 Blockade Inhibits Fibrosis and Enhances Chemosensitivity in Ovarian and Pancreatic Cancer</article-title>. <source>Clin. Cancer Res.</source> <volume>25</volume> (<issue>21</issue>), <fpage>6417</fpage>&#x2013;<lpage>6428</lpage>. <pub-id pub-id-type="doi">10.1158/1078-0432.CCR-19-0187</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zarzynska</surname>
<given-names>J.&#x20;M.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Two Faces of TGF-Beta1 in Breast Cancer</article-title>. <source>Mediators Inflamm.</source> <volume>2014</volume>, <fpage>141747</fpage>. <pub-id pub-id-type="doi">10.1155/2014/141747</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Huo</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>CHG: A Systematically Integrated Database of Cancer Hallmark Genes</article-title>. <source>Front. Genet.</source> <volume>11</volume>, <fpage>29</fpage>. <pub-id pub-id-type="doi">10.3389/fgene.2020.00029</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>