<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article article-type="methods-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">839949</article-id>
<article-id pub-id-type="doi">10.3389/fgene.2022.839949</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks</article-title>
<alt-title alt-title-type="left-running-head">Wang et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">ELF-DPC</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Rongquan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1513870/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Ma</surname>
<given-names>Huimin</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1514047/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Caixia</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1587485/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>School of Computer and Communication Engineering</institution>, <institution>University of Science and Technology Beijing</institution>, <addr-line>Beijing</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>School of International Economics</institution>, <institution>China Foreign Affairs University</institution>, <addr-line>Beijing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/716733/overview">Yichuan Liu</ext-link>, Children&#x2019;s Hospital of Philadelphia (CHOP), United&#x20;States</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/632487/overview">Min Wu</ext-link>, Institute for Infocomm Research (A&#x2217;STAR), Singapore</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/862620/overview">Tiantian He</ext-link>, Technology and Research (A&#x2217;STAR), Singapore</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Huimin Ma, <email>mhmpub@ustb.edu.cn</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>02</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>839949</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>12</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>01</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Wang, Ma and Wang.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Wang, Ma and Wang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Detecting protein complexes is key to understanding the principles of cellular organization and cellular processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. Most computational methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods do not need training datasets, but they can only detect protein complexes with one or a few topological structures. Supervised learning-based methods can detect protein complexes with different topological structures. However, they usually rely on a single type of training model, and a single model generalizes poorly. Therefore, we propose an Ensemble Learning Framework for Detecting Protein Complexes (ELF-DPC) within protein-protein interaction (PPI) networks to address these challenges. ELF-DPC first constructs a weighted PPI network by combining topological and biological information. Second, it mines protein complex cores using a protein complex core mining strategy that we designed. Third, it obtains an ensemble learning model by integrating structural modularity and a trained voting regressor model. Finally, it extends the protein complex cores to form protein complexes using a graph heuristic search strategy. The experimental results demonstrate that ELF-DPC performs better than twelve state-of-the-art approaches. Moreover, functional enrichment analysis shows that ELF-DPC can detect biologically meaningful protein complexes. The code and datasets are freely available at <ext-link ext-link-type="uri" xlink:href="https://github.com/RongquanWang/ELF-DPC">https://github.com/RongquanWang/ELF-DPC</ext-link>.</p>
</abstract>
<kwd-group>
<kwd>protein complexes</kwd>
<kwd>protein-protein interaction networks</kwd>
<kwd>graph clustering algorithms</kwd>
<kwd>ensemble learning</kwd>
<kwd>network embedding</kwd>
<kwd>biological information</kwd>
</kwd-group>
<contract-num rid="cn002">U20B2062 62172036</contract-num>
<contract-sponsor id="cn001">Fundamental Research Funds for the Central Universities<named-content content-type="fundref-id">10.13039/501100012226</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Most real-world complex systems, such as biological systems and human society, can be represented as complex networks. Social networks, biological networks, brain networks, citation networks, and protein-protein interaction (PPI) networks are examples of complex networks (<xref ref-type="bibr" rid="B46">Pourkazemi and Keyvanpour, 2017</xref>). Community detection, which is essential in many fields, aims to identify clusters that have high internal connectivity and are well separated from the rest of the network. Over the past several years, the study of community identification in complex networks has grown popular. Community detection is a fundamental problem in network analysis that tries to mine the hidden structure of a given complex network (<xref ref-type="bibr" rid="B10">Fortunato, 2010</xref>; <xref ref-type="bibr" rid="B1">Abduljabbar et&#x20;al., 2020</xref>). In bioinformatics, a crucial topic is mining protein complexes in PPI networks. Proteins usually interact with each other, forming protein complexes to accomplish their biological functions (<xref ref-type="bibr" rid="B13">Gavin et&#x20;al., 2002</xref>; <xref ref-type="bibr" rid="B53">Spirin and Mirny, 2003</xref>). A community structure in a PPI network may correspond to a natural protein complex, and the proteins in a protein complex should be highly interconnected (<xref ref-type="bibr" rid="B14">Girvan and Newman, 2002</xref>; <xref ref-type="bibr" rid="B6">Chen et&#x20;al., 2014</xref>). Predicting protein complexes is essential for studying cellular organization and understanding protein complex formation. Biologically, a protein complex is a group of proteins that interact at the same time and place. Detecting protein complexes through biological experiments is both costly and time-consuming. 
With the development of high-throughput experimental methods, many PPI networks have been produced, which usually have small-world, scale-free, and modular characteristics. They can be formulated as graphs in which the nodes represent proteins and the edges represent interactions. Therefore, many computational algorithms offer alternative ways to automatically discover protein complexes from PPI networks. Related work is reviewed in the following subsection.</p>
<sec id="s1-1">
<title>1.1 Related Work</title>
<p>During the past decade, various computational methods have been presented to identify protein complexes in PPI networks. We briefly review the related work in three categories: unsupervised learning-based methods, model optimization-based methods, and supervised learning-based methods.</p>
<sec id="s1-1-1">
<title>1.1.1 Unsupervised Learning-Based Methods</title>
<p>Many researchers hypothesize that subgraphs with particular topological structures in PPI networks, such as dense subgraphs, k-cliques, and core-attachment structures, are real protein complexes (<xref ref-type="bibr" rid="B55">Wang et&#x20;al., 2010</xref>). Most of these methods use a global heuristic search, a local heuristic search, or both. Meanwhile, some methods integrate topological and biological information to further improve the accuracy of detecting protein complexes.</p>
<p>Many local heuristic-based methods have been proposed to identify protein complexes. For instance, Altaf-Ul-Amin et&#x20;al. (<xref ref-type="bibr" rid="B3">Altaf-Ul-Amin et&#x20;al., 2006</xref>) developed DPClus, which generates clusters by ensuring density and checking the periphery of the clusters. Gavin et&#x20;al. (<xref ref-type="bibr" rid="B12">Gavin et&#x20;al., 2006</xref>) studied the organization of protein complexes, demonstrating that a protein complex generally contains a unique protein complex core and attachment proteins, called a core-attachment structure. Here, the proteins in a protein complex core have relatively more reliable interactions among themselves. The attachment proteins surround the protein complex core and assist it in performing related functions (<xref ref-type="bibr" rid="B29">Lakizadeh et&#x20;al., 2015</xref>). Wu et&#x20;al. (<xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2009</xref>) proposed a classic protein complex discovery method (COACH) using the core-attachment structure. COACH first detects protein complex cores and then identifies their attachment proteins to form whole protein complexes. Peng et&#x20;al. (<xref ref-type="bibr" rid="B45">Peng et&#x20;al., 2014</xref>) designed a PageRank Nibble strategy to give adjacent proteins different probabilities with core-attachment structures and proposed WPNCA to predict protein complexes. Nepusz et&#x20;al. (<xref ref-type="bibr" rid="B42">Nepusz et&#x20;al., 2012</xref>) presented ClusterONE, which utilizes a greedy growth process to mine subgraphs with high cohesiveness that may be protein complexes. Recently, Wang et&#x20;al. (<xref ref-type="bibr" rid="B59">Wang et&#x20;al., 2020</xref>) presented a new graph clustering method using a local heuristic search strategy to detect static and dynamic protein complexes. These local heuristic methods have strong local search ability, but finding an optimal global solution is difficult.</p>
<p>Meanwhile, some global heuristic-based methods have been proposed to identify protein complexes. In 2009, Liu et&#x20;al. (<xref ref-type="bibr" rid="B35">Liu et&#x20;al., 2009</xref>) used an iterative method to weight PPI networks and developed a maximal clique-based method (CMC) to discover protein complexes from weighted PPI networks. Inspired by the hierarchical organization of GO annotations and known protein complexes, Wang et&#x20;al. (<xref ref-type="bibr" rid="B24">Wang et&#x20;al., 2012</xref>) proposed OH-PIN, which is based on the concepts of overlapping M-clusters, <italic>&#x3bb;</italic>-module, and clustering coefficients to detect both overlapping and hierarchical protein complexes in PPI networks. PC2P (<xref ref-type="bibr" rid="B43">Omranian et&#x20;al., 2021</xref>) is a parameter-free greedy approximation algorithm that casts the problem of protein complex detection as a network partitioning into biclique spanned subgraphs, which include both sparse and dense subgraphs. Although these global heuristic search methods have a strong global search ability, they require considerable time and computing resources.</p>
<p>Recently, some methods based on network embedding strategies have been used to detect protein complexes. DPC-HCNE (<xref ref-type="bibr" rid="B40">Meng et&#x20;al., 2019</xref>) is a protein complex detection method based on hierarchical compressing network embedding and core-attachment structures. It can preserve both the local and the global topological information of a PPI network. CPredictor 5.0 (<xref ref-type="bibr" rid="B66">Yao et&#x20;al., 2019</xref>) uses the network embedding method Node2Vec (<xref ref-type="bibr" rid="B15">Grover and Leskovec, 2016</xref>) to learn node feature vector representations and then calculates the node embedding similarity and the functional similarity between interacting proteins to construct a weighted PPI network. These methods illustrate that employing network embedding can improve the accuracy of protein complex identification.</p>
<p>It is well known that PPI networks contain many false-positive and false-negative interactions, i.e.,&#x20;noise. To overcome the noise of the PPI networks, some studies try to exploit biological information, such as gene expression data (<xref ref-type="bibr" rid="B25">Keretsu and Sarmah, 2016</xref>), gene ontology (GO) data (<xref ref-type="bibr" rid="B57">Wang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B66">Yao et&#x20;al., 2019</xref>), and subcellular localization data (<xref ref-type="bibr" rid="B31">Lei et&#x20;al., 2018</xref>) to complement the interactions in PPI networks. CPredictor2.0 (<xref ref-type="bibr" rid="B65">Xu et&#x20;al., 2017</xref>) first groups proteins based on functional annotations and then applies the MCL algorithm to detect dense clusters as protein complexes. Zhang et&#x20;al. (<xref ref-type="bibr" rid="B74">Zhang et&#x20;al., 2016</xref>) calculated the active time point and the active probability of each protein, constructed dynamic PPI networks, and then proposed a method based on the core-attachment structure. Zhang et&#x20;al. (<xref ref-type="bibr" rid="B72">Zhang et&#x20;al., 2019</xref>) proposed a method based on the core-attachment structure and a seed expansion strategy to identify protein complexes using the topological structure and biological data in static PPI networks. ICJointLE (<xref ref-type="bibr" rid="B72">Zhang et&#x20;al., 2019</xref>) identifies protein complexes using the features of joint colocalization and joint coexpression in static PPI networks. NNP (<xref ref-type="bibr" rid="B60">Zhang et&#x20;al., 2021</xref>) recognizes protein complexes using topological and biological characteristics. 
Some methods (<xref ref-type="bibr" rid="B70">Zaki et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B57">Wang et&#x20;al., 2019</xref>) use topological information to weight interactions in PPI networks. For example, PEWCC (<xref ref-type="bibr" rid="B70">Zaki et&#x20;al., 2013</xref>) is a graph mining method that first assesses the reliability of the interactions and then detects protein complexes based on the concept of the weighted clustering coefficient. These methods have shown that the accuracy of protein complex identification can be significantly improved by integrating network topological structure and multiple sources of biological information.</p>
</sec>
<sec id="s1-1-2">
<title>1.1.2 Model Optimization-Based Methods</title>
<p>Several recent methods treat the identification of protein complexes or community structures as an optimization problem over network topology and protein attributes. For example, RNSC (<xref ref-type="bibr" rid="B26">King et&#x20;al., 2004</xref>) attempts to find an optimal set of partitions of a PPI network graph by employing different cost functions for detecting protein complexes. RSGNM (<xref ref-type="bibr" rid="B64">Zhang et&#x20;al., 2012</xref>) is a regularized sparse generative network model that adds another process that generates propensities into an existing generative network model for protein complex identification. EGCPI (<xref ref-type="bibr" rid="B17">He and Chan, 2016</xref>) formulates the problem as an optimization problem to mine the optimal clusters with densely connected vertices in the PPI networks to discover protein complexes. DPCA (<xref ref-type="bibr" rid="B23">Hu et&#x20;al., 2018</xref>) formulates the problem of detecting protein complexes as a constrained optimization problem according to protein complexes&#x2019; topological and biological properties. In particular, it is an algorithm with high efficiency and effectiveness. GMFTP (<xref ref-type="bibr" rid="B73">Zhang et&#x20;al., 2014</xref>) is a generative model that simulates the generative processes of topological and biological information; clusters that maximize the likelihood of generating the given PPI network are considered protein complexes. DCAFP (<xref ref-type="bibr" rid="B22">Hu and Chan, 2015</xref>) transforms the problem of identifying protein complexes into a constrained optimization problem and introduces an optimization model by considering the integration of functional preferences and dense structures. He et&#x20;al. 
(<xref ref-type="bibr" rid="B18">He et&#x20;al., 2019</xref>) introduced a graph clustering model called contextual correlation preserving multiview featured graph clustering (CCPMVFGC) for discovering communities in graphs with multiview features, viewwise correlations of pairwise features and the graph topology. VVAMo (<xref ref-type="bibr" rid="B19">He et&#x20;al., 2021a</xref>) is a matrix factorization-based model for detecting communities in complex networks. It proposes a unified likelihood function and derives an alternating algorithm for learning the optimal parameters of the model. In 2017, Zhang et&#x20;al. (<xref ref-type="bibr" rid="B75">Zhang et&#x20;al., 2017</xref>) proposed a firefly clustering algorithm that transforms the protein complex detection problem into an optimization problem. IMA (<xref ref-type="bibr" rid="B58">Wang et&#x20;al., 2021</xref>) is an improved memetic algorithm that optimizes a fitness function to detect protein complexes. These model optimization-based methods usually have many parameters and variables, and the parameter optimization process is time-consuming. Nevertheless, they show that the identification of protein complexes can usefully be cast as an optimization problem.</p>
</sec>
<sec id="s1-1-3">
<title>1.1.3 Supervised Learning-Based Methods</title>
<p>The methods mentioned above are either unsupervised learning-based or model optimization-based methods that identify protein complexes using predefined assumptions and fixed models. Unsupervised learning-based methods do not need to resolve practical problems such as insufficient feature extraction from known protein complexes, model selection, and model training. However, those methods cannot utilize the information in known protein complexes, and they neglect protein complexes with other topologies, such as the &#x2018;star&#x2019; and &#x2018;spoke&#x2019; modes. Generally, supervised learning-based methods first train a supervised learning model on extracted features and then use the trained model to search for new protein complexes.</p>
<p>Many standard protein complex datasets have been obtained in recent years. Therefore, several supervised learning-based methods that train regression or classification models have been proposed to discover protein complexes from PPI networks. For example, Qi et&#x20;al. (<xref ref-type="bibr" rid="B48">Qi et&#x20;al., 2008</xref>) proposed a framework to learn the parameters of a Bayesian network model for discovering protein complexes. Yu et&#x20;al. (<xref ref-type="bibr" rid="B67">Yu et&#x20;al., 2014</xref>) presented a supervised learning-based method to detect protein complexes, which used cliques as initial clusters and selected a trained linear regression model to form protein complexes. Shi et&#x20;al. (<xref ref-type="bibr" rid="B50">Shi et&#x20;al., 2011</xref>) proposed a semisupervised algorithm and trained a neural network model to detect protein complexes. ClusterEPs (<xref ref-type="bibr" rid="B36">Liu et&#x20;al., 2016</xref>) estimated the possibility of a subgraph being a protein complex by emerging patterns (EPs). Dong et&#x20;al. (<xref ref-type="bibr" rid="B8">Dong et&#x20;al., 2018</xref>) provided the ClusterSS method, which integrates a trained neural network model and a local cohesiveness function to guide the search strategy to identify protein complexes. Liu et&#x20;al. (<xref ref-type="bibr" rid="B37">Liu et&#x20;al., 2018</xref>) proposed a supervised learning method based on network embeddings and a random forest model for discovering protein complexes. Based on the decision tree, Sikandar et&#x20;al. (<xref ref-type="bibr" rid="B51">Sikandar et&#x20;al., 2018</xref>) presented a method using biological and topological information to detect protein complexes. Liu et&#x20;al. (<xref ref-type="bibr" rid="B34">Liu et&#x20;al., 2021</xref>) proposed a semisupervised model and a protein complex detection algorithm to identify significant protein complexes with clear module structures from PPI networks. Mei 
(<xref ref-type="bibr" rid="B39">Mei, 2022</xref>) proposed a computational method that combines supervised learning and dense subgraph discovery to predict protein complexes. On the one hand, the accuracy of these detection methods based on semisupervised learning or supervised learning is limited by the small training datasets. On the other hand, these methods train only a single type of learning model, so the models generalize poorly and their learning ability is limited.</p>
<p>Some existing studies show that graph neural network (GNN) methods can effectively learn graph structure and node features. For example, Kipf and Welling (<xref ref-type="bibr" rid="B27">Kipf and Welling, 2016</xref>) presented a scalable approach for semisupervised learning on graph-structured data. The proposed graph convolutional network (GCN) model is based on an efficient variant of convolutional neural networks. It can encode both graph structure and node features in a way useful for semisupervised classification. In 2021, Zaki et&#x20;al. (<xref ref-type="bibr" rid="B71">Zaki et&#x20;al., 2021</xref>) introduced various GCN approaches to improve the detection of protein complexes. Graph attention networks (GATs), which aggregate neighbor nodes through the attention mechanism, adaptively allocate weights to different neighbors, thus greatly improving the expressive ability of GNN models. He et&#x20;al. (<xref ref-type="bibr" rid="B20">He et&#x20;al., 2021b</xref>) proposed a class of learning-to-attend strategies, named conjoint attentions (CAs), to construct graph conjoint attention networks (CATs) for GNNs. CAs offer flexible incorporation of layerwise node features and structural interventions that can be learned outside the GNNs to compute appropriate weights for feature aggregation. In the future, we will study the detection of protein complexes in PPI networks using GATs.</p>
</sec>
</sec>
<sec id="s1-2">
<title>1.2 Observations and Contributions</title>
<p>The related work shows that assigning weights to interactions using network embedding and multiple sources of biological information can effectively improve the accuracy of detection methods. Meanwhile, some studies have shown that protein complexes have core-attachment structures. Therefore, ELF-DPC is based on the core-attachment structure. First, we constructed a weighted PPI network. Second, we proposed a protein complex core mining strategy to mine local protein complex cores, and we identified global protein complex cores using the CPredictor2.0 method, which endows ELF-DPC with both global and local search ability. Third, most current methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods can detect protein complexes with only one or a few topological structures and cannot fully learn the characteristics of known protein complexes. Supervised learning-based methods can learn the characteristics of known protein complexes and detect protein complexes with different topological structures. Still, current supervised learning-based methods train a single base model, and a single model generalizes poorly. Therefore, we propose an ensemble learning model, consisting of structural modularity and a trained voting regression model built on different types of base regression models, to detect protein complexes with different topological structures. Finally, we proposed a graph heuristic search strategy to extend each protein complex core into a protein complex. The results show that ELF-DPC outperforms 12 state-of-the-art methods. Furthermore, GO-based functional enrichment analysis showed that the protein complexes detected by ELF-DPC have higher biological relevance.</p>
<p>To summarize, we make the following contributions:<list list-type="simple">
<list-item>
<p>&#x2022; We introduce a protein complex core mining strategy based on the core-attachment structure and design a graph heuristic search strategy to search protein complexes.</p>
</list-item>
<list-item>
<p>&#x2022; We propose structural modularity to describe the inherent topological organization of protein complexes.</p>
</list-item>
<list-item>
<p>&#x2022; We present some new topological features and design an ensemble learning model by combining structural modularity and a voting regression model, which quantifies the likelihood that a cluster is a protein complex.</p>
</list-item>
<list-item>
<p>&#x2022; We present an ensemble learning framework to identify protein complexes, and it achieves better performance than other competing methods.</p>
</list-item>
</list>
</p>
<p>The rest of this study is organized as follows. The Materials and Methods section introduces the datasets, terminology, and methods. The Experiments and Results section describes the evaluation metrics and parameter selection and compares ELF-DPC with the competing methods. Finally, the Conclusion section provides a conclusion and future&#x20;work.</p>
</sec>
</sec>
<sec id="s2">
<title>2 Materials and Methods</title>
<sec id="s2-1">
<title>2.1 Datasets</title>
<sec id="s2-1-1">
<title>2.1.1&#x20;Protein-Protein Interaction Networks</title>
<p>In this paper, we used four PPI networks for the experiments, i.e.,&#x20;Gavin (<xref ref-type="bibr" rid="B12">Gavin et&#x20;al., 2006</xref>), Krogan core (<xref ref-type="bibr" rid="B28">Krogan et&#x20;al., 2006</xref>), DIP (<xref ref-type="bibr" rid="B63">Xenarios et&#x20;al., 2002</xref>), and MIPS (<xref ref-type="bibr" rid="B16">G&#xfc;ldener et&#x20;al., 2006</xref>). The detailed properties of these PPI networks are shown in <xref ref-type="table" rid="T1">Table&#x20;1</xref>. Self-interactions and duplicate interactions were removed.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>The detailed properties of the protein-protein interaction datasets.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Dataset</th>
<th align="left">Number of nodes</th>
<th align="center">Number of edges</th>
<th align="center">Density</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Gavin</td>
<td align="char" char=".">1,855</td>
<td align="center">7,669</td>
<td align="center">0.004&#x2009;459&#x2009;796&#x2009;985</td>
</tr>
<tr>
<td align="left">Krogan core</td>
<td align="char" char=".">2,674</td>
<td align="center">7,075</td>
<td align="center">0.001&#x2009;979&#x2009;684&#x2009;934</td>
</tr>
<tr>
<td align="left">DIP</td>
<td align="char" char=".">4,930</td>
<td align="center">17,201</td>
<td align="center">0.001&#x2009;415&#x2009;721&#x2009;912&#x2009;41</td>
</tr>
<tr>
<td align="left">MIPS</td>
<td align="char" char=".">4,553</td>
<td align="center">12,318</td>
<td align="center">0.001&#x2009;188&#x2009;694&#x2009;605&#x2009;27</td>
</tr>
</tbody>
</table>
</table-wrap>
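<p>The density values in Table&#x20;1 are consistent with the usual density of a simple undirected graph, 2&#x7c;<italic>E</italic>&#x7c;/(&#x7c;<italic>V</italic>&#x7c;(&#x7c;<italic>V</italic>&#x7c;&#x20;&#x2212;&#x20;1)). The following minimal sketch assumes that definition and an edge list of protein-name pairs. The function names and the toy proteins are illustrative only and are not taken from the ELF-DPC code.</p>
<preformat><![CDATA[
```python
def clean_edges(raw_pairs):
    """Drop self-interactions and duplicate undirected interactions."""
    edges = set()
    for u, v in raw_pairs:
        if u == v:
            continue  # self-interaction: skip
        # frozenset makes (u, v) and (v, u) the same undirected edge
        edges.add(frozenset((u, v)))
    return edges


def density(num_nodes, num_edges):
    """Density of a simple undirected graph: 2|E| / (|V| (|V| - 1))."""
    if num_nodes < 2:
        return 0.0
    return 2.0 * num_edges / (num_nodes * (num_nodes - 1))


# Toy input with hypothetical protein names (not real data).
raw = [("A", "B"), ("B", "A"), ("A", "A"), ("B", "C"), ("B", "C")]
edges = clean_edges(raw)
nodes = set().union(*edges)
toy_density = density(len(nodes), len(edges))

# Node and edge counts of the Gavin network as reported in Table 1.
gavin_density = density(1855, 7669)
```
]]></preformat>
<p>Under this assumption, the node and edge counts reported for the Gavin network reproduce the density listed in its row of Table&#x20;1 to within rounding.</p>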
</sec>
<sec id="s2-1-2">
<title>2.1.2 Standard Protein Complexes</title>
<p>We used two sets of standard protein complexes that were constructed in the literature (<xref ref-type="bibr" rid="B59">Wang et&#x20;al., 2020</xref>). Their properties are shown in <xref ref-type="table" rid="T2">Table&#x20;2</xref>. Standard protein complexes 1 consists of the known protein complexes from MIPS (<xref ref-type="bibr" rid="B41">Mewes et&#x20;al., 2004</xref>), SGD (<xref ref-type="bibr" rid="B21">Hong et&#x20;al., 2007</xref>), TAP06 (<xref ref-type="bibr" rid="B12">Gavin et&#x20;al., 2006</xref>), ALOY (<xref ref-type="bibr" rid="B2">Aloy et&#x20;al., 2004</xref>), CYC 2008 (<xref ref-type="bibr" rid="B47">Pu et&#x20;al., 2009</xref>), and NEWMIPS (<xref ref-type="bibr" rid="B11">Friedel et&#x20;al., 2009</xref>). Standard protein complexes 2 is also a combined protein complex dataset (<xref ref-type="bibr" rid="B38">Ma et&#x20;al., 2017</xref>); it consists of the Wodak database (<xref ref-type="bibr" rid="B47">Pu et&#x20;al., 2009</xref>), PINdb, and GO complexes (<xref ref-type="bibr" rid="B38">Ma et&#x20;al., 2017</xref>).</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>The properties of the standard protein complexes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Dataset</th>
<th align="left">Number of complexes</th>
<th align="left">Protein coverage</th>
<th align="left">Average size</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">standard protein complexes 1</td>
<td align="char" char=".">812</td>
<td align="char" char=".">2,773</td>
<td align="char" char=".">8.92</td>
</tr>
<tr>
<td align="left">standard protein complexes 2</td>
<td align="char" char=".">1,045</td>
<td align="char" char=".">2,778</td>
<td align="char" char=".">8.97</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s2-1-3">
<title>2.1.3 GO Annotation Data and Gene Expression Data</title>
<p>In this study, we used GO-slim data to describe the functional similarity of interacting proteins; the data are available at <ext-link ext-link-type="uri" xlink:href="https://downloads.yeastgenome.org">https://downloads.yeastgenome.org</ext-link>. The gene expression data were obtained from <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/sites/GDSbrowser">https://www.ncbi.nlm.nih.gov/sites/GDSbrowser</ext-link>, and the subcellular localization data were obtained from <ext-link ext-link-type="uri" xlink:href="https://compartments.jensenlab.org/Downloads">https://compartments.jensenlab.org/Downloads</ext-link>.</p>
<table-wrap id="alg1" position="float">
<label>Algorithm 1</label>
<caption>
<p>The framework of the ELF-DPC algorithm.</p>
</caption>
<table>
<tbody>
<tr>
<td>
<inline-graphic xlink:href="fgene-13-839949-fx1.tif"/>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s2-2">
<title>2.2 Terminologies</title>
<p>This section defines the terminology used in this paper. A PPI network is generally described as a weighted graph <italic>G</italic>&#x20;&#x3d; (<italic>V</italic>, <italic>E</italic>, <italic>W</italic>), where <italic>V</italic> is a set of proteins, <italic>E</italic> is a set of interactions, and <italic>W</italic> is an <italic>n</italic>&#x20;&#xd7; <italic>n</italic> (<italic>n</italic>&#x20;&#x3d; &#x7c;<italic>V</italic>&#x7c;) matrix that represents the reliability of the interactions between protein pairs in the PPI network. The set of direct interacting neighbors of node <italic>v</italic> is defined as <italic>N</italic>
<sub>
<italic>v</italic>
</sub> &#x3d; {<italic>u</italic>&#x7c;(<italic>u</italic>, <italic>v</italic>) &#x2208; <italic>E</italic>, <italic>u</italic>&#x20;&#x2208;&#x20;<italic>V</italic>}.</p>
</sec>
<sec id="s2-3">
<title>2.3 Methods</title>
<sec id="s2-3-1">
<title>2.3.1 The Framework of ELF-DPC Algorithm</title>
<p>This work proposes a novel ensemble learning framework for identifying protein complexes from PPI networks. A block diagram of the detection process is shown in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>The proposed ensemble framework for protein complex detection.</p>
</caption>
<graphic xlink:href="fgene-13-839949-g001.tif"/>
</fig>
<p>The framework of this method is outlined in Algorithm 1. The algorithm takes a PPI network as input and outputs a set of protein complexes. It consists of five main steps. The first step constructs a weighted PPI network by combining topological structure, gene expression data, GO annotation data, and subcellular location data in Line 2 (Constructing a weighted PPI network section). The second step applies a protein complex core mining strategy to identify protein complex cores in the PPI network in Line 3 (Mining protein complex cores section). The third step has two parts. First, we construct feature vectors that describe the properties of known and false protein complexes in the PPI network and train a voting regression model (Training a voting regression model section) to represent protein complexes based on supervised learning in Line 5. Second, we define a quality function, called structural modularity, to describe the structural modularity of protein complexes, and we combine the trained voting regression model with structural modularity to obtain an ensemble learning model in Line 6. In the fourth step, based on the ensemble learning model, we propose a graph heuristic search strategy (Forming protein complexes section) that extends each protein complex core to form protein complexes from the PPI network in Lines 7&#x2013;14. Finally, we remove redundant identified protein complexes in Line&#x20;15.</p>
</sec>
<sec id="s2-3-2">
<title>2.3.2 Constructing a Weighted PPI Network</title>
<p>Some studies have confirmed that the performance of protein complex detection can be markedly enhanced when edge weights are considered (<xref ref-type="bibr" rid="B25">Keretsu and Sarmah, 2016</xref>; <xref ref-type="bibr" rid="B31">Lei et&#x20;al., 2018</xref>). Meanwhile, integrating multiple data sources into a PPI network can strengthen its reliability (<xref ref-type="bibr" rid="B31">Lei et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B59">Wang et&#x20;al., 2020</xref>), which motivates us to assign weights to the interactions. Moreover, a protein complex consists of proteins and the interactions among them, and the proteins in the same protein complex tend to be coexpressed and to have similar functions and localizations. Thus, we integrate multiple sources of information, including gene expression data, protein localization data, and gene ontology data, to weight the interactions within the PPI networks.</p>
<sec id="s2-3-2-1">
<title>2.3.2.1 Protein Coexpression Similarity</title>
<p>Generally, for a pair of interacting proteins, their coexpression level can reflect the strength of their interaction. Coexpressed proteins may also have similar functions (<xref ref-type="bibr" rid="B9">Eisen et&#x20;al., 1998</xref>) and show stronger functional consistency (<xref ref-type="bibr" rid="B7">Chen and Xu, 2004</xref>). Some studies have shown that coexpressed protein pairs tend to interact within the same protein complexes (<xref ref-type="bibr" rid="B25">Keretsu and Sarmah, 2016</xref>). Furthermore, the Pearson correlation coefficient (PCC) has been used to estimate how strongly two interacting proteins are coexpressed (<xref ref-type="bibr" rid="B30">Lei et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B49">Shang et&#x20;al., 2016</xref>). For a pair of proteins <italic>X</italic> and <italic>Y</italic>, their gene expression profiles are <italic>X</italic>&#x20;&#x3d; {<italic>x</italic>
<sub>1</sub>, <italic>x</italic>
<sub>2</sub>, &#x2026; , <italic>x</italic>
<sub>
<italic>i</italic>
</sub>, &#x2026; , <italic>x</italic>
<sub>
<italic>m</italic>
</sub>} and <italic>Y</italic>&#x20;&#x3d; {<italic>y</italic>
<sub>1</sub>, <italic>y</italic>
<sub>2</sub>, &#x2026; , <italic>y</italic>
<sub>
<italic>i</italic>
</sub>, &#x2026; , <italic>y</italic>
<sub>
<italic>m</italic>
</sub>}, respectively. Their PCC is defined in <xref ref-type="disp-formula" rid="e1">Eq. 1</xref> (<xref ref-type="bibr" rid="B56">Wang et&#x20;al., 2013</xref>).<disp-formula id="e1">
<mml:math id="m1">
<mml:mi>P</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>C</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msqrt>
<mml:mo>&#xd7;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:math>
<label>(1)</label>
</disp-formula>where <inline-formula id="inf1">
<mml:math id="m2">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf2">
<mml:math id="m3">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> are the average gene expression levels of proteins <italic>X</italic> and <italic>Y</italic> over the <italic>m</italic> time points, respectively. The value of <italic>PCC</italic>(<italic>X</italic>, <italic>Y</italic>) ranges from &#x2212;1 to 1. For convenience, we replace <italic>PCC</italic>(<italic>X</italic>, <italic>Y</italic>) with (<italic>PCC</italic>(<italic>X</italic>, <italic>Y</italic>) &#x2b; 1)/2, which maps its value into the range [0, 1]. The higher the value of <italic>PCC</italic>(<italic>X</italic>, <italic>Y</italic>), the more strongly proteins <italic>X</italic> and <italic>Y</italic> are coexpressed and the more likely they are to belong to the same protein complex.</p>
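<p>As an illustration, Eq. 1 together with the rescaling to [0, 1] can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation: the function name <italic>rescaled_pcc</italic> is hypothetical, and returning 0.5 for a constant expression profile (where the PCC is undefined) is an assumption made only for this example.</p>
<preformat><![CDATA[
```python
import math


def rescaled_pcc(x, y):
    """Pearson correlation of two expression profiles (Eq. 1), mapped to [0, 1]."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = (
        math.sqrt(sum((a - mean_x) ** 2 for a in x))
        * math.sqrt(sum((b - mean_y) ** 2 for b in y))
    )
    if denominator == 0:
        # Assumption for this sketch: a constant profile has no defined PCC,
        # so it is treated as uncorrelated (PCC = 0, rescaled to 0.5).
        return 0.5
    return (numerator / denominator + 1) / 2
```
]]></preformat>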
</sec>
<sec id="s2-3-2-2">
<title>2.3.2.2 Protein Functional Similarity</title>
<p>From a functional standpoint, we use GO-slim data to reflect the functional similarity of proteins. The more GO-slim annotations a pair of proteins has in common, the more likely the two proteins are to have the same biological function, and the more reliable the interaction between them. Here, we let <italic>FS</italic>(<italic>X</italic>, <italic>Y</italic>) describe this relationship, which is defined in <xref ref-type="disp-formula" rid="e2">Eq. 2</xref>:<disp-formula id="e2">
<mml:math id="m4">
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="{" close="">
<mml:mrow>
<mml:mtable class="cases">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2229;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mi mathvariant="italic">min</mml:mi>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2a7e;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>w</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>e</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(2)</label>
</disp-formula>where &#x7c;<italic>FS</italic>(<italic>X</italic>)&#x7c; and &#x7c;<italic>FS</italic>(<italic>Y</italic>)&#x7c; represent the number of GO-slim annotations for proteins <italic>X</italic> and <italic>Y</italic>, respectively. &#x7c;<italic>FS</italic>(<italic>X</italic>) &#x2229; <italic>FS</italic>(<italic>Y</italic>)&#x7c; denotes the number of common GO-slim annotations for proteins <italic>X</italic> and&#x20;<italic>Y</italic>.</p>
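<p>Eq. 2 can be sketched in Python as follows. The function name <italic>functional_similarity</italic> and the representation of GO-slim annotations as sets of term identifiers are assumptions made for this example.</p>
<preformat><![CDATA[
```python
def functional_similarity(go_x, go_y):
    """Functional similarity of Eq. 2 from two collections of GO-slim terms."""
    go_x, go_y = set(go_x), set(go_y)
    smaller = min(len(go_x), len(go_y))
    if smaller == 0:
        # "otherwise" branch of Eq. 2: at least one protein has no annotation.
        return 0.0
    return len(go_x & go_y) / smaller
```
]]></preformat>
<p>For instance, two proteins annotated with {a, b} and {b, c, d} share one term, and the smaller annotation set has two terms, so the sketch returns 0.5.</p>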
</sec>
<sec id="s2-3-2-3">
<title>2.3.2.3 Protein Subcellular Location Similarity</title>
<p>Generally, the more subcellular locations two interacting proteins share, the more reliable the interaction between them. Here, the subcellular location similarity <italic>SL</italic>(<italic>X</italic>, <italic>Y</italic>) is defined in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref>:<disp-formula id="e3">
<mml:math id="m5">
<mml:mi>S</mml:mi>
<mml:mi>L</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>L</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2229;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>L</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>L</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>&#x2b;</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>L</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
<label>(3)</label>
</disp-formula>where &#x7c;<italic>SL</italic>(<italic>X</italic>)&#x7c; and &#x7c;<italic>SL</italic>(<italic>Y</italic>)&#x7c; denote the number of subcellular localizations of proteins <italic>X</italic> and <italic>Y</italic>, respectively. &#x7c;<italic>SL</italic>(<italic>X</italic>) &#x2229; <italic>SL</italic>(<italic>Y</italic>)&#x7c; represents the number of common subcellular localizations between proteins <italic>X</italic> and&#x20;<italic>Y</italic>.</p>
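<p>Eq. 3 can be sketched in Python as follows. The function name <italic>location_similarity</italic> is hypothetical, and returning 0 when neither protein has a recorded localization (where Eq. 3 would divide by zero) is an assumption of this sketch.</p>
<preformat><![CDATA[
```python
def location_similarity(loc_x, loc_y):
    """Subcellular location similarity of Eq. 3 for two collections of locations."""
    loc_x, loc_y = set(loc_x), set(loc_y)
    total = len(loc_x) + len(loc_y)
    if total == 0:
        # Assumption: Eq. 3 is undefined here, so no evidence is reported as 0.
        return 0.0
    return 2 * len(loc_x & loc_y) / total
```
]]></preformat>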
</sec>
<sec id="s2-3-2-4">
<title>2.3.2.4 Protein Topological Structure Similarity</title>
<p>Network embedding is a representation learning technique that automatically learns topological information about the nodes of a network. In this study, we use the network embedding method node2vec (<xref ref-type="bibr" rid="B15">Grover and Leskovec, 2016</xref>) to learn low-dimensional feature representations of the structural information of the proteins in a PPI network. For proteins <italic>X</italic> and <italic>Y</italic>, the learned representations are two vectors, <italic>F</italic>(<italic>X</italic>) and <italic>F</italic>(<italic>Y</italic>). Because the embedding vectors obtained by node2vec reflect the topological structure similarity among proteins, we use cosine similarity to measure the similarity of the vector representations of proteins <italic>X</italic> and <italic>Y</italic>, as defined in <xref ref-type="disp-formula" rid="e4">Eq. 4</xref>:<disp-formula id="e4">
<mml:math id="m6">
<mml:mi>T</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:msqrt>
<mml:mo>&#xd7;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:math>
<label>(4)</label>
</disp-formula>where <italic>F</italic>(<italic>X</italic>) &#x3d; (<italic>x</italic>
<sub>1</sub>, <italic>x</italic>
<sub>2</sub>, &#x2026; , <italic>x</italic>
<sub>
<italic>i</italic>
</sub>, &#x2026; , <italic>x</italic>
<sub>
<italic>n</italic>
</sub>) and <italic>F</italic>(<italic>Y</italic>) &#x3d; (<italic>y</italic>
<sub>1</sub>, <italic>y</italic>
<sub>2</sub>, &#x2026; , <italic>y</italic>
<sub>
<italic>i</italic>
</sub>, &#x2026; , <italic>y</italic>
<sub>
<italic>n</italic>
</sub>) are the <italic>n</italic>-dimensional embedding vectors of proteins <italic>X</italic> and <italic>Y</italic>, respectively. <italic>TSS</italic>(<italic>X</italic>, <italic>Y</italic>) indicates the topological structure similarity of two connected proteins, <italic>X</italic> and&#x20;<italic>Y</italic>.</p>
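<p>The cosine similarity of Eq. 4 can be sketched in Python as follows. The embedding vectors are assumed to be already computed (for example, by a node2vec implementation) and passed in as plain sequences of numbers; the function name and the zero-vector fallback are assumptions of this sketch.</p>
<preformat><![CDATA[
```python
import math


def cosine_similarity(fx, fy):
    """Cosine similarity (Eq. 4) of two embedding vectors of equal dimension."""
    if len(fx) != len(fy):
        raise ValueError("embedding vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(fx, fy))
    norm = math.sqrt(sum(a * a for a in fx)) * math.sqrt(sum(b * b for b in fy))
    if norm == 0:
        # Assumption: an all-zero embedding carries no structural information.
        return 0.0
    return dot / norm
```
]]></preformat>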
<p>For each edge, its weighted value <italic>W</italic>(<italic>X</italic>, <italic>Y</italic>) is expressed by <xref ref-type="disp-formula" rid="e5">Eq. 5</xref>:<disp-formula id="e5">
<mml:math id="m7">
<mml:mi>W</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>C</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>L</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>S</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:math>
<label>(5)</label>
</disp-formula>Edges whose weight is 0 are regarded as noise and are removed from the PPI network. In this way, we integrate topological structure similarity with biological information similarity, which can enhance the reliability of the PPI network, and obtain a weighted PPI network.</p>
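<p>The weighting step of Eq. 5, including the removal of zero-weight edges, can be sketched in Python as follows. The function names and the input layout (a dictionary mapping each edge to its four precomputed similarity values) are assumptions chosen for this example.</p>
<preformat><![CDATA[
```python
def edge_weight(pcc, fs, sl, tss):
    """Average of the four similarity measures, as in Eq. 5."""
    return (pcc + fs + sl + tss) / 4


def build_weighted_network(edges, scores):
    """Return {edge: weight}, dropping edges whose weight is 0.

    edges  -- iterable of (x, y) protein pairs
    scores -- dict mapping (x, y) to a tuple (pcc, fs, sl, tss)
    """
    weighted = {}
    for edge in edges:
        weight = edge_weight(*scores[edge])
        if weight != 0:  # zero-weight edges are treated as noise
            weighted[edge] = weight
    return weighted
```
]]></preformat>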
</sec>
</sec>
<sec id="s2-3-3">
<title>2.3.3 Mining Protein Complex Cores</title>
<p>As described in the Constructing a weighted PPI network section, each interaction is weighted using multiple biological properties and the network topology, so the higher the weight of an edge, the more likely it is that its two terminal proteins belong to the same protein complex (<xref ref-type="bibr" rid="B61">Wang et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B32">Li et&#x20;al., 2012</xref>). Furthermore, protein complex cores often correspond to dense subgraphs in PPI networks (<xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B57">Wang et&#x20;al., 2019</xref>). The pseudocode for mining protein complex cores is presented in <xref ref-type="statement" rid="Algorithm_2">Algorithm&#x20;2</xref>.</p>
<p>First, for the edge (<italic>v</italic>, <italic>u</italic>), its weight is <italic>w</italic>(<italic>v</italic>, <italic>u</italic>), and its neighborhood graph is denoted as <italic>NG</italic>(<italic>v</italic>, <italic>u</italic>) &#x3d; (<italic>V</italic>&#x2a;, <italic>E</italic>&#x2a;, <italic>W</italic>&#x2a;), where <italic>V</italic>&#x2a; &#x3d; <italic>N</italic>
<sub>
<italic>v</italic>
</sub> &#x222a; <italic>N</italic>
<sub>
<italic>u</italic>
</sub> &#x222a; {<italic>v</italic>, <italic>u</italic>}. Furthermore, the average weighted degree of <italic>NG</italic>(<italic>v</italic>, <italic>u</italic>) is denoted as <italic>AWD</italic>(<italic>NG</italic>(<italic>v</italic>, <italic>u</italic>)) (<xref ref-type="disp-formula" rid="e6">Eq. 6</xref>):<disp-formula id="e6">
<mml:math id="m8">
<mml:mi>A</mml:mi>
<mml:mi>W</mml:mi>
<mml:mi>D</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mi>G</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(6)</label>
</disp-formula>
</p>
<p>Based on the analysis above, we propose a score function (<xref ref-type="disp-formula" rid="e7">Eq. 7</xref>) that scores each edge by combining its weight <italic>w</italic>(<italic>v</italic>, <italic>u</italic>) with the average weighted degree of its neighborhood graph (<xref ref-type="disp-formula" rid="e6">Eq. 6</xref>); this score is used to select seed edges in Line 1. We then sort all edges in the PPI network in nonascending order of their score (see <xref ref-type="disp-formula" rid="e7">Eq. 7</xref>). Only edges whose score is greater than the mean score of all edges are queued into <italic>Q</italic>. The seed edges in <italic>Q</italic> are then used to mine protein complex cores in Line&#x20;2.</p>
<p>As a result, the score function of edge (<italic>v</italic>, <italic>u</italic>) is defined as <xref ref-type="disp-formula" rid="e7">Eq. 7</xref>:<disp-formula id="e7">
<mml:math id="m9">
<mml:mi>S</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>w</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>A</mml:mi>
<mml:mi>W</mml:mi>
<mml:mi>D</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mi>G</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(7)</label>
</disp-formula>
</p>
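<p>The seed selection described by Eqs 6 and 7 can be sketched in Python as follows. The graph is assumed to be stored as a symmetric dictionary of dictionaries, adj[v][u] = w(v, u), and the edge set of the neighborhood graph is assumed to be the set of edges induced by its nodes; both choices, as well as all function names, are assumptions of this sketch because Algorithm 2 is only available as an image.</p>
<preformat><![CDATA[
```python
def neighborhood_nodes(adj, v, u):
    """Node set V* of the neighborhood graph NG(v, u)."""
    return set(adj[v]) | set(adj[u]) | {v, u}


def average_weighted_degree(adj, nodes):
    """Eq. 6 on the subgraph induced by `nodes` (assumed edge set E*)."""
    total = 0.0
    for s in nodes:
        for t, w in adj[s].items():
            if t in nodes:
                total += w
    # Every induced edge is visited from both endpoints, so `total`
    # already equals 2 * (sum of induced edge weights).
    return total / len(nodes)


def edge_score(adj, v, u):
    """Eq. 7: edge weight times the AWD of its neighborhood graph."""
    return adj[v][u] * average_weighted_degree(adj, neighborhood_nodes(adj, v, u))


def seed_queue(adj):
    """Edges scoring above the mean score, in nonascending score order."""
    scores = {}
    for v in adj:
        for u in adj[v]:
            key = tuple(sorted((v, u)))
            if key not in scores:
                scores[key] = edge_score(adj, v, u)
    if not scores:
        return []
    mean = sum(scores.values()) / len(scores)
    kept = [edge for edge in scores if scores[edge] > mean]
    return sorted(kept, key=lambda edge: scores[edge], reverse=True)
```
]]></preformat>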
<p>For an edge (<italic>v</italic>, <italic>u</italic>) &#x2208; <italic>E</italic>, its edge clustering coefficient <italic>ECC</italic>(<italic>v</italic>, <italic>u</italic>) is defined as the number of triangles to which (<italic>v</italic>, <italic>u</italic>) belongs divided by the number of triangles that might potentially include (<italic>v</italic>, <italic>u</italic>), as shown in <xref ref-type="disp-formula" rid="e8">Eq. 8</xref>:<disp-formula id="e8">
<mml:math id="m10">
<mml:mi>E</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>C</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>Z</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi mathvariant="italic">deg</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi mathvariant="italic">deg</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(8)</label>
</disp-formula>where <italic>Z</italic>(<italic>v</italic>, <italic>u</italic>) denotes the number of triangles built on edge (<italic>v</italic>, <italic>u</italic>), and <italic>min</italic>(&#x7c;<italic>deg</italic>(<italic>v</italic>)&#x7c;, &#x7c;<italic>deg</italic>(<italic>u</italic>)&#x7c;) is the smaller of the degrees of the two terminal proteins.</p>
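<p>Eq. 8 can be sketched in Python as follows, using the denominator exactly as written in the equation. The adjacency structure (a dictionary mapping each protein to its set of neighbors) and the function name are assumptions of this sketch.</p>
<preformat><![CDATA[
```python
def edge_clustering_coefficient(adj, v, u):
    """Eq. 8: triangles on edge (v, u) divided by min(deg(v), deg(u))."""
    # Each common neighbor of v and u closes exactly one triangle on (v, u).
    triangles = len(set(adj[v]) & set(adj[u]))
    denominator = min(len(adj[v]), len(adj[u]))
    return triangles / denominator if denominator else 0.0
```
]]></preformat>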
<p>Initially, we select the edge with the highest weight as the first seed edge (<italic>v</italic>, <italic>u</italic>) and create a protein complex core in Line 6. A neighbor of the complex core is added to the core only if the weight of the connecting edge satisfies <italic>w</italic>(<italic>x</italic>, <italic>t</italic>) &#x2265; <italic>Avgedgesweight</italic> (<italic>Avgedgesweight</italic> is defined in <xref ref-type="disp-formula" rid="e9">Eq. 9</xref>) and <italic>ECC</italic>(<italic>x</italic>, <italic>t</italic>) is greater than the average edge clustering coefficient of all edges (<italic>AvgweightECC</italic>); both conditions reflect the closeness between the seed edge (<italic>v</italic>, <italic>u</italic>) and its neighbors (Lines 9&#x2013;17). These two constraints ensure that the proteins in the protein complex core are biologically related and closely connected in topological structure. The protein complex core is retained if it contains at least two proteins (Lines 18&#x2013;20). Meanwhile, the seed edge (including its two terminal proteins) is marked and cannot be used as the seed edge of another cluster (Lines 7 and 8). We then select the next highest-weight edge whose two terminal proteins are not included in previous seed edges and use it to form the next protein complex core; this process is repeated until the seed queue <italic>Q</italic> is empty (Lines 6&#x2013;22).<disp-formula id="e9">
<mml:math id="m11">
<mml:mi>A</mml:mi>
<mml:mi>v</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>w</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>t</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>E</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>V</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(9)</label>
</disp-formula>
</p>
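<p>The local core mining loop described above can be sketched in Python as follows. This is a simplified reading and not the authors' implementation: it expands each core by a single hop from the two seed proteins, takes the seed edges in the order given, and computes <italic>Avgedgesweight</italic> with the denominator &#x7c;<italic>V</italic>&#x7c; as printed in Eq. 9. All function names and the dictionary-of-dictionaries graph layout are assumptions, because Algorithm 2 is only available as an image.</p>
<preformat><![CDATA[
```python
def ecc(adj, v, u):
    """Edge clustering coefficient as written in Eq. 8."""
    denominator = min(len(adj[v]), len(adj[u]))
    return len(set(adj[v]) & set(adj[u])) / denominator if denominator else 0.0


def undirected_edges(adj):
    """Each undirected edge once, as a sorted (v, u) tuple."""
    return {tuple(sorted((v, u))) for v in adj for u in adj[v]}


def mine_cores(adj, seeds):
    """Grow one core per unused seed edge (one-hop expansion; assumption)."""
    edges = undirected_edges(adj)
    if not edges:
        return []
    # Eq. 9 as printed: sum of edge weights divided by the number of nodes.
    avg_weight = sum(adj[v][u] for v, u in edges) / len(adj)
    avg_ecc = sum(ecc(adj, v, u) for v, u in edges) / len(edges)
    used = set()   # proteins that already served as seed endpoints
    cores = []
    for v, u in seeds:
        if v in used or u in used:
            continue
        used.update((v, u))
        core = {v, u}
        for x in (v, u):
            for t, w in adj[x].items():
                if w >= avg_weight and ecc(adj, x, t) > avg_ecc:
                    core.add(t)
        if len(core) >= 2:
            cores.append(core)
    return cores
```
]]></preformat>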
<p>CPredictor2.0 (<xref ref-type="bibr" rid="B65">Xu et&#x20;al., 2017</xref>) is also employed to detect global protein complex cores. CPredictor2.0 detects protein complexes using the Markov clustering algorithm (MCL) and protein functional information: it first discovers clusters in each functional group using MCL and then merges clusters with high overlap. We use CPredictor2.0 to obtain global protein complex cores (<italic>CPrclusters</italic>) in Line 23. Next, we combine the local protein complex cores obtained by the graph heuristic search method with the global protein complex cores obtained by CPredictor2.0 in Line&#x20;24.</p>
<p>The protein complex cores identified by <xref ref-type="statement" rid="Algorithm_2">Algorithm 2</xref> may include redundant cores. For each group of redundant protein complex cores, we keep only one in the list of protein complex cores in Line&#x20;25.</p>
<p>
<statement content-type="algorithm" id="Algorithm_2">
<label>Algorithm 2</label>
<p>Mining protein complex&#x20;cores.</p>
<p>
<inline-graphic xlink:href="fgene-13-839949-fx2.tif"/>
</p>
</statement>
</p>
</sec>
<sec id="s2-3-4">
<title>2.3.4 Obtaining an Ensemble Learning Model</title>
<sec id="s2-3-4-1">
<title>2.3.4.1 Training a Voting Regression Model</title>
<p>The trained regression model is obtained in several steps. First, we collect the known protein complexes and construct a weighted PPI network based on <xref ref-type="disp-formula" rid="e5">Eq. 5</xref>. Second, we map these known protein complexes to the weighted and unweighted PPI networks to obtain mapped protein complexes. Third, we generate false protein complexes in the current weighted and unweighted PPI networks with the same size distribution as the mapped protein complexes, and then analyze the topological properties of the known and false protein complexes. Fourth, we extract and select topological features from the mapped and false protein complexes. Fifth, we choose an appropriate regression model and train it, which yields the trained regression model. The whole training routine is illustrated in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The procedure for training a regression&#x20;model.</p>
</caption>
<graphic xlink:href="fgene-13-839949-g002.tif"/>
</fig>
<p>Next, we describe how this study differs from, and contributes beyond, previous work. The known protein complexes are obtained from standard protein complex databases 1 and 2 (<xref ref-type="bibr" rid="B59">Wang et&#x20;al., 2020</xref>); they are important because they serve as the true protein complexes for training the model. Only protein complexes with at least three proteins are used. In machine learning, the quality of the training dataset is vital to model training. Previous methods generally construct false protein complexes by randomly selecting nodes in the graph. This has two disadvantages: it does not guarantee that the generated subgraphs are connected, and the generated subgraphs do not reflect the actual topology of subgraphs in PPI networks. Therefore, we propose a strategy for generating false protein complexes. First, standard protein complexes are mapped to the PPI networks. Note that some standard protein complexes cannot be mapped to the PPI networks, so the number of mapped protein complexes is generally smaller than the number of standard protein complexes. Second, we analyze the size distribution of the mapped protein complexes, so that the sizes of the generated false protein complexes follow the same power-law distribution. Third, according to this size distribution, we generate false protein complexes by randomly selecting local neighborhood subgraphs in the PPI networks. We keep only false protein complexes whose neighborhood affinity <italic>NA</italic>(<italic>A</italic>, <italic>B</italic>) (<xref ref-type="disp-formula" rid="e15">Eq. 15</xref>) with the known protein complexes is less than 0.2. Finally, the ratio of the number of false protein complexes to the number of mapped protein complexes is set to 5 to 1. For the selection of the parameter <italic>ratio</italic>, please see the parameter selection section.</p>
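<p>The false protein complex generating strategy can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors&#x2019; code: the function names, the random growth rule for a local neighborhood subgraph, the rejection of duplicate candidates, and the <italic>max_tries</italic> safeguard are all hypothetical choices made for illustration. Sizes are sampled from the empirical sizes of the mapped protein complexes, and the 0.2 threshold and 5-to-1 ratio follow the text.</p>
<preformat><![CDATA[
```python
import random


def neighborhood_affinity(a, b):
    """NA(A, B) = |A intersect B|^2 / (|A| * |B|), as in Eq. 15."""
    a, b = set(a), set(b)
    return len(a & b) ** 2 / (len(a) * len(b))


def random_neighborhood_subgraph(adj, size, rng):
    """Grow a connected node set of the requested size from a random start."""
    start = rng.choice(sorted(adj))
    members = {start}
    frontier = set(adj[start])
    while len(members) < size and frontier:
        nxt = rng.choice(sorted(frontier))
        members.add(nxt)
        frontier |= adj[nxt]
        frontier -= members
    return members if len(members) == size else None


def generate_false_complexes(adj, mapped, ratio=5, na_max=0.2,
                             seed=0, max_tries=10000):
    """Sample connected subgraphs that barely overlap the mapped complexes."""
    rng = random.Random(seed)
    sizes = [len(c) for c in mapped]  # empirical size distribution
    target = ratio * len(mapped)
    false_complexes, tries = [], 0
    while len(false_complexes) < target and tries < max_tries:
        tries += 1
        candidate = random_neighborhood_subgraph(adj, rng.choice(sizes), rng)
        if candidate is None or candidate in false_complexes:
            continue  # could not reach the size, or duplicate (assumption)
        if all(neighborhood_affinity(candidate, c) < na_max for c in mapped):
            false_complexes.append(candidate)
    return false_complexes
```
]]></preformat>
<p>Because every candidate is grown along existing edges, each generated subgraph is connected by construction, which addresses the first disadvantage of purely random node selection noted above.</p>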
<p>In this paper, both known and false protein complexes in the PPI networks are modeled as weighted and unweighted undirected graphs. The weight is calculated based on <xref ref-type="disp-formula" rid="e5">Eq. 5</xref>. Extracting and selecting appropriate features is essential for distinguishing true protein complexes from false ones. Previous supervised learning methods rely on finding cliques, triangles, rectangles, spokes, and star graphs to mine protein complexes in PPI networks. Simple topological features such as degree statistics, node size, and edge statistics can also be used. We therefore adopt some existing topological features for protein complex identification.</p>
<p>In addition, we propose some new topological features to describe the topological properties of protein complexes. In total, we use 65 topological features to represent protein complexes in the PPI networks. <xref ref-type="table" rid="T3">Table&#x20;3</xref> presents the list of topological features we used. Some topological features are extracted from both the unweighted and weighted PPI networks. The implementation details of these topological features are available at <ext-link ext-link-type="uri" xlink:href="https://github.com/RongquanWang/ELF-DPC/Methods/Feature_selection.py">https://github.com/RongquanWang/ELF-DPC/Methods/Feature_selection.py</ext-link>. Readers who identify other relevant and valid topological features are encouraged to use them to represent protein complexes further.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Topological features used to represent protein complexes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Num</th>
<th align="center">Feature name</th>
<th align="center">Num</th>
<th align="center">Feature name</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">1</td>
<td align="left">Graph entropy</td>
<td align="char" char=".">2</td>
<td align="left">Graph weight entropy</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">Node size</td>
<td align="char" char=".">4</td>
<td align="left">Edge size</td>
</tr>
<tr>
<td align="left">5</td>
<td align="left">Graph clustering coefficient</td>
<td align="char" char=".">6</td>
<td align="left">Maximum degree</td>
</tr>
<tr>
<td align="left">7</td>
<td align="left">Minimum degree</td>
<td align="char" char=".">8</td>
<td align="left">Mean degree</td>
</tr>
<tr>
<td align="left">9</td>
<td align="left">Median degree</td>
<td align="char" char=".">10</td>
<td align="left">Variance degree</td>
</tr>
<tr>
<td align="left">11</td>
<td align="left">Standard deviation degree</td>
<td align="char" char=".">12</td>
<td align="left">Maximum weight degree</td>
</tr>
<tr>
<td align="left">13</td>
<td align="left">Minimum weight degree</td>
<td align="char" char=".">14</td>
<td align="left">Average weight degree</td>
</tr>
<tr>
<td align="left">15</td>
<td align="left">Median weight degree</td>
<td align="char" char=".">16</td>
<td align="left">Standard weight degree</td>
</tr>
<tr>
<td align="left">17</td>
<td align="left">Graph density</td>
<td align="char" char=".">18</td>
<td align="left">Graph weight density</td>
</tr>
<tr>
<td align="left">19</td>
<td align="left">Edge mean weight</td>
<td align="char" char=".">20</td>
<td align="left">Edge median weight</td>
</tr>
<tr>
<td align="left">21</td>
<td align="left">Edge variance weight</td>
<td align="char" char=".">22</td>
<td align="left">Edge standard weight</td>
</tr>
<tr>
<td align="left">23</td>
<td align="left">Average shortest path length</td>
<td align="char" char=".">24</td>
<td align="left">Graph diameter</td>
</tr>
<tr>
<td align="left">25</td>
<td align="left">Maximum Clustering Coefficient</td>
<td align="char" char=".">26</td>
<td align="left">Minimum Clustering Coefficient</td>
</tr>
<tr>
<td align="left">27</td>
<td align="left">Mean Clustering Coefficient</td>
<td align="char" char=".">28</td>
<td align="left">Median Clustering Coefficient</td>
</tr>
<tr>
<td align="left">29</td>
<td align="left">Variance Clustering Coefficient</td>
<td align="char" char=".">30</td>
<td align="left">Graph conductance</td>
</tr>
<tr>
<td align="left">31</td>
<td align="left">Graph weight conductance</td>
<td align="char" char=".">32</td>
<td align="left">Modularity score</td>
</tr>
<tr>
<td align="left">33</td>
<td align="left">Weight modularity score</td>
<td align="char" char=".">34</td>
<td align="left">Average boundary edge weight</td>
</tr>
<tr>
<td align="left">35</td>
<td align="left">Average edge modularity</td>
<td align="char" char=".">36</td>
<td align="left">Average common neighbor</td>
</tr>
<tr>
<td align="left">37</td>
<td align="left">Standard common neighbor</td>
<td align="char" char=".">38</td>
<td align="left">Variance common neighbor</td>
</tr>
<tr>
<td align="left">39</td>
<td align="left">Minimum common neighbor</td>
<td align="char" char=".">40</td>
<td align="left">Median common neighbor</td>
</tr>
<tr>
<td align="left">41</td>
<td align="left">Maximum common neighbor</td>
<td align="char" char=".">42</td>
<td align="left">Mean topological feature</td>
</tr>
<tr>
<td align="left">43</td>
<td align="left">Median topological feature</td>
<td align="char" char=".">44</td>
<td align="left">Variance topological feature</td>
</tr>
<tr>
<td align="left">45</td>
<td align="left">Maximum topological feature</td>
<td align="char" char=".">46</td>
<td align="left">Minimum topological feature</td>
</tr>
<tr>
<td align="left">47</td>
<td align="left">Standard topological feature</td>
<td align="char" char=".">48</td>
<td align="left">Mean Degree correlation</td>
</tr>
<tr>
<td align="left">49</td>
<td align="left">Minimum Degree correlation</td>
<td align="char" char=".">50</td>
<td align="left">Variance Degree correlation</td>
</tr>
<tr>
<td align="left">51</td>
<td align="left">Maximum Degree correlation</td>
<td align="char" char=".">52</td>
<td align="left">Median Degree correlation</td>
</tr>
<tr>
<td align="left">53</td>
<td align="left">Community model</td>
<td align="char" char=".">54</td>
<td align="left">Weight community model</td>
</tr>
<tr>
<td align="left">55</td>
<td align="left">Topological Change 1</td>
<td align="char" char=".">56</td>
<td align="left">Topological Change 2</td>
</tr>
<tr>
<td align="left">57</td>
<td align="left">Topological Change 3</td>
<td align="char" char=".">58</td>
<td align="left">Topological Change 4</td>
</tr>
<tr>
<td align="left">59</td>
<td align="left">Topological Change 5</td>
<td align="char" char=".">60</td>
<td align="left">Topological Change 6</td>
</tr>
<tr>
<td align="left">61</td>
<td align="left">Topological Change 7</td>
<td align="char" char=".">62</td>
<td align="left">Topological Change 8</td>
</tr>
<tr>
<td align="left">63</td>
<td align="left">First Eigenvalues 1</td>
<td align="char" char=".">64</td>
<td align="left">First Eigenvalues 2</td>
</tr>
<tr>
<td align="left">65</td>
<td align="left">First Eigenvalues 3</td>
<td align="left"/>
<td align="left"/>
</tr>
</tbody>
</table>
</table-wrap>
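<p>To illustrate how a few of the simpler features in <xref ref-type="table" rid="T3">Table&#x20;3</xref> can be computed for one candidate subgraph, a minimal standard-library sketch is given below. The function name, the dictionary keys, and the edge representation are hypothetical, and the definitions are the usual textbook ones, which may differ in detail from those in the repository file cited above.</p>
<preformat><![CDATA[
```python
from itertools import combinations
from statistics import mean, median


def subgraph_features(nodes, weights):
    """Compute a handful of Table 3 style features for a non-empty node set.

    weights: dict mapping frozenset({v, u}) to the edge weight
    (hypothetical representation of the weighted PPI network).
    """
    nodes = set(nodes)
    # Keep only edges with both endpoints inside the candidate subgraph.
    edges = {e: w for e, w in weights.items() if e <= nodes}
    adj = {v: set() for v in nodes}
    for e in edges:
        v, u = tuple(e)
        adj[v].add(u)
        adj[u].add(v)

    n, m = len(nodes), len(edges)
    degrees = [len(adj[v]) for v in nodes]

    def local_clustering(v):
        # Fraction of neighbor pairs of v that are themselves connected.
        k = len(adj[v])
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    return {
        "node_size": n,
        "edge_size": m,
        "graph_density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "max_degree": max(degrees),
        "min_degree": min(degrees),
        "mean_degree": mean(degrees),
        "median_degree": median(degrees),
        "edge_mean_weight": mean(edges.values()) if edges else 0.0,
        "mean_clustering_coefficient": mean(local_clustering(v) for v in nodes),
    }
```
]]></preformat>
<p>Each mapped or false protein complex would be converted into one such feature vector before model training.</p>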
<p>Ensemble learning combines multiple individual learners using certain strategies to form a learning committee, so that the overall generalization performance improves. In general, an ensemble learner generalizes better than a single learner. However, according to the barrel theory, an ensemble is constrained by its weakest members, so we focus on two major criteria, accuracy and diversity:<list list-type="simple">
<list-item>
<p>&#x2022; Accuracy: Each individual learner must be reasonably accurate.</p>
</list-item>
<list-item>
<p>&#x2022; Diversity: The output of individual learners should be different from each&#x20;other.</p>
</list-item>
</list>
</p>
<p>Therefore, producing and combining &#x201c;good but different&#x201d; individual learners is the core of ensemble learning. The VotingRegressor model is an efficient ensemble learning technique for reducing variance and improving detection accuracy. In this paper, we use a VotingRegressor model built on several base models for training. A VotingRegressor is an ensemble meta-estimator that fits several base estimators and averages their individual predictions to form a final prediction. Here, LinearRegression, BayesianRidge, DecisionTreeRegressor, and svm.SVR (kernel &#x3d; &#x201c;linear&#x201d;) are used as the base estimators to build the VotingRegressor model. We select the VotingRegressor model because it reduces the variance of the individual base estimators, generalizes better, and is more robust than a single estimator. In this study, the VotingRegressor model and its base estimators use default parameters, apart from the linear kernel specified for SVR. These models are freely available in the scikit-learn machine learning library (<xref ref-type="bibr" rid="B44">Pedregosa et&#x20;al., 2011</xref>) and are documented at <ext-link ext-link-type="uri" xlink:href="https://scikit-learn.org/stable/supervised_learning.html&#x23;supervised-learning">https://scikit-learn.org/stable/supervised_learning.html&#x23;supervised-learning</ext-link>.</p>
<p>As a result, a trained VotingRegressor model can be used, from a supervised learning perspective, to estimate the probability that a subgraph is a real protein complex, which allows protein complexes with various topological structures to be detected. A higher VotingRegressor score indicates a higher probability that the subgraph is an actual protein complex. The VotingRegressor is defined as <xref ref-type="disp-formula" rid="e10a">Eq. 10a</xref> and <xref ref-type="disp-formula" rid="e10b">Eq. 10b</xref>:<disp-formula id="e10a">
<mml:math id="m12">
<mml:mtable class="aligned">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mi>L</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>L</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>R</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mi>B</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>R</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mi>D</mml:mi>
<mml:mi>T</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>T</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>R</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mi>S</mml:mi>
<mml:mi>V</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>V</mml:mi>
<mml:mi>M</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>V</mml:mi>
<mml:mi>R</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo>&#x3d;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
<label>(10a)</label>
</disp-formula>
<disp-formula id="e10b">
<mml:math id="m13">
<mml:mi>V</mml:mi>
<mml:mi>R</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>V</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>R</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>r</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>T</mml:mi>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mi>V</mml:mi>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(10b)</label>
</disp-formula>
</p>
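<p>The averaging step performed by a voting regressor can be illustrated with the standard-library sketch below. It is not scikit-learn&#x2019;s implementation and does not use the four base estimators of <xref ref-type="disp-formula" rid="e10a">Eq. 10a</xref>: the two toy learners (a mean predictor and a one-feature least-squares line) and all class names are hypothetical stand-ins, chosen only to show how predictions from fitted base estimators are averaged into one score.</p>
<preformat><![CDATA[
```python
from statistics import mean


class MeanRegressor:
    """Toy base learner: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class SimpleLinearRegressor:
    """Toy base learner: ordinary least squares on the first feature only."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        x_bar, y_bar = mean(xs), mean(y)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def predict(self, X):
        return [self.intercept_ + self.slope_ * row[0] for row in X]


class AveragingEnsemble:
    """Fits every base estimator and averages their predictions."""

    def __init__(self, estimators):
        self.estimators = estimators  # list of (name, estimator) pairs

    def fit(self, X, y):
        for _, estimator in self.estimators:
            estimator.fit(X, y)
        return self

    def predict(self, X):
        per_model = [est.predict(X) for _, est in self.estimators]
        # Unweighted mean across base estimators for each sample.
        return [mean(column) for column in zip(*per_model)]
```
]]></preformat>
<p>In the setting of this paper, each row of <italic>X</italic> would be the feature vector of a mapped or false protein complex, and the averaged prediction plays the role of <italic>VR</italic>(<italic>C</italic>).</p>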
</sec>
<sec id="s2-3-4-2">
<title>2.3.4.2 The Structural Modularity of Protein Complexes</title>
<p>Based on the within-module and between-module edges of subgraphs and the size of the subgraph, we present a new formal definition of protein complexes in PPI networks (<xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B69">Yu et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B42">Nepusz et&#x20;al., 2012</xref>; <xref ref-type="bibr" rid="B57">Wang et&#x20;al., 2019</xref>). Given the new module definition, an effective method of quantitative measurement is introduced to estimate the likelihood of a cluster <italic>C</italic>&#x20;&#x3d; (<italic>V</italic>
<sub>
<italic>C</italic>
</sub>, <italic>E</italic>
<sub>
<italic>C</italic>
</sub>, <italic>W</italic>
<sub>
<italic>C</italic>
</sub>) being a protein complex in the PPI network. We introduce a structural modularity (SM) model to estimate the likelihood of a cluster <italic>C</italic>&#x20;&#x3d; (<italic>V</italic>
<sub>
<italic>C</italic>
</sub>, <italic>E</italic>
<sub>
<italic>C</italic>
</sub>, <italic>W</italic>
<sub>
<italic>C</italic>
</sub>) being a protein complex, which can detect both dense and sparse protein complexes in PPI networks. Structural modularity (SM) combines <italic>Cohesion</italic>(<italic>C</italic>) and <italic>Coupling</italic>(<italic>C</italic>), where <italic>Cohesion</italic>(<italic>C</italic>) is defined as <xref ref-type="disp-formula" rid="e11">Eq. 11</xref> and <italic>Coupling</italic>(<italic>C</italic>) is defined as <xref ref-type="disp-formula" rid="e12">Eq. 12</xref>.<disp-formula id="e11">
<mml:math id="m14">
<mml:mi>C</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>q</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>t</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<label>(11)</label>
</disp-formula>where <inline-formula id="inf3">
<mml:math id="m15">
<mml:msub>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> denotes the total weight of the internal edges contained entirely in cluster <italic>C</italic>, and &#x7c;<italic>C</italic>&#x7c; is the number of nodes in cluster <italic>C</italic>. <italic>Cohesion</italic>(<italic>C</italic>) measures how densely the nodes of a candidate protein complex are connected to one another. It equals the density of cluster <italic>C</italic> multiplied by the square root of the size of <italic>C</italic>, and it is used to quantify the likelihood that a cluster is a protein complex. Because protein complexes in PPI networks are often relatively sparse, this size-adjusted density may be a more appropriate quality function than density alone.<disp-formula id="e12">
<mml:math id="m16">
<mml:mi>C</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<label>(12)</label>
</disp-formula>where <italic>W</italic>
<sub>
<italic>out</italic>
</sub>(<italic>C</italic>) &#x3d; <italic>&#x2211;</italic>
<sub>
<italic>v</italic>&#x2208;<italic>C</italic>,<italic>u</italic>&#x2209;<italic>C</italic>
</sub>
<italic>w</italic>(<italic>v</italic>, <italic>u</italic>) represents the total weight of the boundary edges that connect cluster <italic>C</italic> with the rest of the PPI network; a low value indicates that cluster <italic>C</italic> is only sparsely connected to its neighboring&#x20;nodes.</p>
<p>Finally, Structural Modularity (SM) is calculated as <xref ref-type="disp-formula" rid="e13">Eq. 13</xref>:<disp-formula id="e13">
<mml:math id="m17">
<mml:mi>S</mml:mi>
<mml:mi>M</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>g</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
</mml:math>
<label>(13)</label>
</disp-formula>
</p>
<p>In this work, a protein complex is assigned a higher value of <italic>SM</italic>(<italic>C</italic>) when it has a high size-adjusted density (cohesion) and is well separated from the rest of the network. <italic>SM</italic>(<italic>C</italic>) can therefore identify protein complexes that exhibit both cohesion and separation. This reflects the observation that proteins in a protein complex have dense and frequent connections within the complex and weak, rare connections to proteins outside it.</p>
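<p>As a worked illustration of <xref ref-type="disp-formula" rid="e11">Eq. 11</xref>, <xref ref-type="disp-formula" rid="e12">Eq. 12</xref>, and <xref ref-type="disp-formula" rid="e13">Eq. 13</xref>, the following sketch computes the three quantities for a cluster. The function names, the edge representation, and the zero-value conventions for degenerate clusters (fewer than two nodes, or no incident edges) are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
from math import sqrt


def cohesion(cluster, weights):
    """Eq. 11: 2 * W_in(C) / (sqrt(|C|) * (|C| - 1))."""
    cluster = set(cluster)
    n = len(cluster)
    if n < 2:
        return 0.0  # assumption: undefined cases score zero
    # W_in: total weight of edges with both endpoints inside the cluster.
    w_in = sum(w for e, w in weights.items() if e <= cluster)
    return 2 * w_in / (sqrt(n) * (n - 1))


def coupling(cluster, weights):
    """Eq. 12: W_out(C) / |C|."""
    cluster = set(cluster)
    if not cluster:
        return 0.0
    # W_out: total weight of edges with exactly one endpoint in the cluster.
    w_out = sum(w for e, w in weights.items() if len(e & cluster) == 1)
    return w_out / len(cluster)


def structural_modularity(cluster, weights):
    """Eq. 13: Cohesion(C) / (Cohesion(C) + Coupling(C))."""
    coh = cohesion(cluster, weights)
    cou = coupling(cluster, weights)
    total = coh + cou
    return coh / total if total else 0.0
```
]]></preformat>
<p>For a triangle with unit edge weights and a single boundary edge of weight 0.5, <italic>Cohesion</italic> is 6/(2&#x221a;3) &#x3d; &#x221a;3, <italic>Coupling</italic> is 0.5/3, and <italic>SM</italic> is therefore close to 1, as expected for a dense and well-separated cluster.</p>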
</sec>
<sec id="s2-3-4-3">
<title>2.3.4.3 Building an Ensemble Learning Model</title>
<p>In this paper, we propose an ensemble learning model that combines the VotingRegressor model and structural modularity (SM) to quantify the likelihood of a cluster <italic>C</italic>&#x20;&#x3d; (<italic>V</italic>
<sub>
<italic>C</italic>
</sub>, <italic>E</italic>
<sub>
<italic>C</italic>
</sub>, <italic>W</italic>
<sub>
<italic>C</italic>
</sub>) being a candidate protein complex and thereby guide the protein complex identification process. An ensemble learning model can improve the robustness and stability of the clustering by combining the outputs of several models, thus improving the overall accuracy. For a cluster <italic>C</italic>, its ensemble learning model is defined as <xref ref-type="disp-formula" rid="e14">Eq. 14</xref>:<disp-formula id="e14">
<mml:math id="m18">
<mml:mi>F</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>s</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>V</mml:mi>
<mml:mi>R</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>M</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(14)</label>
</disp-formula>
</p>
<p>In the next section, we introduce a graph heuristic search strategy that uses this ensemble learning model to form protein complexes.</p>
</sec>
</sec>
<sec id="s2-3-5">
<title>2.3.5 Forming Protein Complexes</title>
<p>A protein complex consists of a protein complex core and its attachment proteins. Having obtained the protein complex cores, we next extract the attachment proteins of each core and select the reliable attachments that cooperate with it to form a protein complex. To this end, we design a graph heuristic search strategy that extends each protein complex core into a whole protein complex. The search starts from a protein complex core, iteratively inserts neighboring proteins into the core, and then removes proteins from the core, in order to find a locally optimal cluster. The basic idea is that each protein complex core is iteratively extended and corrected to form a protein complex by maximizing the score of the ensemble learning model (see Section 2.3.4, Obtaining an Ensemble Learning Model).</p>
<p>The pseudocode of the graph heuristic search strategy is shown in <xref ref-type="statement" rid="Algorithm_3">Algorithm 3</xref>, which consists of the following steps:<list list-type="simple">
<list-item>
<p>i Input a protein complex&#x20;core.</p>
</list-item>
<list-item>
<p>ii Adding outer boundary proteins (Lines 3&#x2013;12): First, we construct the outer boundary protein set of the current protein complex core. We obtain all proteins directly connected to the current core and rank them by the number of proteins shared between each neighbor protein's own neighbors and the current core. To retain only high-quality candidates, we discard neighboring proteins with fewer than two shared proteins and then keep only the top-ranked half of the remaining neighbors as the outer boundary protein set (Line 3). Second, we calculate the ensemble learning model score of the current core when each outer boundary protein is temporarily added. The outer boundary protein that maximizes the score is inserted into the protein complex core (Lines 5&#x2013;11). This process is repeated until the ensemble learning model score of the core no longer increases or the outer boundary protein set is empty (Lines 10 and&#x20;4).</p>
</list-item>
<list-item>
<p>iii Removing inner boundary proteins (Lines 16&#x2013;23): First, the inner boundary proteins of the current protein complex core are the proteins that belong to the core and are connected to at least one protein outside it in the PPI network (Line 16). Second, we calculate the score of the ensemble learning model after each inner boundary protein is temporarily removed from the core. The inner boundary protein whose removal increases the score is eliminated from the protein complex core (Lines 19&#x2013;21). This process continues until the ensemble learning model score of the core reaches a maximum, the inner boundary protein set is empty, or the current core contains no more than two proteins (Lines 22&#x2013;23 and&#x20;17).</p>
</list-item>
<list-item>
<p>iv We repeat steps ii and iii until the protein complex core no longer changes or its <italic>Fitness</italic>(<italic>SG</italic>) no longer increases (Lines 27&#x2013;30). The current protein complex core is then considered a locally optimal cluster (Lines 2&#x2013;31) and is output as a detected protein complex (Line&#x20;32).</p>
</list-item>
</list>
</p>
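<p>Steps i to iv can be sketched as the greedy add-and-remove loop below. This is a hedged illustration, not a transcription of <xref ref-type="statement" rid="Algorithm_3">Algorithm 3</xref>: the function names are hypothetical, rounding the &#x201c;half&#x201d; of the ranked neighbors upward is an assumption, and <italic>edge_ratio_fitness</italic> is a toy stand-in for <italic>Fitness</italic>(<italic>C</italic>) &#x3d; <italic>VR</italic>(<italic>C</italic>) &#xd7; <italic>SM</italic>(<italic>C</italic>) so that the example runs without a trained model.</p>
<preformat><![CDATA[
```python
def outer_boundary(core, adj, min_shared=2):
    """Rank neighbors by links into the core and keep the top half."""
    shared = {}
    for v in core:
        for t in adj[v]:
            if t not in core:
                shared[t] = len(adj[t] & core)
    kept = [t for t, s in shared.items() if s >= min_shared]
    kept.sort(key=lambda t: (-shared[t], str(t)))
    return kept[: (len(kept) + 1) // 2]  # assumption: round the half up


def expand_core(core, adj, fitness, min_size=2):
    """Alternate greedy insertion and removal until the score stops rising."""
    cluster = set(core)
    best = fitness(cluster)
    changed = True
    while changed:
        changed = False
        # Step ii: insert the outer boundary protein that maximizes the score.
        while True:
            scored = [(fitness(cluster | {t}), str(t), t)
                      for t in outer_boundary(cluster, adj)]
            if not scored:
                break
            score, _, t = max(scored)
            if score <= best:
                break
            cluster.add(t)
            best, changed = score, True
        # Step iii: remove an inner boundary protein if that raises the score.
        while len(cluster) > min_size:
            inner = [v for v in cluster if adj[v] - cluster]
            scored = [(fitness(cluster - {v}), str(v), v) for v in inner]
            if not scored:
                break
            score, _, v = max(scored)
            if score <= best:
                break
            cluster.remove(v)
            best, changed = score, True
    return cluster


def edge_ratio_fitness(adj):
    """Toy stand-in for Fitness(C): internal / (internal + boundary) edges."""
    def fitness(cluster):
        inside = sum(1 for v in cluster for t in adj[v] if t in cluster) / 2
        outside = sum(1 for v in cluster for t in adj[v] if t not in cluster)
        total = inside + outside
        return inside / total if total else 0.0
    return fitness
```
]]></preformat>
<p>Because every accepted insertion or removal strictly increases the score and the node set is finite, the loop terminates at a locally optimal cluster.</p>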
<p>Finally, we select the next protein complex core. Then we repeat this process using a graph heuristic search strategy (<xref ref-type="statement" rid="Algorithm_3">Algorithm 3</xref>) to extend the next protein complex core to form a protein complex until no seed edges remain. In the last step of the algorithm, some redundant protein complexes and protein complexes containing fewer than three proteins are discarded.</p>
<p>
<statement content-type="algorithm" id="Algorithm_3">
<label>Algorithm 3</label>
<p>A graph heuristic search strategy</p>
<p>
<inline-graphic xlink:href="fgene-13-839949-fx3.tif"/>
</p>
</statement>
</p>
</sec>
</sec>
</sec>
<sec id="s3">
<title>3 Experiments and Results</title>
<p>ELF-DPC was implemented in Python 3 and was successfully executed on a PC with an Intel i7-4790 CPU @3.60 GHz and 80&#xa0;GB&#x20;RAM.</p>
<sec id="s3-1">
<title>3.1 Evaluation Metrics</title>
<p>To evaluate the proposed method, we compare its performance with that of the competing methods using several statistical metrics. For this purpose, we use the neighborhood affinity, F-measure, CR, ACC, MMR, and Jaccard criteria to evaluate the protein complex detection algorithms. Let <italic>S</italic> denote the set of known protein complexes and <italic>D</italic> denote the set of protein complexes identified by a detection method.</p>
<sec id="s3-1-1">
<title>3.1.1 Neighborhood Affinity</title>
<p>
<italic>S</italic>
<sub>
<italic>i</italic>
</sub> is a standard protein complex in <italic>S</italic>, and <italic>D</italic>
<sub>
<italic>j</italic>
</sub> is a discovered protein complex in <italic>D</italic>. Their neighborhood affinity score (<italic>NA</italic>(<italic>S</italic>
<sub>
<italic>i</italic>
</sub>, <italic>D</italic>
<sub>
<italic>j</italic>
</sub>)) (<xref ref-type="bibr" rid="B5">Brohee and Van Helden, 2006</xref>) describes the similarity of the two protein complexes <italic>S</italic>
<sub>
<italic>i</italic>
</sub> and <italic>D</italic>
<sub>
<italic>j</italic>
</sub>, and it is defined as <xref ref-type="disp-formula" rid="e15">Eq. 15</xref>:<disp-formula id="e15">
<mml:math id="m19">
<mml:mi>N</mml:mi>
<mml:mi>A</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2229;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mo>&#xd7;</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(15)</label>
</disp-formula>
</p>
<p>Generally, if <italic>NA</italic>(<italic>S</italic>
<sub>
<italic>i</italic>
</sub>, <italic>D</italic>
<sub>
<italic>j</italic>
</sub>) is larger than or equal to 0.2, protein complexes <italic>S</italic>
<sub>
<italic>i</italic>
</sub> and <italic>D</italic>
<sub>
<italic>j</italic>
</sub> are regarded as matching protein complexes (<xref ref-type="bibr" rid="B33">Li et&#x20;al., 2010</xref>).</p>
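<p>As a minimal illustration of Eq. 15 and the matching rule, the following Python sketch represents each protein complex as a set of protein identifiers. The function names <italic>neighborhood_affinity</italic> and <italic>is_match</italic> are illustrative and are not taken from the ELF-DPC code.</p>
<preformat>
```python
def neighborhood_affinity(s, d):
    """NA(S_i, D_j): squared overlap divided by the product of the sizes."""
    s, d = set(s), set(d)
    if not s or not d:
        return 0.0  # convention chosen here for empty complexes
    shared = len(s.intersection(d))
    return shared * shared / (len(s) * len(d))


def is_match(s, d, omega=0.2):
    """Two complexes match when their NA reaches the threshold omega."""
    return neighborhood_affinity(s, d) >= omega


standard = {"A", "B", "C", "D"}
detected = {"C", "D", "E"}
# Two shared proteins: NA = 2^2 / (4 * 3) = 1/3, which exceeds 0.2.
print(neighborhood_affinity(standard, detected), is_match(standard, detected))
```
</preformat>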
</sec>
<sec id="s3-1-2">
<title>3.1.2&#x20;F-Measure</title>
<p>Let <italic>N</italic>
<sub>
<italic>sm</italic>
</sub> be the number of standard protein complexes that match at least one detected protein complex, i.e.,&#x20;<italic>N</italic>
<sub>
<italic>sm</italic>
</sub> &#x3d; &#x7c;{<italic>s</italic>&#x7c;<italic>s</italic>&#x20;&#x2208; <italic>S</italic>, <italic>&#x2203;d</italic> &#x2208; <italic>D</italic>, <italic>NA</italic>(<italic>s</italic>, <italic>d</italic>) &#x2265; <italic>&#x3c9;</italic>}&#x7c; and <italic>N</italic>
<sub>
<italic>im</italic>
</sub> be the number of detected protein complexes that match at least one standard protein complex, i.e.,&#x20;<italic>N</italic>
<sub>
<italic>im</italic>
</sub> &#x3d; &#x7c;{<italic>d</italic>&#x7c;<italic>d</italic>&#x20;&#x2208; <italic>D</italic>, <italic>&#x2203;s</italic> &#x2208; <italic>S</italic>, <italic>NA</italic>(<italic>d</italic>, <italic>s</italic>) &#x2265; <italic>&#x3c9;</italic>}&#x7c;, where <italic>&#x3c9;</italic> is a predefined threshold and is usually 0.20. Recall and precision are defined as <inline-formula id="inf4">
<mml:math id="m20">
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>l</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula> and <inline-formula id="inf5">
<mml:math id="m21">
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>, respectively. Finally, the F-measure is the harmonic mean of precision and recall and is defined by <xref ref-type="disp-formula" rid="e16">Eq. 16</xref>:<disp-formula id="e16">
<mml:math id="m22">
<mml:mi>F</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(16)</label>
</disp-formula>
</p>
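<p>The definitions of recall, precision, and Eq. 16 can be sketched in Python as follows. This is an illustrative sketch that assumes non-empty complexes given as sets; the names <italic>na</italic> and <italic>f_measure</italic> are hypothetical.</p>
<preformat>
```python
def na(s, d):
    """Neighborhood affinity of two non-empty complexes (Eq. 15)."""
    shared = len(set(s).intersection(d))
    return shared * shared / (len(s) * len(d))


def f_measure(standard, detected, omega=0.2):
    """Return (precision, recall, F-measure) for two lists of complexes."""
    # N_sm: standard complexes matched by at least one detected complex.
    n_sm = sum(1 for s in standard if any(na(s, d) >= omega for d in detected))
    # N_im: detected complexes matched by at least one standard complex.
    n_im = sum(1 for d in detected if any(na(d, s) >= omega for s in standard))
    recall = n_sm / len(standard) if standard else 0.0
    precision = n_im / len(detected) if detected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


standard = [{"A", "B", "C"}, {"D", "E", "F"}, {"G", "H"}]
detected = [{"A", "B"}, {"X", "Y", "Z"}]
# Only the first standard and first detected complex match (NA = 4/6),
# so recall = 1/3, precision = 1/2, and F-measure = 0.4.
print(f_measure(standard, detected))
```
</preformat>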
</sec>
<sec id="s3-1-3">
<title>3.1.3 ACC</title>
<p>Let <italic>T</italic>
<sub>
<italic>ij</italic>
</sub> be the number of proteins that are included in both standard protein complex <italic>S</italic>
<sub>
<italic>i</italic>
</sub> and detected protein complex <italic>D</italic>
<sub>
<italic>j</italic>
</sub>, and let <italic>N</italic>
<sub>
<italic>i</italic>
</sub> be the number of proteins in standard protein complex <italic>S</italic><sub><italic>i</italic></sub>. The sensitivity (Sn) and positive predictive value (PPV) are calculated by <inline-formula id="inf6">
<mml:math id="m23">
<mml:mi>S</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula> and <inline-formula id="inf7">
<mml:math id="m24">
<mml:mi>P</mml:mi>
<mml:mi>P</mml:mi>
<mml:mi>V</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>D</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>, respectively. The accuracy (ACC) is then the geometric mean of Sn and PPV, defined by <xref ref-type="disp-formula" rid="e17">Eq. 17</xref>:<disp-formula id="e17">
<mml:math id="m25">
<mml:mtable class="matrix">
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mi>A</mml:mi>
<mml:mi>C</mml:mi>
<mml:mi>C</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>P</mml:mi>
<mml:mi>P</mml:mi>
<mml:mi>V</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
<label>(17)</label>
</disp-formula>
</p>
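<p>A small Python sketch of Sn, PPV, and Eq. 17 is given below for illustration. It assumes that both lists of complexes are non-empty and that complexes are sets of protein identifiers; the function name <italic>acc</italic> is hypothetical.</p>
<preformat>
```python
import math


def acc(standard, detected):
    """Return (Sn, PPV, ACC) from the overlap matrix T of Section 3.1.3."""
    # t[i][j] = number of proteins shared by standard i and detected j.
    t = [[len(set(s).intersection(d)) for d in detected] for s in standard]
    sn_num = sum(max(row) for row in t)          # best overlap per standard
    sn_den = sum(len(s) for s in standard)       # sum of N_i
    ppv_num = sum(max(row[j] for row in t) for j in range(len(detected)))
    ppv_den = sum(sum(row) for row in t)         # sum of all T_ij
    sn = sn_num / sn_den if sn_den else 0.0
    ppv = ppv_num / ppv_den if ppv_den else 0.0
    return sn, ppv, math.sqrt(sn * ppv)


standard = [{"A", "B", "C", "D"}, {"E", "F"}]
detected = [{"A", "B", "E"}, {"C", "D"}, {"F", "G"}]
# T = [[2, 2, 0], [1, 0, 1]], so Sn = 3/6 and PPV = 5/6.
print(acc(standard, detected))
```
</preformat>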
</sec>
<sec id="s3-1-4">
<title>3.1.4 MMR</title>
<p>The maximum matching ratio (MMR) (<xref ref-type="bibr" rid="B42">Nepusz et&#x20;al., 2012</xref>) is based on the maximal one-to-one mapping between standard protein complexes and detected protein complexes. First, we construct a bipartite graph between <italic>S</italic> and <italic>D</italic>, in which each standard protein complex <italic>S</italic>
<sub>
<italic>i</italic>
</sub> &#x2208; <italic>S</italic> and detected protein complex <italic>D</italic>
<sub>
<italic>j</italic>
</sub> &#x2208; <italic>D</italic> are connected by an edge of weight <italic>W</italic>(<italic>S</italic>
<sub>
<italic>i</italic>
</sub>, <italic>D</italic>
<sub>
<italic>j</italic>
</sub>), taken here to be their neighborhood affinity. Next, we select disjoint edges from the bipartite graph to maximize the sum of their weights. Finally, the MMR is the sum of the weights of all selected edges divided by &#x7c;<italic>S</italic>&#x7c;, as given by <xref ref-type="disp-formula" rid="e18">Eq. 18</xref>:<disp-formula id="e18">
<mml:math id="m26">
<mml:mi>M</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:munder>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mi>N</mml:mi>
<mml:mi>A</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(18)</label>
</disp-formula>
</p>
</sec>
<sec id="s3-1-5">
<title>3.1.5 Coverage Rate</title>
<p>The coverage rate (CR) assesses how many proteins in the standard protein complexes are covered by the&#x20;identified complexes. Given the standard protein complexes <italic>S</italic> and the detected protein complexes <italic>D</italic>, the &#x7c;<italic>S</italic>&#x7c;&#xd7;&#x7c;<italic>D</italic>&#x7c; matrix <italic>T</italic> is constructed, where each element <italic>T</italic><sub><italic>ij</italic></sub> is the number of proteins shared by the <italic>i</italic>th standard protein complex and the <italic>j</italic>th detected protein complex, and max{<italic>T</italic>
<sub>
<italic>ij</italic>
</sub>} is the largest number of proteins that the <italic>i</italic>th standard protein complex shares with any single detected protein complex. The coverage rate is calculated by <xref ref-type="disp-formula" rid="e19">Eq. 19</xref>:<disp-formula id="e19">
<mml:math id="m27">
<mml:mi>C</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi mathvariant="italic">max</mml:mi>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(19)</label>
</disp-formula>where <italic>N</italic>
<sub>
<italic>i</italic>
</sub> is the number of proteins in the <italic>i</italic>th standard complex.</p>
</sec>
<sec id="s3-1-6">
<title>3.1.6 Jaccard</title>
<p>Jaccard is the final metric used to evaluate the clustering methods (<xref ref-type="bibr" rid="B52">Song and Singh, 2009</xref>). Here, a standard protein complex is <italic>S</italic>
<sub>
<italic>i</italic>
</sub> &#x2208; <italic>S</italic>, and a discovered protein complex is <italic>D</italic>
<sub>
<italic>j</italic>
</sub> &#x2208; <italic>D</italic>. Then, their Jaccard is <inline-formula id="inf8">
<mml:math id="m28">
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2229;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x222a;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>. For the discovered protein complex <italic>D</italic>
<sub>
<italic>j</italic>
</sub>, its Jaccard is <inline-formula id="inf9">
<mml:math id="m29">
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>S</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. For a standard protein complex <italic>S</italic>
<sub>
<italic>i</italic>
</sub>, its Jaccard is <inline-formula id="inf10">
<mml:math id="m30">
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. Then, for the detected protein complexes <italic>D</italic>, the size-weighted average Jaccard, JaccardD, is <inline-formula id="inf11">
<mml:math id="m31">
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>D</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>. Similarly, for the standard protein complexes <italic>S</italic>, JaccardS is defined by <inline-formula id="inf12">
<mml:math id="m32">
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>S</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>S</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi>S</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>. Finally, the overall Jaccard is the harmonic mean of JaccardD and JaccardS, calculated by <xref ref-type="disp-formula" rid="e20">Eq. 20</xref>:<disp-formula id="e20">
<mml:math id="m33">
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>D</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>S</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>D</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>J</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>S</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(20)</label>
</disp-formula>
</p>
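<p>The Jaccard computation of this section can be sketched in Python as shown below. This is a minimal illustration that assumes non-empty lists of non-empty complexes; the names <italic>jac</italic>, <italic>weighted_side</italic>, and <italic>jaccard</italic> are hypothetical.</p>
<preformat>
```python
def jac(a, b):
    """Jaccard index of two complexes: shared proteins over all proteins."""
    a, b = set(a), set(b)
    return len(a.intersection(b)) / len(a.union(b))


def weighted_side(xs, ys):
    """Size-weighted average, over xs, of each complex's best Jaccard in ys."""
    total = sum(len(x) for x in xs)
    return sum(len(x) * max(jac(x, y) for y in ys) for x in xs) / total


def jaccard(standard, detected):
    """Harmonic mean of JaccardD and JaccardS (Eq. 20)."""
    jaccard_d = weighted_side(detected, standard)
    jaccard_s = weighted_side(standard, detected)
    if jaccard_d + jaccard_s == 0:
        return 0.0
    return 2 * jaccard_d * jaccard_s / (jaccard_d + jaccard_s)


standard = [{"A", "B", "C"}, {"D", "E"}]
detected = [{"A", "B"}, {"D", "E", "F"}]
# Every complex has a best Jaccard of 2/3, so JaccardD, JaccardS,
# and the overall Jaccard all equal 2/3.
print(jaccard(standard, detected))
```
</preformat>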
</sec>
<sec id="s3-1-7">
<title>3.1.7 Functional Enrichment Analysis</title>
<p>In addition to these performance metrics, we investigated whether the identified protein complexes have biological significance by calculating their <italic>p</italic>-values. Generally, a detected protein complex is considered biologically significant if its <italic>p</italic>-value is less than 0.01. In this paper, we used the fast tool LAGO (<xref ref-type="bibr" rid="B4">Boyle et&#x20;al., 2004</xref>) to compute the <italic>p</italic>-value, which is based on the hypergeometric distribution with Bonferroni correction. For more information, please refer to the literature (<xref ref-type="bibr" rid="B4">Boyle et&#x20;al., 2004</xref>; <xref ref-type="bibr" rid="B57">Wang et&#x20;al., 2019</xref>). The <italic>p</italic>-value is given by <xref ref-type="disp-formula" rid="e21">Eq. 21</xref>:
<disp-formula id="e21">
<mml:math id="m34">
<mml:mi>p</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>v</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munderover>
<mml:mfrac>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mfenced open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:math>
<label>(21)</label>
</disp-formula>where <italic>N</italic> is the number of proteins in the PPI network, <italic>F</italic> is the size of the functional group in the PPI network, <italic>C</italic> is the number of proteins in the discovered protein complex, and <italic>k</italic> is the number of proteins in the protein complex that belong to the functional group.</p>
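<p>For illustration, Eq. 21 can be evaluated directly with exact integer binomial coefficients, as in the Python sketch below. It does not reproduce LAGO and omits the Bonferroni correction; the function name <italic>enrichment_p_value</italic> and its argument names are hypothetical.</p>
<preformat>
```python
from math import comb


def enrichment_p_value(n_total, f_size, c_size, k):
    """Hypergeometric tail P(X >= k) of Eq. 21.

    n_total: proteins in the network (N); f_size: functional group size (F);
    c_size: complex size (C); k: complex proteins in the functional group.
    """
    denom = comb(n_total, c_size)
    cumulative = sum(
        comb(f_size, i) * comb(n_total - f_size, c_size - i) for i in range(k)
    )
    # Subtract as exact integers first to limit floating-point cancellation.
    return (denom - cumulative) / denom


# N = 10, F = 4, C = 3, k = 2: (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 1/3.
print(enrichment_p_value(10, 4, 3, 2))
```
</preformat>
<p>Subtracting the integer counts before dividing is a design choice of this sketch; it avoids computing 1 minus a value very close to 1 when the <italic>p</italic>-value is small.</p>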
</sec>
</sec>
<sec id="s3-2">
<title>3.2 Parameter Selection</title>
<p>To study the effect of the parameter <italic>ratio</italic> on the performance of ELF-DPC, we varied <italic>ratio</italic> from 1 to 20 in increments of 5 over several experiments and selected an appropriate value. <xref ref-type="fig" rid="F3">Figures 3</xref>, <xref ref-type="fig" rid="F4">4</xref> show how the Total score changes with the value of <italic>ratio</italic> for the ELF-DPC algorithm on four PPI networks and two sets of standard protein complexes. With standard protein complexes 1, the Total score reaches its maximum at <italic>ratio</italic> &#x3d; 5. With standard protein complexes 2, the Total score reaches its maximum at <italic>ratio</italic> &#x3d; 15. The Total score is not very sensitive to <italic>ratio</italic>: it tends to be stable when <italic>ratio</italic> lies between 5 and 15, and its fluctuations are small. Therefore, <italic>ratio</italic> is set to 5 by default in this&#x20;study.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Total score of ELF-DPC for different values of the parameter <italic>ratio</italic>, based on standard protein complexes 1.</p>
</caption>
<graphic xlink:href="fgene-13-839949-g003.tif"/>
</fig>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Total score of ELF-DPC for different values of the parameter <italic>ratio</italic>, based on standard protein complexes 2.</p>
</caption>
<graphic xlink:href="fgene-13-839949-g004.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>3.3 Comparison With State-of-the-art Algorithms</title>
<p>We obtained the software implementations of all the compared methods; their parameters are shown in <xref ref-type="table" rid="T4">Table&#x20;4</xref>. Although better results could probably be obtained by fine-tuning these parameters, to keep the comparison fair, the parameters of the compared algorithms and of ELF-DPC were set to the values recommended by their authors.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Parameters of each method used in the&#x20;study.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">ID</th>
<th align="center">Year</th>
<th align="center">Algorithms</th>
<th align="center">Parameter</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">1</td>
<td align="char" char=".">2003</td>
<td align="left">MCL</td>
<td align="left">inflation &#x3d; 2 (default setting)</td>
</tr>
<tr>
<td align="left">2</td>
<td align="char" char=".">2006</td>
<td align="left">DPClus</td>
<td align="left">
<italic>d</italic>
<sub>
<italic>in</italic>
</sub> &#x3d; 0.7, <italic>cp</italic>
<sub>
<italic>in</italic>
</sub> &#x3d; 0.50 (author suggestions)</td>
</tr>
<tr>
<td align="left">3</td>
<td align="char" char=".">2009</td>
<td align="left">CMC</td>
<td align="left">min_<italic>deg</italic>_<italic>ratio</italic> &#x3d; 1, min_<italic>size</italic> &#x3d; 3, <italic>overlap</italic>_<italic>thres</italic> &#x3d; 0.5, <italic>merge</italic>_<italic>thres</italic> &#x3d; 0.25 (default setting)</td>
</tr>
<tr>
<td align="left">4</td>
<td align="char" char=".">2012</td>
<td align="left">ClusterONE</td>
<td align="left">Density &#x3d; auto, Overlap threshold &#x3d; 0.8 (author suggestions)</td>
</tr>
<tr>
<td align="left">5</td>
<td align="char" char=".">2013</td>
<td align="left">PEWCC</td>
<td align="left">Overlap &#x3d; 0.8, -r &#x3d; 0.1, Re-join &#x3d; 0.3 (author suggestions)</td>
</tr>
<tr>
<td align="left">6</td>
<td align="char" char=".">2015</td>
<td align="left">WPNCA</td>
<td align="left">lambda &#x3d; 0.3, size &#x3d; 3 (author suggestions)</td>
</tr>
<tr>
<td align="left">7</td>
<td align="char" char=".">2016</td>
<td align="left">CPredictor2.0</td>
<td align="left">
<italic>func</italic>_<italic>lvl</italic> &#x3d; 6, Overlap threshold &#x3d; 0.8, size &#x3d; 3 (default setting)</td>
</tr>
<tr>
<td align="left">8</td>
<td align="char" char=".">2016</td>
<td align="left">Zhang</td>
<td align="left">
<italic>Complex</italic>_<italic>thresh</italic> &#x3d; 0.1 (author suggestions)</td>
</tr>
<tr>
<td align="left">9</td>
<td align="char" char=".">2017</td>
<td align="left">ClusterEPs</td>
<td align="left">NEPs of Complexes (minimum support threshold &#x3d; 0.4, maximum support threshold &#x3d; 0.05); NEPs of non-complexes (maximum support threshold &#x3d; 0.05, minimum support threshold &#x3d; 0.4); maximum overlap &#x3d; 0.9, Maximum size of clusters &#x3d; 100 (author suggestions)</td>
</tr>
<tr>
<td align="left">10</td>
<td align="char" char=".">2018</td>
<td align="left">ClusterSS</td>
<td align="left">numEpochs &#x3d; 500, learnRate &#x3d; 0.2, thresholdIn &#x3d; 1.0, thresholdOut &#x3d; 1.02, negativeTime &#x3d; 20, minimum cluster size &#x3d; 3 (author suggestions)</td>
</tr>
<tr>
<td align="left">11</td>
<td align="char" char=".">2019</td>
<td align="left">ICJointLE</td>
<td align="left">-L &#x3d; 1, -r &#x3d; 999, -d &#x3d; 0.3, -c &#x3d; 0.7, -f &#x3d; 0.75, -p &#x3d; 0.3, -m &#x3d; 0.08, -u &#x3d; 0.01, -e &#x3d; 0.9, size &#x3d; 3 (author suggestions)</td>
</tr>
<tr>
<td align="left">12</td>
<td align="char" char=".">2021</td>
<td align="left">PC2P</td>
<td align="left">minimum cluster size &#x3d; 3</td>
</tr>
<tr>
<td align="left">13</td>
<td align="char" char=".">2022</td>
<td align="left">ELF-DPC</td>
<td align="left">
<italic>ratio</italic> &#x3d; 5, minimum cluster size &#x3d; 3 (default setting)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In this section, we tested ELF-DPC on four original PPI networks, i.e.,&#x20;Gavin, Krogan core, DIP, and MIPS, and used two sets of known protein complexes (standard protein complexes 1 and 2) for training and for assessing the performance of ELF-DPC. We evaluated performance with six computational metrics: the F-measure, CR, ACC, MMR, Jaccard, and Total score, where the Total score is defined as the sum of the first five metrics. The number of protein complexes identified by each method (Num) is also reported. To illustrate the performance of ELF-DPC, we selected ten representative unsupervised methods, including DPClus (<xref ref-type="bibr" rid="B3">Altaf-Ul-Amin et&#x20;al., 2006</xref>), CMC (<xref ref-type="bibr" rid="B35">Liu et&#x20;al., 2009</xref>), ClusterONE (<xref ref-type="bibr" rid="B42">Nepusz et&#x20;al., 2012</xref>), PEWCC (<xref ref-type="bibr" rid="B70">Zaki et&#x20;al., 2013</xref>), WPNCA (<xref ref-type="bibr" rid="B45">Peng et&#x20;al., 2014</xref>), CPredictor2.0 (<xref ref-type="bibr" rid="B65">Xu et&#x20;al., 2017</xref>), Zhang (<xref ref-type="bibr" rid="B74">Zhang et&#x20;al., 2016</xref>), ICJointLE (<xref ref-type="bibr" rid="B72">Zhang et&#x20;al., 2019</xref>), and PC2P (<xref ref-type="bibr" rid="B43">Omranian et&#x20;al., 2021</xref>), and two state-of-the-art supervised methods, ClusterEPs (<xref ref-type="bibr" rid="B36">Liu et&#x20;al., 2016</xref>) and ClusterSS (<xref ref-type="bibr" rid="B8">Dong et&#x20;al., 2018</xref>). <xref ref-type="table" rid="T5">Tables 5</xref>, <xref ref-type="table" rid="T6">6</xref> compare all methods on the four PPI networks in terms of the six evaluation metrics; the highest value of each metric for each PPI network is shown in&#x20;bold.</p>
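<p>The Total score defined above can be recomputed from the five component metrics. The following is a minimal sketch, not part of the ELF-DPC implementation: the names <italic>ROWS</italic>, <italic>REPORTED_TOTAL</italic>, and <italic>total_score</italic> are illustrative assumptions, and the values are transcribed from the Gavin rows of Table 5 for MCL and ELF-DPC. Because the tabulated metrics are rounded to four decimals, the recomputed sum is compared with the reported Total score under a small tolerance.</p>
<preformat>
```python
import math

# Values transcribed from Table 5 (Gavin network, standard protein complexes 1).
# ROWS and REPORTED_TOTAL are illustrative names, not part of ELF-DPC.
ROWS = {
    "MCL": {"F-measure": 0.5358, "CR": 0.4891, "ACC": 0.3657,
            "MMR": 0.1494, "Jaccard": 0.3610},
    "ELF-DPC": {"F-measure": 0.6674, "CR": 0.4792, "ACC": 0.3391,
                "MMR": 0.2516, "Jaccard": 0.4330},
}
REPORTED_TOTAL = {"MCL": 1.9010, "ELF-DPC": 2.1702}


def total_score(metrics):
    """Sum the five component metrics; Num is not part of the score."""
    return math.fsum(metrics.values())


for name, metrics in ROWS.items():
    computed = total_score(metrics)
    # Each input is rounded to four decimals, so allow a small absolute tolerance.
    agrees = math.isclose(computed, REPORTED_TOTAL[name], abs_tol=5e-4)
    print(name, round(computed, 4), agrees)
```
</preformat>
<p>For the MCL row, the five values sum to 1.9010, matching the table. For ELF-DPC, the recomputed sum (2.1703) differs from the reported 2.1702 only in the last digit, which is consistent with rounding of the individual metrics.</p>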
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Experimental results of the different methods on the four PPI networks, evaluated against standard protein complexes 1.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Name</th>
<th align="center">Num</th>
<th align="center">F-measure</th>
<th align="center">CR</th>
<th align="center">ACC</th>
<th align="center">MMR</th>
<th align="center">Jaccard</th>
<th align="center">Total score</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="8" align="left">
<bold>Gavin</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">220</td>
<td align="left">0.535&#x2009;8</td>
<td align="left">0.489&#x2009;1</td>
<td align="left">
<bold>0.365&#x2009;7</bold>
</td>
<td align="left">0.149&#x2009;4</td>
<td align="left">0.361&#x2009;0</td>
<td align="left">1.901&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">285</td>
<td align="left">0.597&#x2009;2</td>
<td align="left">0.438&#x2009;2</td>
<td align="left">0.346&#x2009;6</td>
<td align="left">0.173&#x2009;6</td>
<td align="left">0.402&#x2009;5</td>
<td align="left">1.958&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">294</td>
<td align="left">0.584&#x2009;4</td>
<td align="left">0.450&#x2009;1</td>
<td align="left">0.348&#x2009;7</td>
<td align="left">0.222&#x2009;9</td>
<td align="left">0.417&#x2009;9</td>
<td align="left">2.023&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">258</td>
<td align="left">0.597&#x2009;6</td>
<td align="left">0.451&#x2009;4</td>
<td align="left">0.345&#x2009;8</td>
<td align="left">0.192&#x2009;1</td>
<td align="left">0.397&#x2009;4</td>
<td align="left">1.984&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">
<bold>664</bold>
</td>
<td align="left">0.657&#x2009;6</td>
<td align="left">0.431&#x2009;6</td>
<td align="left">0.314&#x2009;6</td>
<td align="left">
<bold>0.353&#x2009;8</bold>
</td>
<td align="left">0.396&#x2009;9</td>
<td align="left">2.154&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">484</td>
<td align="left">0.642&#x2009;8</td>
<td align="left">
<bold>0.494&#x2009;9</bold>
</td>
<td align="left">0.311&#x2009;4</td>
<td align="left">0.255&#x2009;7</td>
<td align="left">0.355&#x2009;4</td>
<td align="left">2.060&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">266</td>
<td align="left">0.628&#x2009;6</td>
<td align="left">0.375&#x2009;0</td>
<td align="left">0.306&#x2009;2</td>
<td align="left">0.214&#x2009;4</td>
<td align="left">0.412&#x2009;4</td>
<td align="left">1.936&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">438</td>
<td align="left">0.647&#x2009;5</td>
<td align="left">0.397&#x2009;6</td>
<td align="left">0.315&#x2009;6</td>
<td align="left">0.318&#x2009;2</td>
<td align="left">0.408&#x2009;4</td>
<td align="left">2.087&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">271</td>
<td align="left">0.601&#x2009;4</td>
<td align="left">0.365&#x2009;6</td>
<td align="left">0.284&#x2009;1</td>
<td align="left">0.216&#x2009;6</td>
<td align="left">0.409&#x2009;0</td>
<td align="left">1.876&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">482</td>
<td align="left">0.560&#x2009;0</td>
<td align="left">0.394&#x2009;1</td>
<td align="left">0.321&#x2009;8</td>
<td align="left">0.253&#x2009;5</td>
<td align="left">0.368&#x2009;5</td>
<td align="left">1.897&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">243</td>
<td align="left">0.632&#x2009;9</td>
<td align="left">0.355&#x2009;7</td>
<td align="left">0.298&#x2009;9</td>
<td align="left">0.261&#x2009;9</td>
<td align="left">0.402&#x2009;1</td>
<td align="left">1.951&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">219</td>
<td align="left">0.576&#x2009;9</td>
<td align="left">0.443&#x2009;9</td>
<td align="left">0.355&#x2009;1</td>
<td align="left">0.182&#x2009;5</td>
<td align="left">0.392&#x2009;2</td>
<td align="left">1.950&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">286</td>
<td align="left">
<bold>0.667&#x2009;4</bold>
</td>
<td align="left">0.479&#x2009;2</td>
<td align="left">0.339&#x2009;1</td>
<td align="left">0.251&#x2009;6</td>
<td align="left">
<bold>0.433&#x2009;0</bold>
</td>
<td align="left">
<bold>2.170&#x2009;2</bold>
</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>Krogan core</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">370</td>
<td align="left">0.400&#x2009;4</td>
<td align="left">0.389&#x2009;5</td>
<td align="left">
<bold>0.319&#x2009;2</bold>
</td>
<td align="left">0.136&#x2009;1</td>
<td align="left">0.290&#x2009;2</td>
<td align="left">1.535&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">497</td>
<td align="left">0.413&#x2009;8</td>
<td align="left">0.367&#x2009;2</td>
<td align="left">0.307&#x2009;1</td>
<td align="left">0.174&#x2009;5</td>
<td align="left">0.323&#x2009;5</td>
<td align="left">1.586&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">264</td>
<td align="left">0.481&#x2009;9</td>
<td align="left">0.365&#x2009;6</td>
<td align="left">0.297&#x2009;8</td>
<td align="left">0.158&#x2009;4</td>
<td align="left">0.368&#x2009;8</td>
<td align="left">1.672&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">240</td>
<td align="left">0.469&#x2009;4</td>
<td align="left">0.308&#x2009;5</td>
<td align="left">0.282&#x2009;9</td>
<td align="left">0.152&#x2009;3</td>
<td align="left">0.332&#x2009;4</td>
<td align="left">1.545&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">383</td>
<td align="left">0.528&#x2009;9</td>
<td align="left">0.323&#x2009;1</td>
<td align="left">0.230&#x2009;9</td>
<td align="left">0.147&#x2009;1</td>
<td align="left">0.378&#x2009;6</td>
<td align="left">1.608&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">369</td>
<td align="left">0.544&#x2009;6</td>
<td align="left">0.389&#x2009;7</td>
<td align="left">0.275&#x2009;8</td>
<td align="left">0.191&#x2009;2</td>
<td align="left">0.341&#x2009;5</td>
<td align="left">1.742&#x2009;8</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">236</td>
<td align="left">0.589&#x2009;5</td>
<td align="left">0.303&#x2009;7</td>
<td align="left">0.272&#x2009;5</td>
<td align="left">0.195&#x2009;4</td>
<td align="left">0.368&#x2009;8</td>
<td align="left">1.729&#x2009;8</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">326</td>
<td align="left">0.556&#x2009;3</td>
<td align="left">0.288&#x2009;4</td>
<td align="left">0.254&#x2009;9</td>
<td align="left">0.218&#x2009;2</td>
<td align="left">0.340&#x2009;8</td>
<td align="left">1.658&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">410</td>
<td align="left">0.583&#x2009;6</td>
<td align="left">0.335&#x2009;2</td>
<td align="left">0.262&#x2009;1</td>
<td align="left">0.220&#x2009;9</td>
<td align="left">0.344&#x2009;8</td>
<td align="left">1.746&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>722</bold>
</td>
<td align="left">0.437&#x2009;7</td>
<td align="left">0.375&#x2009;8</td>
<td align="left">0.307&#x2009;2</td>
<td align="left">0.240&#x2009;2</td>
<td align="left">0.335&#x2009;7</td>
<td align="left">1.696&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">216</td>
<td align="left">0.538&#x2009;9</td>
<td align="left">0.220&#x2009;6</td>
<td align="left">0.228&#x2009;4</td>
<td align="left">0.193&#x2009;6</td>
<td align="left">0.304&#x2009;2</td>
<td align="left">1.485&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">249</td>
<td align="left">0.435&#x2009;6</td>
<td align="left">0.345&#x2009;8</td>
<td align="left">0.297&#x2009;0</td>
<td align="left">0.133&#x2009;7</td>
<td align="left">0.319&#x2009;0</td>
<td align="left">1.531&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">304</td>
<td align="left">
<bold>0.628&#x2009;7</bold>
</td>
<td align="left">
<bold>0.423&#x2009;9</bold>
</td>
<td align="left">0.298&#x2009;4</td>
<td align="left">
<bold>0.268&#x2009;7</bold>
</td>
<td align="left">
<bold>0.430&#x2009;2</bold>
</td>
<td align="left">
<bold>2.049&#x2009;9</bold>
</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>DIP</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">628</td>
<td align="left">0.310&#x2009;6</td>
<td align="left">0.357&#x2009;8</td>
<td align="left">0.268&#x2009;4</td>
<td align="left">0.093&#x2009;2</td>
<td align="left">0.215&#x2009;5</td>
<td align="left">1.245&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">909</td>
<td align="left">0.308&#x2009;5</td>
<td align="left">0.379&#x2009;2</td>
<td align="left">0.272&#x2009;0</td>
<td align="left">0.123&#x2009;7</td>
<td align="left">0.264&#x2009;5</td>
<td align="left">1.348&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">1,192</td>
<td align="left">0.361&#x2009;1</td>
<td align="left">0.355&#x2009;2</td>
<td align="left">0.248&#x2009;8</td>
<td align="left">0.197&#x2009;3</td>
<td align="left">0.296&#x2009;0</td>
<td align="left">1.458&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">904</td>
<td align="left">0.511&#x2009;8</td>
<td align="left">
<bold>0.506&#x2009;2</bold>
</td>
<td align="left">
<bold>0.327&#x2009;0</bold>
</td>
<td align="left">0.175&#x2009;2</td>
<td align="left">0.329&#x2009;7</td>
<td align="left">1.849&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">648</td>
<td align="left">0.600&#x2009;4</td>
<td align="left">0.378&#x2009;3</td>
<td align="left">0.226&#x2009;2</td>
<td align="left">0.157&#x2009;3</td>
<td align="left">
<bold>0.351&#x2009;4</bold>
</td>
<td align="left">1.713&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">623</td>
<td align="left">0.588&#x2009;8</td>
<td align="left">0.430&#x2009;7</td>
<td align="left">0.259&#x2009;4</td>
<td align="left">0.207&#x2009;0</td>
<td align="left">0.336&#x2009;0</td>
<td align="left">1.821&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">293</td>
<td align="left">0.500&#x2009;8</td>
<td align="left">0.230&#x2009;2</td>
<td align="left">0.228&#x2009;7</td>
<td align="left">0.111&#x2009;0</td>
<td align="left">0.282&#x2009;5</td>
<td align="left">1.353&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">502</td>
<td align="left">0.562&#x2009;2</td>
<td align="left">0.325&#x2009;7</td>
<td align="left">0.242&#x2009;6</td>
<td align="left">0.181&#x2009;1</td>
<td align="left">0.322&#x2009;3</td>
<td align="left">1.633&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">804</td>
<td align="left">0.573&#x2009;0</td>
<td align="left">0.295&#x2009;4</td>
<td align="left">0.214&#x2009;7</td>
<td align="left">0.215&#x2009;4</td>
<td align="left">0.308&#x2009;7</td>
<td align="left">1.607&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>2,375</bold>
</td>
<td align="left">0.323&#x2009;0</td>
<td align="left">0.333&#x2009;5</td>
<td align="left">0.257&#x2009;7</td>
<td align="left">
<bold>0.233&#x2009;1</bold>
</td>
<td align="left">0.257&#x2009;3</td>
<td align="left">1.404&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">286</td>
<td align="left">0.573&#x2009;3</td>
<td align="left">0.232&#x2009;9</td>
<td align="left">0.204&#x2009;6</td>
<td align="left">0.150&#x2009;7</td>
<td align="left">0.303&#x2009;9</td>
<td align="left">1.465&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">441</td>
<td align="left">0.341&#x2009;9</td>
<td align="left">0.340&#x2009;1</td>
<td align="left">0.254&#x2009;2</td>
<td align="left">0.085&#x2009;4</td>
<td align="left">0.232&#x2009;4</td>
<td align="left">1.254&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">564</td>
<td align="left">
<bold>0.620&#x2009;0</bold>
</td>
<td align="left">0.492&#x2009;2</td>
<td align="left">0.276&#x2009;8</td>
<td align="left">0.227&#x2009;3</td>
<td align="left">0.345&#x2009;4</td>
<td align="left">
<bold>1.961&#x2009;7</bold>
</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>MIPS</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">594</td>
<td align="left">0.068&#x2009;1</td>
<td align="left">0.168&#x2009;6</td>
<td align="left">0.157&#x2009;7</td>
<td align="left">0.021&#x2009;4</td>
<td align="left">0.106&#x2009;4</td>
<td align="left">0.522&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">207</td>
<td align="left">0.378&#x2009;4</td>
<td align="left">0.203&#x2009;1</td>
<td align="left">0.213&#x2009;3</td>
<td align="left">0.082&#x2009;0</td>
<td align="left">0.226&#x2009;4</td>
<td align="left">1.103&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">408</td>
<td align="left">0.334&#x2009;4</td>
<td align="left">0.233&#x2009;4</td>
<td align="left">0.212&#x2009;6</td>
<td align="left">0.099&#x2009;7</td>
<td align="left">0.225&#x2009;8</td>
<td align="left">1.105&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">690</td>
<td align="left">0.292&#x2009;5</td>
<td align="left">0.271&#x2009;9</td>
<td align="left">
<bold>0.248&#x2009;9</bold>
</td>
<td align="left">0.098&#x2009;9</td>
<td align="left">0.204&#x2009;4</td>
<td align="left">1.116&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">382</td>
<td align="left">0.280&#x2009;2</td>
<td align="left">0.190&#x2009;0</td>
<td align="left">0.138&#x2009;9</td>
<td align="left">0.056&#x2009;6</td>
<td align="left">0.167&#x2009;9</td>
<td align="left">0.833&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">527</td>
<td align="left">0.330&#x2009;1</td>
<td align="left">0.260&#x2009;3</td>
<td align="left">0.182&#x2009;4</td>
<td align="left">0.101&#x2009;7</td>
<td align="left">0.179&#x2009;8</td>
<td align="left">1.054&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">265</td>
<td align="left">0.434&#x2009;4</td>
<td align="left">0.221&#x2009;2</td>
<td align="left">0.228&#x2009;8</td>
<td align="left">0.114&#x2009;0</td>
<td align="left">0.254&#x2009;5</td>
<td align="left">1.252&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">406</td>
<td align="left">0.370&#x2009;2</td>
<td align="left">0.205&#x2009;1</td>
<td align="left">0.202&#x2009;5</td>
<td align="left">0.107&#x2009;7</td>
<td align="left">0.217&#x2009;6</td>
<td align="left">1.103&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">645</td>
<td align="left">0.461&#x2009;0</td>
<td align="left">0.242&#x2009;6</td>
<td align="left">0.194&#x2009;3</td>
<td align="left">0.158&#x2009;0</td>
<td align="left">0.254&#x2009;3</td>
<td align="left">1.310&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>1,266</bold>
</td>
<td align="left">0.230&#x2009;9</td>
<td align="left">0.240&#x2009;0</td>
<td align="left">0.232&#x2009;0</td>
<td align="left">0.124&#x2009;2</td>
<td align="left">0.194&#x2009;2</td>
<td align="left">1.021&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">121</td>
<td align="left">0.364&#x2009;9</td>
<td align="left">0.134&#x2009;3</td>
<td align="left">0.172&#x2009;3</td>
<td align="left">0.084&#x2009;5</td>
<td align="left">0.206&#x2009;6</td>
<td align="left">0.962&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">374</td>
<td align="left">0.234&#x2009;7</td>
<td align="left">0.237&#x2009;1</td>
<td align="left">0.213&#x2009;7</td>
<td align="left">0.065&#x2009;2</td>
<td align="left">0.166&#x2009;2</td>
<td align="left">0.917&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">483</td>
<td align="left">
<bold>0.481&#x2009;1</bold>
</td>
<td align="left">
<bold>0.291&#x2009;4</bold>
</td>
<td align="left">0.223&#x2009;7</td>
<td align="left">
<bold>0.167&#x2009;8</bold>
</td>
<td align="left">
<bold>0.259&#x2009;9</bold>
</td>
<td align="left">
<bold>1.423&#x2009;9</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values indicate the highest value of each metric for each PPI network.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T6" position="float">
<label>TABLE 6</label>
<caption>
<p>Experimental results of the different methods on the four PPI networks, evaluated against standard protein complexes 2.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Name</th>
<th align="center">Num</th>
<th align="center">F-measure</th>
<th align="center">CR</th>
<th align="center">ACC</th>
<th align="center">MMR</th>
<th align="center">Jaccard</th>
<th align="center">Total score</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="8" align="left">
<bold>Gavin</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">220</td>
<td align="left">0.375&#x2009;6</td>
<td align="left">0.409&#x2009;1</td>
<td align="left">
<bold>0.358&#x2009;7</bold>
</td>
<td align="left">0.115&#x2009;3</td>
<td align="left">0.312&#x2009;6</td>
<td align="left">1.571&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">285</td>
<td align="left">0.385&#x2009;4</td>
<td align="left">0.348&#x2009;3</td>
<td align="left">0.329&#x2009;3</td>
<td align="left">0.140&#x2009;5</td>
<td align="left">0.314&#x2009;7</td>
<td align="left">1.518&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">294</td>
<td align="left">0.380&#x2009;3</td>
<td align="left">0.357&#x2009;5</td>
<td align="left">0.330&#x2009;1</td>
<td align="left">0.145&#x2009;9</td>
<td align="left">0.325&#x2009;7</td>
<td align="left">1.539&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">258</td>
<td align="left">0.409&#x2009;0</td>
<td align="left">0.363&#x2009;3</td>
<td align="left">0.335&#x2009;9</td>
<td align="left">0.141&#x2009;9</td>
<td align="left">0.320&#x2009;0</td>
<td align="left">1.570&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">
<bold>664</bold>
</td>
<td align="left">0.418&#x2009;5</td>
<td align="left">0.348&#x2009;3</td>
<td align="left">0.313&#x2009;7</td>
<td align="left">
<bold>0.215&#x2009;2</bold>
</td>
<td align="left">0.299&#x2009;9</td>
<td align="left">1.595&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">484</td>
<td align="left">0.421&#x2009;7</td>
<td align="left">
<bold>0.411&#x2009;6</bold>
</td>
<td align="left">0.330&#x2009;5</td>
<td align="left">0.167&#x2009;0</td>
<td align="left">0.296&#x2009;2</td>
<td align="left">1.627&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">266</td>
<td align="left">0.482&#x2009;0</td>
<td align="left">0.307&#x2009;6</td>
<td align="left">0.281&#x2009;6</td>
<td align="left">0.156&#x2009;4</td>
<td align="left">0.330&#x2009;9</td>
<td align="left">1.558&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">438</td>
<td align="left">0.436&#x2009;5</td>
<td align="left">0.320&#x2009;9</td>
<td align="left">0.294&#x2009;2</td>
<td align="left">0.205&#x2009;7</td>
<td align="left">0.318&#x2009;6</td>
<td align="left">1.575&#x2009;8</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">271</td>
<td align="left">0.433&#x2009;1</td>
<td align="left">0.290&#x2009;6</td>
<td align="left">0.271&#x2009;5</td>
<td align="left">0.167&#x2009;0</td>
<td align="left">0.317&#x2009;3</td>
<td align="left">1.479&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">487</td>
<td align="left">0.372&#x2009;9</td>
<td align="left">0.327&#x2009;9</td>
<td align="left">0.317&#x2009;0</td>
<td align="left">0.171&#x2009;6</td>
<td align="left">0.292&#x2009;4</td>
<td align="left">1.481&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">243</td>
<td align="left">
<bold>0.486&#x2009;1</bold>
</td>
<td align="left">0.292&#x2009;0</td>
<td align="left">0.283&#x2009;4</td>
<td align="left">0.191&#x2009;2</td>
<td align="left">0.325&#x2009;7</td>
<td align="left">1.578&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">219</td>
<td align="left">0.402&#x2009;5</td>
<td align="left">0.361&#x2009;0</td>
<td align="left">0.341&#x2009;3</td>
<td align="left">0.129&#x2009;5</td>
<td align="left">0.320&#x2009;4</td>
<td align="left">1.554&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">265</td>
<td align="left">0.454&#x2009;6</td>
<td align="left">0.383&#x2009;8</td>
<td align="left">0.325&#x2009;9</td>
<td align="left">0.174&#x2009;5</td>
<td align="left">
<bold>0.361&#x2009;9</bold>
</td>
<td align="left">
<bold>1.700&#x2009;6</bold>
</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>Krogan core</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">370</td>
<td align="left">0.321&#x2009;4</td>
<td align="left">0.353&#x2009;4</td>
<td align="left">
<bold>0.308&#x2009;8</bold>
</td>
<td align="left">0.094&#x2009;4</td>
<td align="left">0.255&#x2009;9</td>
<td align="left">1.333&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">
<bold>497</bold>
</td>
<td align="left">0.357&#x2009;7</td>
<td align="left">0.333&#x2009;5</td>
<td align="left">0.289&#x2009;9</td>
<td align="left">0.120&#x2009;0</td>
<td align="left">0.289&#x2009;3</td>
<td align="left">1.390&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">264</td>
<td align="left">0.399&#x2009;9</td>
<td align="left">0.319&#x2009;2</td>
<td align="left">0.273&#x2009;2</td>
<td align="left">0.110&#x2009;1</td>
<td align="left">0.314&#x2009;9</td>
<td align="left">1.417&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">240</td>
<td align="left">0.391&#x2009;3</td>
<td align="left">0.272&#x2009;9</td>
<td align="left">0.275&#x2009;6</td>
<td align="left">0.105&#x2009;8</td>
<td align="left">0.282&#x2009;6</td>
<td align="left">1.328&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">383</td>
<td align="left">0.422&#x2009;8</td>
<td align="left">0.291&#x2009;3</td>
<td align="left">0.212&#x2009;5</td>
<td align="left">0.098&#x2009;7</td>
<td align="left">0.324&#x2009;7</td>
<td align="left">1.350&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">369</td>
<td align="left">0.436&#x2009;1</td>
<td align="left">0.357&#x2009;2</td>
<td align="left">0.261&#x2009;4</td>
<td align="left">0.125&#x2009;0</td>
<td align="left">0.296&#x2009;0</td>
<td align="left">1.475&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">236</td>
<td align="left">0.493&#x2009;2</td>
<td align="left">0.278&#x2009;7</td>
<td align="left">0.242&#x2009;1</td>
<td align="left">0.125&#x2009;8</td>
<td align="left">0.321&#x2009;6</td>
<td align="left">1.461&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">326</td>
<td align="left">0.463&#x2009;7</td>
<td align="left">0.263&#x2009;4</td>
<td align="left">0.237&#x2009;3</td>
<td align="left">0.145&#x2009;6</td>
<td align="left">0.295&#x2009;7</td>
<td align="left">1.405&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">410</td>
<td align="left">0.465&#x2009;8</td>
<td align="left">0.302&#x2009;1</td>
<td align="left">0.239&#x2009;0</td>
<td align="left">0.144&#x2009;4</td>
<td align="left">0.297&#x2009;5</td>
<td align="left">1.448&#x2009;8</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">342</td>
<td align="left">0.430&#x2009;4</td>
<td align="left">0.320&#x2009;1</td>
<td align="left">0.270&#x2009;5</td>
<td align="left">0.131&#x2009;8</td>
<td align="left">0.314&#x2009;0</td>
<td align="left">1.466&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">216</td>
<td align="left">0.451&#x2009;6</td>
<td align="left">0.208&#x2009;3</td>
<td align="left">0.214&#x2009;7</td>
<td align="left">0.123&#x2009;0</td>
<td align="left">0.272&#x2009;6</td>
<td align="left">1.270&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">249</td>
<td align="left">0.363&#x2009;6</td>
<td align="left">0.314&#x2009;1</td>
<td align="left">0.288&#x2009;4</td>
<td align="left">0.095&#x2009;1</td>
<td align="left">0.281&#x2009;8</td>
<td align="left">1.342&#x2009;9</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">281</td>
<td align="left">
<bold>0.533&#x2009;6</bold>
</td>
<td align="left">
<bold>0.376&#x2009;8</bold>
</td>
<td align="left">0.282&#x2009;7</td>
<td align="left">
<bold>0.175&#x2009;0</bold>
</td>
<td align="left">
<bold>0.378&#x2009;5</bold>
</td>
<td align="left">
<bold>1.746&#x2009;7</bold>
</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>DIP</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">628</td>
<td align="left">0.240&#x2009;9</td>
<td align="left">0.302&#x2009;5</td>
<td align="left">0.250&#x2009;4</td>
<td align="left">0.061&#x2009;3</td>
<td align="left">0.192&#x2009;1</td>
<td align="left">1.047&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">909</td>
<td align="left">0.278&#x2009;4</td>
<td align="left">0.342&#x2009;4</td>
<td align="left">0.249&#x2009;3</td>
<td align="left">0.089&#x2009;8</td>
<td align="left">0.244&#x2009;5</td>
<td align="left">1.204&#x2009;4</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">1,192</td>
<td align="left">0.313&#x2009;0</td>
<td align="left">0.321&#x2009;3</td>
<td align="left">0.219&#x2009;3</td>
<td align="left">0.132&#x2009;9</td>
<td align="left">0.266&#x2009;4</td>
<td align="left">1.253&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">904</td>
<td align="left">0.423&#x2009;2</td>
<td align="left">
<bold>0.435&#x2009;8</bold>
</td>
<td align="left">
<bold>0.293&#x2009;7</bold>
</td>
<td align="left">0.118&#x2009;4</td>
<td align="left">0.287&#x2009;4</td>
<td align="left">1.558&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">648</td>
<td align="left">0.481&#x2009;2</td>
<td align="left">0.333&#x2009;6</td>
<td align="left">0.218&#x2009;2</td>
<td align="left">0.095&#x2009;0</td>
<td align="left">0.298&#x2009;6</td>
<td align="left">1.426&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">623</td>
<td align="left">0.460&#x2009;3</td>
<td align="left">0.370&#x2009;9</td>
<td align="left">0.247&#x2009;2</td>
<td align="left">0.122&#x2009;6</td>
<td align="left">0.286&#x2009;6</td>
<td align="left">1.487&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">293</td>
<td align="left">0.465&#x2009;3</td>
<td align="left">0.226&#x2009;5</td>
<td align="left">0.207&#x2009;7</td>
<td align="left">0.073&#x2009;6</td>
<td align="left">0.263&#x2009;5</td>
<td align="left">1.236&#x2009;7</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">502</td>
<td align="left">0.492&#x2009;9</td>
<td align="left">0.292&#x2009;8</td>
<td align="left">0.221&#x2009;5</td>
<td align="left">0.122&#x2009;3</td>
<td align="left">0.281&#x2009;8</td>
<td align="left">1.411&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">804</td>
<td align="left">0.461&#x2009;1</td>
<td align="left">0.264&#x2009;6</td>
<td align="left">0.192&#x2009;9</td>
<td align="left">0.132&#x2009;3</td>
<td align="left">0.265&#x2009;2</td>
<td align="left">1.316&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>2,179</bold>
</td>
<td align="left">0.367&#x2009;6</td>
<td align="left">0.316&#x2009;8</td>
<td align="left">0.236&#x2009;0</td>
<td align="left">0.158&#x2009;8</td>
<td align="left">0.234&#x2009;0</td>
<td align="left">1.313&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">286</td>
<td align="left">0.473&#x2009;4</td>
<td align="left">0.216&#x2009;8</td>
<td align="left">0.202&#x2009;7</td>
<td align="left">0.096&#x2009;1</td>
<td align="left">0.266&#x2009;8</td>
<td align="left">1.255&#x2009;8</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">441</td>
<td align="left">0.266&#x2009;2</td>
<td align="left">0.296&#x2009;7</td>
<td align="left">0.233&#x2009;7</td>
<td align="left">0.058&#x2009;8</td>
<td align="left">0.208&#x2009;3</td>
<td align="left">1.063&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">545</td>
<td align="left">
<bold>0.512&#x2009;6</bold>
</td>
<td align="left">0.399&#x2009;8</td>
<td align="left">0.260&#x2009;7</td>
<td align="left">
<bold>0.138&#x2009;6</bold>
</td>
<td align="left">
<bold>0.302&#x2009;0</bold>
</td>
<td align="left">
<bold>1.613&#x2009;7</bold>
</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>MIPS</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">594</td>
<td align="left">0.055&#x2009;1</td>
<td align="left">0.164&#x2009;0</td>
<td align="left">0.147&#x2009;5</td>
<td align="left">0.012&#x2009;5</td>
<td align="left">0.103&#x2009;1</td>
<td align="left">0.482&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">207</td>
<td align="left">0.330&#x2009;7</td>
<td align="left">0.193&#x2009;4</td>
<td align="left">0.194&#x2009;8</td>
<td align="left">0.054&#x2009;7</td>
<td align="left">0.204&#x2009;9</td>
<td align="left">0.978&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">408</td>
<td align="left">0.298&#x2009;1</td>
<td align="left">0.212&#x2009;5</td>
<td align="left">0.187&#x2009;3</td>
<td align="left">0.064&#x2009;2</td>
<td align="left">0.199&#x2009;9</td>
<td align="left">0.962&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">690</td>
<td align="left">0.247&#x2009;3</td>
<td align="left">0.238&#x2009;4</td>
<td align="left">
<bold>0.214&#x2009;8</bold>
</td>
<td align="left">0.063&#x2009;0</td>
<td align="left">0.180&#x2009;1</td>
<td align="left">0.943&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">382</td>
<td align="left">0.230&#x2009;9</td>
<td align="left">0.170&#x2009;0</td>
<td align="left">0.116&#x2009;6</td>
<td align="left">0.029&#x2009;6</td>
<td align="left">0.130&#x2009;1</td>
<td align="left">0.677&#x2009;3</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">527</td>
<td align="left">0.264&#x2009;0</td>
<td align="left">0.238&#x2009;3</td>
<td align="left">0.154&#x2009;9</td>
<td align="left">0.062&#x2009;1</td>
<td align="left">0.152&#x2009;2</td>
<td align="left">0.871&#x2009;6</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">265</td>
<td align="left">0.384&#x2009;3</td>
<td align="left">0.208&#x2009;6</td>
<td align="left">0.196&#x2009;6</td>
<td align="left">0.067&#x2009;2</td>
<td align="left">0.226&#x2009;4</td>
<td align="left">1.083&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">406</td>
<td align="left">0.341&#x2009;3</td>
<td align="left">0.194&#x2009;4</td>
<td align="left">0.185&#x2009;7</td>
<td align="left">0.071&#x2009;0</td>
<td align="left">0.200&#x2009;2</td>
<td align="left">0.992&#x2009;5</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">645</td>
<td align="left">0.358&#x2009;2</td>
<td align="left">0.211&#x2009;5</td>
<td align="left">0.172&#x2009;0</td>
<td align="left">0.088&#x2009;4</td>
<td align="left">0.212&#x2009;0</td>
<td align="left">1.042&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>1,581</bold>
</td>
<td align="left">0.253&#x2009;9</td>
<td align="left">0.256&#x2009;6</td>
<td align="left">0.207&#x2009;4</td>
<td align="left">0.089&#x2009;4</td>
<td align="left">0.186&#x2009;7</td>
<td align="left">0.994&#x2009;0</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">121</td>
<td align="left">0.295&#x2009;9</td>
<td align="left">0.122&#x2009;4</td>
<td align="left">0.159&#x2009;3</td>
<td align="left">0.053&#x2009;8</td>
<td align="left">0.178&#x2009;7</td>
<td align="left">0.810&#x2009;1</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">374</td>
<td align="left">0.207&#x2009;8</td>
<td align="left">0.213&#x2009;6</td>
<td align="left">0.194&#x2009;1</td>
<td align="left">0.043&#x2009;2</td>
<td align="left">0.152&#x2009;4</td>
<td align="left">0.811&#x2009;2</td>
</tr>
<tr>
<td align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">469</td>
<td align="left">
<bold>0.402&#x2009;6</bold>
</td>
<td align="left">
<bold>0.259&#x2009;9</bold>
</td>
<td align="left">0.193&#x2009;7</td>
<td align="left">
<bold>0.101&#x2009;1</bold>
</td>
<td align="left">
<bold>0.224&#x2009;9</bold>
</td>
<td align="left">
<bold>1.182&#x2009;2</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The bold values are the highest value of each metric of each PPI network.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="T5">Table&#x20;5</xref>, when standard protein complexes 2 was used as the training set and standard protein complexes 1 as the test set, ELF-DPC achieved the highest F-measure, Jaccard, and Total score on most of the four PPI networks. On the Gavin dataset, ELF-DPC ranks third in CR, sixth in ACC, and sixth in MMR. On the Krogan core dataset, ELF-DPC ranks first in CR but only fourth in ACC; however, it ranks first in MMR, at 0.2687. On the DIP dataset, ELF-DPC ranks second in CR, ACC, and MMR, and second in Jaccard, which is slightly lower than the best at 0.3454. On the MIPS dataset, ELF-DPC ranks first in CR, at 0.2914, fourth in ACC, and first in&#x20;MMR.</p>
<p>We then used standard protein complexes 1 as the positive training set and standard protein complexes 2 as the test set. The results are presented in <xref ref-type="table" rid="T6">Table&#x20;6</xref>. ELF-DPC has the best F-measure, MMR, Jaccard, and Total score on most tested datasets. Although ELF-DPC did not always obtain the highest CR and ACC, the comparison results are similar to those obtained with standard protein complexes 1 as the test set (<xref ref-type="table" rid="T5">Table&#x20;5</xref>). According to the experimental results in <xref ref-type="table" rid="T1">Tables 1</xref> and <xref ref-type="table" rid="T2">2</xref>, algorithms that identify more protein complexes, such as PEWCC and ClusterSS, achieve the highest MMR in some cases, which suggests that MMR favors algorithms that detect more protein complexes. The number of protein complexes identified by ELF-DPC is relatively small, yet ELF-DPC also achieves the highest MMR on some datasets, indicating that the protein complexes it identifies obtain a better maximal one-to-one mapping to standard protein complexes. Overall, the comparative results show that ELF-DPC achieves a higher Total score than all the compared methods on all datasets, which means that ELF-DPC performs better than these competitive methods on most computational evaluation metrics in the tested datasets.</p>
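<p>As an illustrative check, the Total score column of <xref ref-type="table" rid="T6">Table&#x20;6</xref> appears to equal the sum of the five metric columns that precede it. The following minimal Python sketch reproduces the reported totals for two rows of the DIP block. The function name total_score is hypothetical, and the thin space in values such as 1.613&#x2009;7 is treated as a digit-group separator.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Metric values copied, in column order, from two rows of the DIP block
# of Table 6. The thin space inside each printed value is dropped.
elf_dpc_dip = [0.5126, 0.3998, 0.2607, 0.1386, 0.3020]
pewcc_dip = [0.4812, 0.3336, 0.2182, 0.0950, 0.2986]


def total_score(metrics):
    """Sum the per-metric scores and round to four decimals, as in the table."""
    return round(sum(metrics), 4)


print("ELF-DPC:", total_score(elf_dpc_dip))  # Table 6 reports 1.6137
print("PEWCC:  ", total_score(pewcc_dip))    # Table 6 reports 1.4266
```
]]></preformat>
<p>Because the Total score is an unweighted sum, every metric contributes equally, so a method can lead on the Total score without leading on each individual metric.</p>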
</sec>
<sec id="s3-4">
<title>3.4 Comparison With Functional Enrichment Analysis</title>
<p>We further substantiated the biological significance of the protein complexes detected by the different methods by comparing the <italic>p</italic>-values of the identified protein complexes in the GO (Gene Ontology) database, which covers three domains: biological process, molecular function, and cellular component. Since the <italic>p</italic>-values of identified protein complexes are closely related to their size (<xref ref-type="bibr" rid="B57">Wang et&#x20;al., 2019</xref>), these statistics need to be analyzed comprehensively. Therefore, we used the number and percentage of identified protein complexes whose <italic>p</italic>-value falls below thresholds ranging from 1E-2 to 1E-20 to estimate functional enrichment. We analyzed the protein complexes discovered by ELF-DPC and the compared algorithms using the <italic>p</italic>-value test. Generally, a protein complex with a lower <italic>p</italic>-value is more significant. The functional enrichment analysis results for these methods are shown in <xref ref-type="table" rid="T7">Tables 7</xref> and <xref ref-type="table" rid="T8">8</xref>, where <italic>Num</italic> is the total number of identified protein complexes, and <italic>AS</italic> is the mean size of the identified protein complexes.</p>
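<p>The <italic>p</italic>-value of a complex with respect to a GO term is commonly computed with a hypergeometric test. The sketch below assumes that standard formulation, which this section does not restate, so it should be read as an illustration rather than as the exact procedure used here. All parameter names and the example numbers are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
from math import comb


def enrichment_p_value(population, annotated, complex_size, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population   -- number of proteins in the background set
    annotated    -- background proteins carrying the GO term
    complex_size -- number of proteins in the identified complex
    hits         -- proteins in the complex carrying the GO term
    """
    upper = min(annotated, complex_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, complex_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(population, complex_size)


# Hypothetical example: 8 of 10 proteins in a complex share a term that
# annotates 40 of 5000 background proteins.
p = enrichment_p_value(population=5000, annotated=40, complex_size=10, hits=8)
print(f"p-value = {p:.3e}")
```
]]></preformat>
<p>Under this formulation, a larger complex can accumulate more annotated members, which is one reason the <italic>p</italic>-value is closely related to complex size.</p>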
<table-wrap id="T7" position="float">
<label>TABLE 7</label>
<caption>
<p>Results of function enrichment test with different thresholds of <italic>p</italic>-value on Gavin and Krogan&#x20;core.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Algorithms</th>
<th align="center">Num</th>
<th align="center">AS</th>
<th align="center">
<inline-formula id="inf13">
<mml:math id="m35">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-20</th>
<th align="center">
<inline-formula id="inf14">
<mml:math id="m36">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-15</th>
<th align="center">
<inline-formula id="inf15">
<mml:math id="m37">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-10</th>
<th align="center">
<inline-formula id="inf16">
<mml:math id="m38">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-5</th>
<th align="center">Significant</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="8" align="left">
<bold>Gavin</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">220</td>
<td align="char" char=".">7.56</td>
<td align="left">39(17.73%)</td>
<td align="left">48(21.82%)</td>
<td align="left">83(37.73%)</td>
<td align="left">183(83.18%)</td>
<td align="left">194(88.18%)</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">285</td>
<td align="char" char=".">6.09</td>
<td align="left">30(10.53%)</td>
<td align="left">49(17.2%)</td>
<td align="left">88(30.88%)</td>
<td align="left">182(63.86%)</td>
<td align="left">208(72.98%)</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">294</td>
<td align="char" char=".">5.83</td>
<td align="left">43(14.63%)</td>
<td align="left">57(19.39%)</td>
<td align="left">82(27.89%)</td>
<td align="left">171(58.16%)</td>
<td align="left">206(70.06%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">258</td>
<td align="char" char=".">7.24</td>
<td align="left">39(15.12%)</td>
<td align="left">53(20.55%)</td>
<td align="left">101(39.15%)</td>
<td align="left">187(72.48%)</td>
<td align="left">205(79.46%)</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">
<bold>664</bold>
</td>
<td align="char" char=".">8.14</td>
<td align="left">61(9.19%)</td>
<td align="left">117(17.62%)</td>
<td align="left">238(35.84%)</td>
<td align="left">480(72.29%)</td>
<td align="left">546(82.23%)</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">266</td>
<td align="char" char=".">6.04</td>
<td align="left">29(10.9%)</td>
<td align="left">51(19.17%)</td>
<td align="left">122(45.86%)</td>
<td align="left">231(86.84%)</td>
<td align="left">244(91.73%)</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">484</td>
<td align="char" char=".">16.62</td>
<td align="left">
<bold>125(25.83%)</bold>
</td>
<td align="left">
<bold>180(37.19%)</bold>
</td>
<td align="left">
<bold>281(58.06%)</bold>
</td>
<td align="left">423(87.4%)</td>
<td align="left">449(92.77%)</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">438</td>
<td align="char" char=".">6.30</td>
<td align="left">44(10.05%)</td>
<td align="left">83(18.95%)</td>
<td align="left">164(37.44%)</td>
<td align="left">318(72.6%)</td>
<td align="left">354(80.82%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">271</td>
<td align="char" char=".">6.25</td>
<td align="left">53(19.56%)</td>
<td align="left">86(31.74%)</td>
<td align="left">143(52.77%)</td>
<td align="left">
<bold>240(88.56%)</bold>
</td>
<td align="left">
<bold>256(94.46%)</bold>
</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">482</td>
<td align="char" char=".">5.62</td>
<td align="left">63(13.07%)</td>
<td align="left">95(19.71%)</td>
<td align="left">167(34.65%)</td>
<td align="left">336(69.71%)</td>
<td align="left">368(76.35%)</td>
</tr>
<tr>
<td align="char" char=".">487</td>
<td align="char" char=".">5.36</td>
<td align="left">50(10.27%)</td>
<td align="left">83(17.05%)</td>
<td align="left">147(30.19%)</td>
<td align="left">324(66.53%)</td>
<td align="left">368(75.56%)</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">243</td>
<td align="char" char=".">5.73</td>
<td align="left">25(10.29%)</td>
<td align="left">27(11.11%)</td>
<td align="left">83(34.16%)</td>
<td align="left">196(80.66%)</td>
<td align="left">207(85.19%)</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">219</td>
<td align="char" char=".">6.91</td>
<td align="left">17(7.76%)</td>
<td align="left">11(5.02%)</td>
<td align="left">40(18.26%)</td>
<td align="left">106(48.4%)</td>
<td align="left">119(54.34%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">286</td>
<td align="char" char=".">8.81</td>
<td align="left">59(20.63%)</td>
<td align="left">104(36.36%)</td>
<td align="left">154(53.84%)</td>
<td align="left">244(85.31%)</td>
<td align="left">262(91.6%)</td>
</tr>
<tr>
<td align="char" char=".">265</td>
<td align="char" char=".">8.66</td>
<td align="left">65(24.53%)</td>
<td align="left">89(33.59%)</td>
<td align="left">140(52.84%)</td>
<td align="left">231(87.18%)</td>
<td align="left">244(92.09%)</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>Krogan core</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">370</td>
<td align="char" char=".">5.91</td>
<td align="left">82(22.16%)</td>
<td align="left">119(32.16%)</td>
<td align="left">173(46.75%)</td>
<td align="left">275(74.32%)</td>
<td align="left">293(79.18%)</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">497</td>
<td align="char" char=".">4.23</td>
<td align="left">20(4.02%)</td>
<td align="left">43(8.65%)</td>
<td align="left">75(15.09%)</td>
<td align="left">253(50.9%)</td>
<td align="left">303(60.96%)</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">264</td>
<td align="char" char=".">5.05</td>
<td align="left">20(7.58%)</td>
<td align="left">29(10.99%)</td>
<td align="left">44(16.67%)</td>
<td align="left">60(22.73%)</td>
<td align="left">63(23.87%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">240</td>
<td align="char" char=".">5.27</td>
<td align="left">44(18.33%)</td>
<td align="left">75(31.25%)</td>
<td align="left">121(50.42%)</td>
<td align="left">202(84.17%)</td>
<td align="left">216(90.0%)</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">383</td>
<td align="char" char=".">10.16</td>
<td align="left">
<bold>152(39.69%)</bold>
</td>
<td align="left">
<bold>205(53.53%)</bold>
</td>
<td align="left">
<bold>277(72.33%)</bold>
</td>
<td align="left">
<bold>358(93.48%)</bold>
</td>
<td align="left">
<bold>377(98.44%)</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">236</td>
<td align="char" char=".">5.19</td>
<td align="left">24(10.17%)</td>
<td align="left">46(19.49%)</td>
<td align="left">93(39.41%)</td>
<td align="left">213(90.26%)</td>
<td align="left">219(92.8%)</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">369</td>
<td align="char" char=".">12.59</td>
<td align="left">43(11.65%)</td>
<td align="left">81(21.95%)</td>
<td align="left">172(46.61%)</td>
<td align="left">321(86.99%)</td>
<td align="left">339(91.87%)</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">326</td>
<td align="char" char=".">5.41</td>
<td align="left">37(11.35%)</td>
<td align="left">65(19.94%)</td>
<td align="left">118(36.2%)</td>
<td align="left">259(79.45%)</td>
<td align="left">279(85.58%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">410</td>
<td align="char" char=".">6.18</td>
<td align="left">59(14.39%)</td>
<td align="left">95(23.17%)</td>
<td align="left">168(40.97%)</td>
<td align="left">341(83.17%)</td>
<td align="left">365(89.02%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>722</bold>
</td>
<td align="char" char=".">4.86</td>
<td align="left">47(6.51%)</td>
<td align="left">95(13.16%)</td>
<td align="left">160(22.16%)</td>
<td align="left">371(51.38%)</td>
<td align="left">454(62.88%)</td>
</tr>
<tr>
<td align="char" char=".">342</td>
<td align="char" char=".">7.01</td>
<td align="left">48(14.04%)</td>
<td align="left">88(25.74%)</td>
<td align="left">155(45.33%)</td>
<td align="left">280(81.88%)</td>
<td align="left">304(88.9%)</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">216</td>
<td align="char" char=".">4.41</td>
<td align="left">16(7.41%)</td>
<td align="left">21(9.72%)</td>
<td align="left">68(31.48%)</td>
<td align="left">184(85.18%)</td>
<td align="left">192(88.88%)</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">249</td>
<td align="char" char=".">5.81</td>
<td align="left">16(6.43%)</td>
<td align="left">23(9.24%)</td>
<td align="left">46(18.48%)</td>
<td align="left">136(54.62%)</td>
<td align="left">159(63.86%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">304</td>
<td align="char" char=".">9.55</td>
<td align="left">80(26.32%)</td>
<td align="left">115(37.83%)</td>
<td align="left">163(53.62%)</td>
<td align="left">277(91.12%)</td>
<td align="left">292(96.05%)</td>
</tr>
<tr>
<td align="char" char=".">281</td>
<td align="char" char=".">9.13</td>
<td align="left">81(28.83%)</td>
<td align="left">111(39.51%)</td>
<td align="left">155(55.17%)</td>
<td align="left">262(93.25%)</td>
<td align="left">269(95.74%)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The bold values are the highest value of each metric of each PPI network.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T8" position="float">
<label>TABLE 8</label>
<caption>
<p>Results of function enrichment test with different thresholds of <italic>p</italic>-value on DIP and MIPS.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Algorithms</th>
<th align="center">Num</th>
<th align="center">AS</th>
<th align="center">
<inline-formula id="inf17">
<mml:math id="m39">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-20</th>
<th align="center">
<inline-formula id="inf18">
<mml:math id="m40">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-15</th>
<th align="center">
<inline-formula id="inf19">
<mml:math id="m41">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-10</th>
<th align="center">
<inline-formula id="inf20">
<mml:math id="m42">
<mml:mo>&#x3c;</mml:mo>
</mml:math>
</inline-formula>E-5</th>
<th align="center">Significant</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="8" align="left">
<bold>DIP</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">628</td>
<td align="char" char=".">6.31</td>
<td align="left">74(11.78%)</td>
<td align="left">125(19.9%)</td>
<td align="left">209(33.28%)</td>
<td align="left">414(65.92%)</td>
<td align="left">471(75.0%)</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">909</td>
<td align="char" char=".">4.28</td>
<td align="left">45(4.95%)</td>
<td align="left">64(7.04%)</td>
<td align="left">112(12.32%)</td>
<td align="left">364(40.04%)</td>
<td align="left">470(51.7%)</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">1,192</td>
<td align="char" char=".">3.81</td>
<td align="left">90(7.55%)</td>
<td align="left">150(12.58%)</td>
<td align="left">304(25.5%)</td>
<td align="left">692(58.05%)</td>
<td align="left">829(69.54%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">904</td>
<td align="char" char=".">6.40</td>
<td align="left">54(5.97%)</td>
<td align="left">110(12.16%)</td>
<td align="left">259(28.64%)</td>
<td align="left">606(67.02%)</td>
<td align="left">705(77.97%)</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">648</td>
<td align="char" char=".">10.10</td>
<td align="left">156(24.07%)</td>
<td align="left">
<bold>249(38.42%)</bold>
</td>
<td align="left">
<bold>379(58.48%)</bold>
</td>
<td align="left">584(90.12%)</td>
<td align="left">605(93.36%)</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">293</td>
<td align="char" char=".">4.54</td>
<td align="left">18(6.14%)</td>
<td align="left">49(16.72%)</td>
<td align="left">124(42.32%)</td>
<td align="left">274(93.51%)</td>
<td align="left">
<bold>285(97.26%)</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">623</td>
<td align="char" char=".">12.41</td>
<td align="left">81(13.0%)</td>
<td align="left">137(21.99%)</td>
<td align="left">228(36.6%)</td>
<td align="left">431(69.18%)</td>
<td align="left">481(77.21%)</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">502</td>
<td align="char" char=".">5.18</td>
<td align="left">44(8.76%)</td>
<td align="left">99(19.72%)</td>
<td align="left">200(39.84%)</td>
<td align="left">424(84.46%)</td>
<td align="left">448(89.24%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">804</td>
<td align="char" char=".">4.26</td>
<td align="left">91(11.32%)</td>
<td align="left">145(18.04%)</td>
<td align="left">268(33.34%)</td>
<td align="left">625(77.74%)</td>
<td align="left">683(84.95%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">
<bold>2,375</bold>
</td>
<td align="char" char=".">3.57</td>
<td align="left">156(6.57%)</td>
<td align="left">253(10.65%)</td>
<td align="left">437(18.4%)</td>
<td align="left">1,047(44.08%)</td>
<td align="left">1,289(54.27%)</td>
</tr>
<tr>
<td align="char" char=".">2,179</td>
<td align="char" char=".">5.74</td>
<td align="left">110(5.05%)</td>
<td align="left">230(10.56%)</td>
<td align="left">501(23.0%)</td>
<td align="left">1,332(61.14%)</td>
<td align="left">1,574(72.25%)</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">286</td>
<td align="char" char=".">3.84</td>
<td align="left">29(10.14%)</td>
<td align="left">27(9.44%)</td>
<td align="left">103(36.01%)</td>
<td align="left">248(86.71%)</td>
<td align="left">253(88.46%)</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">441</td>
<td align="char" char=".">6.25</td>
<td align="left">25(5.67%)</td>
<td align="left">14(3.17%)</td>
<td align="left">45(10.2%)</td>
<td align="left">185(41.95%)</td>
<td align="left">230(52.15%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">564</td>
<td align="char" char=".">14.43</td>
<td align="left">140(24.82%)</td>
<td align="left">186(32.98%)</td>
<td align="left">289(51.24%)</td>
<td align="left">
<bold>512(90.78%)</bold>
</td>
<td align="left">542(96.1%)</td>
</tr>
<tr>
<td align="char" char=".">545</td>
<td align="char" char=".">12.77</td>
<td align="left">
<bold>142(26.06%)</bold>
</td>
<td align="left">203(37.25%)</td>
<td align="left">307(56.33%)</td>
<td align="left">493(90.46%)</td>
<td align="left">517(94.86%)</td>
</tr>
<tr>
<td colspan="8" align="left">
<bold>MIPS</bold>
</td>
</tr>
<tr>
<td align="left">&#xa0;MCL</td>
<td align="char" char=".">594</td>
<td align="char" char=".">6.16</td>
<td align="left">17(2.86%)</td>
<td align="left">29(4.88%)</td>
<td align="left">80(13.47%)</td>
<td align="left">165(27.78%)</td>
<td align="left">230(38.72%)</td>
</tr>
<tr>
<td align="left">&#xa0;DPClus</td>
<td align="char" char=".">207</td>
<td align="char" char=".">4.94</td>
<td align="left">17(8.21%)</td>
<td align="left">27(13.04%)</td>
<td align="left">85(41.06%)</td>
<td align="left">169(81.64%)</td>
<td align="left">184(88.89%)</td>
</tr>
<tr>
<td align="left">&#xa0;CMC</td>
<td align="char" char=".">408</td>
<td align="char" char=".">4.87</td>
<td align="left">30(7.35%)</td>
<td align="left">49(12.01%)</td>
<td align="left">101(24.76%)</td>
<td align="left">234(57.36%)</td>
<td align="left">278(68.14%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterONE</td>
<td align="char" char=".">690</td>
<td align="char" char=".">6.03</td>
<td align="left">22(3.19%)</td>
<td align="left">47(6.81%)</td>
<td align="left">137(19.85%)</td>
<td align="left">327(47.39%)</td>
<td align="left">483(70.0%)</td>
</tr>
<tr>
<td align="left">&#xa0;PEWCC</td>
<td align="char" char=".">382</td>
<td align="char" char=".">24.70</td>
<td align="left">67(17.54%)</td>
<td align="left">94(24.61%)</td>
<td align="left">172(45.03%)</td>
<td align="left">308(80.63%)</td>
<td align="left">325(85.08%)</td>
</tr>
<tr>
<td align="left">&#xa0;CPredictor2.0</td>
<td align="char" char=".">265</td>
<td align="char" char=".">4.60</td>
<td align="left">19(7.17%)</td>
<td align="left">40(15.09%)</td>
<td align="left">118(44.52%)</td>
<td align="left">
<bold>249(93.95%)</bold>
</td>
<td align="left">258(97.35%)</td>
</tr>
<tr>
<td align="left">&#xa0;WPNCA</td>
<td align="char" char=".">527</td>
<td align="char" char=".">18.27</td>
<td align="left">60(11.39%)</td>
<td align="left">103(19.55%)</td>
<td align="left">234(44.41%)</td>
<td align="left">436(82.74%)</td>
<td align="left">471(89.38%)</td>
</tr>
<tr>
<td align="left">&#xa0;Zhang</td>
<td align="char" char=".">406</td>
<td align="char" char=".">5.14</td>
<td align="left">16(3.94%)</td>
<td align="left">37(9.11%)</td>
<td align="left">111(27.34%)</td>
<td align="left">319(78.57%)</td>
<td align="left">355(87.44%)</td>
</tr>
<tr>
<td align="left">&#xa0;ClusterEPs</td>
<td align="char" char=".">645</td>
<td align="char" char=".">4.78</td>
<td align="left">22(3.41%)</td>
<td align="left">45(6.98%)</td>
<td align="left">150(23.26%)</td>
<td align="left">443(68.69%)</td>
<td align="left">500(77.53%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ClusterSS</td>
<td align="char" char=".">1,266</td>
<td align="char" char=".">4.22</td>
<td align="left">33(2.61%)</td>
<td align="left">70(5.53%)</td>
<td align="left">176(13.9%)</td>
<td align="left">607(47.94%)</td>
<td align="left">752(59.39%)</td>
</tr>
<tr>
<td align="char" char=".">
<bold>1,581</bold>
</td>
<td align="char" char=".">5.81</td>
<td align="left">25(1.58%)</td>
<td align="left">67(4.24%)</td>
<td align="left">237(14.99%)</td>
<td align="left">845(53.45%)</td>
<td align="left">1,069(67.62%)</td>
</tr>
<tr>
<td align="left">&#xa0;ICJointLE</td>
<td align="char" char=".">121</td>
<td align="char" char=".">3.70</td>
<td align="left">14(11.57%)</td>
<td align="left">16(13.22%)</td>
<td align="left">42(34.71%)</td>
<td align="left">102(84.3%)</td>
<td align="left">103(85.13%)</td>
</tr>
<tr>
<td align="left">&#xa0;PC2P</td>
<td align="char" char=".">374</td>
<td align="char" char=".">6.29</td>
<td align="left">7(1.87%)</td>
<td align="left">4(1.07%)</td>
<td align="left">41(10.96%)</td>
<td align="left">171(45.72%)</td>
<td align="left">202(54.01%)</td>
</tr>
<tr>
<td rowspan="2" align="left">&#xa0;ELF-DPC</td>
<td align="char" char=".">483</td>
<td align="char" char=".">9.33</td>
<td align="left">
<bold>109(22.57%)</bold>
</td>
<td align="left">
<bold>166(34.37%)</bold>
</td>
<td align="left">246(50.93%)</td>
<td align="left">441(91.3%)</td>
<td align="left">463(95.85%)</td>
</tr>
<tr>
<td align="char" char=".">469</td>
<td align="char" char=".">8.86</td>
<td align="left">105(22.39%)</td>
<td align="left">155(33.05%)</td>
<td align="left">
<bold>253(53.95%)</bold>
</td>
<td align="left">437(93.18%)</td>
<td align="left">
<bold>458(97.66%)</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The bold values are the highest value of each metric of each PPI network.</p>
</fn>
</table-wrap-foot>
</table-wrap>
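<p>Each cell in <xref ref-type="table" rid="T7">Tables 7</xref> and <xref ref-type="table" rid="T8">8</xref> pairs a count with its percentage of <italic>Num</italic>. The following Python sketch shows how such cells could be derived from a list of per-complex <italic>p</italic>-values. The function names are hypothetical, and treating the Significant column as <italic>p</italic>-value below 1E-2 is an assumption based on the threshold range stated in the text.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Cumulative thresholds matching the table columns; the last entry is the
# assumed cutoff for the "Significant" column.
THRESHOLDS = [1e-20, 1e-15, 1e-10, 1e-5, 1e-2]


def enrichment_summary(p_values, thresholds=THRESHOLDS):
    """Return (threshold, count, percent) for complexes with p below each threshold."""
    num = len(p_values)
    if num == 0:
        raise ValueError("at least one identified complex is required")
    rows = []
    for t in thresholds:
        count = sum(1 for p in p_values if p < t)
        rows.append((t, count, round(100.0 * count / num, 2)))
    return rows


def table_cell(count, num):
    """Format a count and its share of num in the style used by the tables."""
    return f"{count}({100.0 * count / num:.2f}%)"


# MCL on Gavin: 39 of 220 complexes fall below E-20 in Table 7.
print(table_cell(39, 220))

# Hypothetical p-values for four complexes.
for row in enrichment_summary([1e-25, 1e-12, 1e-6, 0.5]):
    print(row)
```
]]></preformat>
<p>Because the thresholds are cumulative, a complex counted in the E-20 column is also counted in every column to its right.</p>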
<p>As <xref ref-type="table" rid="T7">Table&#x20;7</xref> shows, for the Gavin PPI dataset, ClusterEPs obtains the highest proportion of significantly identified protein complexes, 94.46%, which is higher than that of ELF-DPC. However, ELF-DPC achieves a high proportion of identified protein complexes with a <italic>p</italic>-value &#x3c; E-15. For the Krogan core PPI dataset, PEWCC attains a higher proportion of significantly identified protein complexes than ELF-DPC. A likely reason is that the mean size of the protein complexes identified by PEWCC (<italic>AS</italic>) is 10.16, whereas the <italic>AS</italic> of ELF-DPC is 9.55 and 9.13 in its two rows of <xref ref-type="table" rid="T7">Table&#x20;7</xref>. Generally, the <italic>p</italic>-value of an identified protein complex is closely associated with its size: the <italic>p</italic>-value decreases gradually as the size of the detected protein complex increases (<xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B45">Peng et&#x20;al., 2014</xref>). As <xref ref-type="table" rid="T8">Table&#x20;8</xref> shows, for the DIP PPI dataset, CPredictor2.0 obtains a higher proportion of significantly identified protein complexes than ELF-DPC. At the same time, ELF-DPC achieves a high proportion of identified protein complexes with a <italic>p</italic>-value &#x3c; E-20. For the MIPS dataset, ELF-DPC performs better than the competing methods regarding the proportion of significantly identified complexes.</p>
<p>Therefore, we conclude that ELF-DPC can detect more protein complexes with biological significance. Although some detected protein complexes do not currently match known protein complexes, they are more likely to be verified as actual protein complexes by laboratory techniques. Based on the above results, the protein complexes identified by ELF-DPC have significant biological meaning.</p>
</sec>
<sec id="s3-5">
<title>3.5 Case Study</title>
<p>To show the clustering results clearly, we visualized the 208th standard protein complex of standard protein complexes 1 in <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>. Each panel label follows a fixed format: for example, (b) ELF-DPC-1.0&#x2013;10 means that the neighborhood affinity (<xref ref-type="disp-formula" rid="e15">Eq. 15</xref>) of the ELF-DPC result is 1.0 and that it contains 10 proteins. Red nodes are proteins that the method identified correctly, yellow nodes are proteins that the method missed, and blue nodes are proteins that the method identified incorrectly. <xref ref-type="fig" rid="F5">Figure&#x20;5</xref> (a) shows that there are 10 proteins in the 208th standard protein complex. The clustering results of the thirteen methods, (b) ELF-DPC, (c) ClusterONE and ClusterSS, (d) CPredictor2.0, (e) PEWCC, (f) MCL, (g) ClusterEPs, (h) ICJointLE, (i) CMC, DPClus, PC2P, (j) WPNCA, and (k) Zhang, are all from the Krogan core dataset. (c) ClusterONE and ClusterSS, (d) CPredictor2.0, (e) PEWCC, (g) ClusterEPs, (h) ICJointLE, (i) CMC, DPClus, PC2P, and (k) Zhang identified only part of the 208th standard protein complex, missing some of its proteins. Meanwhile, (j) WPNCA and (f) MCL both missed some proteins and incorrectly identified others. In contrast, ELF-DPC accurately identified all 10 proteins and achieved the best performance in identifying the 208th standard protein complex.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>An example protein complex identified by different methods on the Krogan core PPI network. Each panel label follows a fixed format: for example, (b) ELF-DPC-1.0&#x2013;10 means that the neighborhood affinity (<xref ref-type="disp-formula" rid="e15">Eq. 15</xref>) of the ELF-DPC result is 1.0 and that it contains 10 proteins. Red nodes are proteins that the method identified correctly, yellow nodes are proteins that the method missed, and blue nodes are proteins that the method identified incorrectly.</p>
</caption>
<graphic xlink:href="fgene-13-839949-g005.tif"/>
</fig>
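<p>The panel labels and node colors described above can be reproduced from two protein sets. The following is a minimal sketch that assumes the neighborhood affinity of <xref ref-type="disp-formula" rid="e15">Eq. 15</xref> has the commonly used form, the squared overlap divided by the product of the two set sizes; the function names and the protein identifiers (P1 to P10, X1, X2) are hypothetical and do not correspond to the proteins in the figure.</p>
<preformat>
```python
def neighborhood_affinity(predicted, reference):
    """Squared overlap over the product of sizes (assumed form of Eq. 15)."""
    predicted, reference = set(predicted), set(reference)
    if not predicted or not reference:
        return 0.0
    shared = len(predicted.intersection(reference))
    return shared ** 2 / (len(predicted) * len(reference))


def colour_nodes(predicted, reference):
    """Split proteins into the three node colours used in the figure."""
    predicted, reference = set(predicted), set(reference)
    return {
        "red": predicted.intersection(reference),   # correctly identified
        "yellow": reference.difference(predicted),  # missed
        "blue": predicted.difference(reference),    # incorrectly identified
    }


# Hypothetical 10-protein reference complex and three predicted clusters.
reference = {f"P{i}" for i in range(1, 11)}
exact = set(reference)                                      # all 10 recovered
partial = {f"P{i}" for i in range(1, 8)}                    # 7 recovered, 3 missed
noisy = {f"P{i}" for i in range(1, 9)}.union({"X1", "X2"})  # 2 missed, 2 wrong

for name, cluster in [("exact", exact), ("partial", partial), ("noisy", noisy)]:
    counts = {c: len(nodes) for c, nodes in colour_nodes(cluster, reference).items()}
    print(name, round(neighborhood_affinity(cluster, reference), 2), len(cluster), counts)
```
</preformat>
<p>Under this assumed definition, the three hypothetical clusters score 1.0, 0.7, and 0.64. A cluster reaches 1.0 only when it has no yellow and no blue nodes, which is the situation reported for ELF-DPC in panel (b).</p>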
<p>Moreover, <xref ref-type="table" rid="T9">Table&#x20;9</xref> lists 16 protein complexes with high biological significance that the ELF-DPC algorithm identified in the four PPI networks, four per network. These complexes may provide helpful biological knowledge to researchers in related fields.</p>
<table-wrap id="T9" position="float">
<label>TABLE 9</label>
<caption>
<p>The identified protein complexes with small <italic>p</italic>-values.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Num</th>
<th align="center">
<italic>p</italic>-value</th>
<th align="center">GOID</th>
<th align="center">Gene ontology term</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="4" align="left">
<bold>Gavin</bold>
</td>
</tr>
<tr>
<td align="left">1</td>
<td align="left">9.72641e-59</td>
<td align="left">GO:0000502</td>
<td align="left">proteasome complex</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">4.53112e-61</td>
<td align="left">GO:0005762</td>
<td align="left">mitochondrial large ribosomal subunit</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">9.18655e-68</td>
<td align="left">GO:0030686</td>
<td align="left">90S preribosome</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">2.61255e-65</td>
<td align="left">GO:0030532</td>
<td align="left">small nuclear ribonucleoprotein complex</td>
</tr>
<tr>
<td colspan="4" align="left">
<bold>Krogan core</bold>
</td>
</tr>
<tr>
<td align="left">1</td>
<td align="left">2.50943e-71</td>
<td align="left">GO:0000375</td>
<td align="left">RNA splicing, <italic>via</italic> transesterification reactions</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">1.21735e-66</td>
<td align="left">GO:0005681</td>
<td align="left">spliceosomal complex</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">7.46423e-67</td>
<td align="left">GO:0000377</td>
<td align="left">RNA splicing, <italic>via</italic> transesterification reactions with bulged adenosine as nucleophile</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">5.5&#x2009;331e-62</td>
<td align="left">GO:0003&#x2009;899</td>
<td align="left">DNA-directed 5&#x2032;-3&#x2032; RNA polymerase activity</td>
</tr>
<tr>
<td colspan="4" align="left">
<bold>DIP</bold>
</td>
</tr>
<tr>
<td align="left">1</td>
<td align="left">2.14679e-64</td>
<td align="left">GO:0042254</td>
<td align="left">ribosome biogenesis</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">5.5&#x2009;228e-53</td>
<td align="left">GO:0042&#x2009;274</td>
<td align="left">ribosomal small subunit biogenesis</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">5.18295e-62</td>
<td align="left">GO:0016592</td>
<td align="left">mediator complex</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">6.85479e-66</td>
<td align="left">GO:0097525</td>
<td align="left">spliceosomal snRNP complex</td>
</tr>
<tr>
<td colspan="4" align="left">
<bold>MIPS</bold>
</td>
</tr>
<tr>
<td align="left">1</td>
<td align="left">1.22375e-47</td>
<td align="left">GO:0050657</td>
<td align="left">nucleic acid transport</td>
</tr>
<tr>
<td align="left">2</td>
<td align="left">1.27336e-44</td>
<td align="left">GO:0030687</td>
<td align="left">preribosome, large subunit precursor</td>
</tr>
<tr>
<td align="left">3</td>
<td align="left">1.58322e-42</td>
<td align="left">GO:0022624</td>
<td align="left">proteasome accessory complex</td>
</tr>
<tr>
<td align="left">4</td>
<td align="left">9.71714e-32</td>
<td align="left">GO:0000124</td>
<td align="left">SAGA complex</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s4">
<title>4 Conclusion</title>
<p>Although many protein complex detection methods have been presented in recent decades, detecting protein complexes accurately remains a bottleneck in bioinformatics. This study presented an ensemble learning framework that identifies protein complexes according to their core-attachment structure. First, a weighted PPI network was constructed by integrating gene expression data, gene ontology data, subcellular location data, and topological structure. Next, we used a protein complex core mining strategy to find protein complex cores. We then provided a new model training method to construct a training dataset and extracted various topological features to train a VotingRegressor model that describes protein complexes based on supervised learning. Furthermore, we defined structural modularity to model the internal organization of protein complexes. As a result, we obtained an ensemble learning model that guides the search for protein complexes. Finally, we designed a graph heuristic search strategy that extends protein complex cores to form protein complexes in the PPI networks. The experimental results show that ELF-DPC performs better than the competing methods. Moreover, ELF-DPC can mine protein complexes with high biological significance. Because ELF-DPC cannot detect small protein complexes (size &#x2264; 2), we will consider integrating other data sources (<xref ref-type="bibr" rid="B54">Tan et&#x20;al., 2018</xref>) to identify them. In the future, we could infer drug-disease associations by constructing a heterogeneous network consisting of drugs, detected protein complexes, and diseases to unveil disease mechanisms and discover available drugs (<xref ref-type="bibr" rid="B68">Yu et&#x20;al., 2015</xref>). In addition, we will consider using graph attention networks and deep learning methods to identify protein complexes.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>RW was responsible for the main algorithm&#x2019;s development&#x20;phase and drafted the article. HM and CW also revised the drafted article and approved the paper&#x2019;s content. All authors were responsible for designing the algorithm.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This work was supported by the Fundamental Research Funds for the Central Universities (No. FRF-TP-20-064A1Z), the R&#x0026;D Program of CAAC Key Laboratory of Flight Techniques and Flight Safety (No. FZ2021ZZ05), and the National Natural Science Foundation of China (No. U20B2062 and No. 62172036). The funders provided financial support for the research but had no role in the study&#x2019;s design, the analysis and interpretation of data, or the writing of the manuscript.</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abduljabbar</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Hashim</surname>
<given-names>S. Z. M.</given-names>
</name>
<name>
<surname>Sallehuddin</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Nature-inspired Optimization Algorithms for Community Detection in Complex Networks: a Review and Future Trends</article-title>. <source>Telecommun. Syst.</source> <volume>74</volume>, <fpage>225</fpage>&#x2013;<lpage>252</lpage>. <pub-id pub-id-type="doi">10.1007/s11235-019-00636-x</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aloy</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>B&#xf6;ttcher</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Ceulemans</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Leutwein</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Mellwig</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Fischer</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2004</year>). <article-title>Structure-based Assembly of Protein Complexes in Yeast</article-title>. <source>Science</source> <volume>303</volume>, <fpage>2026</fpage>&#x2013;<lpage>2029</lpage>. <pub-id pub-id-type="doi">10.1126/science.1092645</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altaf-Ul-Amin</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Shinbo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Mihara</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Kurokawa</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Kanaya</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Development and Implementation of an Algorithm for Detection of Protein Complexes in Large Interaction Networks</article-title>. <source>BMC Bioinformatics</source> <volume>7</volume>, <fpage>207</fpage>&#x2013;<lpage>213</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-7-207</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boyle</surname>
<given-names>E. I.</given-names>
</name>
<name>
<surname>Weng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Gollub</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Botstein</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Cherry</surname>
<given-names>J.&#x20;M.</given-names>
</name>
<etal/>
</person-group> (<year>2004</year>). <article-title>GO::TermFinder--open Source Software for Accessing Gene Ontology Information and Finding Significantly Enriched Gene Ontology Terms Associated with a List of Genes</article-title>. <source>Bioinformatics</source> <volume>20</volume>, <fpage>3710</fpage>&#x2013;<lpage>3715</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bth456</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Broh&#xe9;e</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Van Helden</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Evaluation of Clustering Algorithms for Protein-Protein Interaction Networks</article-title>. <source>BMC Bioinformatics</source> <volume>7</volume>, <fpage>488</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-7-488</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Fan</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>F.-X.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Identifying Protein Complexes and Functional Modules--from Static PPI Networks to Dynamic PPI Networks</article-title>. <source>Brief. Bioinformatics</source> <volume>15</volume>, <fpage>177</fpage>&#x2013;<lpage>194</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbt039</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Global Protein Function Annotation through Mining Genome-Scale Data in Yeast Saccharomyces cerevisiae</article-title>. <source>Nucleic Acids Res.</source> <volume>32</volume>, <fpage>6414</fpage>&#x2013;<lpage>6424</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkh978</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dong</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Predicting Protein Complexes Using a Supervised Learning Method Combined with Local Structural Information</article-title>. <source>PLoS One</source> <volume>13</volume>, <fpage>e0194124</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0194124</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eisen</surname>
<given-names>M. B.</given-names>
</name>
<name>
<surname>Spellman</surname>
<given-names>P. T.</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>P. O.</given-names>
</name>
<name>
<surname>Botstein</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Cluster Analysis and Display of Genome-wide Expression Patterns</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>95</volume>, <fpage>14863</fpage>&#x2013;<lpage>14868</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.95.25.14863</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fortunato</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Community Detection in Graphs</article-title>. <source>Phys. Rep.</source> <volume>486</volume>, <fpage>75</fpage>&#x2013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1016/j.physrep.2009.11.002</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friedel</surname>
<given-names>C. C.</given-names>
</name>
<name>
<surname>Krumsiek</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zimmer</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Bootstrapping the Interactome: Unsupervised Identification of Protein Complexes in Yeast</article-title>. <source>J.&#x20;Comput. Biol.</source> <volume>16</volume>, <fpage>971</fpage>&#x2013;<lpage>987</lpage>. <pub-id pub-id-type="doi">10.1089/cmb.2009.0023</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gavin</surname>
<given-names>A.-C.</given-names>
</name>
<name>
<surname>Aloy</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Grandi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Krause</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Boesche</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Marzioch</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2006</year>). <article-title>Proteome Survey Reveals Modularity of the Yeast Cell Machinery</article-title>. <source>Nature</source> <volume>440</volume>, <fpage>631</fpage>&#x2013;<lpage>636</lpage>. <pub-id pub-id-type="doi">10.1038/nature04532</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gavin</surname>
<given-names>A.-C.</given-names>
</name>
<name>
<surname>B&#xf6;sche</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Krause</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Grandi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Marzioch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bauer</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2002</year>). <article-title>Functional Organization of the Yeast Proteome by Systematic Analysis of Protein Complexes</article-title>. <source>Nature</source> <volume>415</volume>, <fpage>141</fpage>&#x2013;<lpage>147</lpage>. <pub-id pub-id-type="doi">10.1038/415141a</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Girvan</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Newman</surname>
<given-names>M. E. J.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>Community Structure in Social and Biological Networks</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>99</volume>, <fpage>7821</fpage>&#x2013;<lpage>7826</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.122653799</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Grover</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Leskovec</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>node2vec: Scalable Feature Learning for Networks</article-title>,&#x201d; in <conf-name>Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining</conf-name>, <fpage>855</fpage>&#x2013;<lpage>864</lpage>. </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>G&#xfc;ldener</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>M&#xfc;nsterk&#xf6;tter</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Oesterheld</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Pagel</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Ruepp</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mewes</surname>
<given-names>H.-W.</given-names>
</name>
<etal/>
</person-group> (<year>2006</year>). <article-title>MPact: the MIPS Protein Interaction Resource on Yeast</article-title>. <source>Nucleic Acids Res.</source> <volume>34</volume>, <fpage>D436</fpage>&#x2013;<lpage>D441</lpage>. </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>K. C. C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Evolutionary Graph Clustering for Protein Complex Identification</article-title>. <source>IEEE/ACM Trans. Comput. Biol. Bioinform.</source> <volume>15</volume>, <fpage>892</fpage>&#x2013;<lpage>904</lpage>. <pub-id pub-id-type="doi">10.1109/TCBB.2016.2642107</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ko</surname>
<given-names>T. H.</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>K. C. C.</given-names>
</name>
<name>
<surname>Ong</surname>
<given-names>Y. S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Contextual Correlation Preserving Multiview Featured Graph Clustering</article-title>. <source>IEEE Trans. Cybern.</source> <volume>50</volume>, <fpage>4318</fpage>&#x2013;<lpage>4331</lpage>. <pub-id pub-id-type="doi">10.1109/TCYB.2019.2926431</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Bai</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ong</surname>
<given-names>Y.-S.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>Vicinal Vertex Allocation for Matrix Factorization in Networks</article-title>. <source>IEEE Trans. Cybern.</source> <pub-id pub-id-type="doi">10.1109/tcyb.2021.3051606</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Ong</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Bai</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2021b</year>). <article-title>Learning Conjoint Attentions for Graph Neural Nets</article-title>. <source>Adv. Neural Inf. Process. Syst.</source> <volume>34</volume>. </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hong</surname>
<given-names>E. L.</given-names>
</name>
<name>
<surname>Balakrishnan</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Christie</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Binkley</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2007</year>). <article-title>Gene Ontology Annotations at SGD: New Data Sources and Annotation Methods</article-title>. <source>Nucleic Acids Res.</source> <volume>36</volume>, <fpage>D577</fpage>&#x2013;<lpage>D581</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkm909</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>K. C.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>A Density-Based Clustering Approach for Identifying Overlapping Protein Complexes with Functional Preferences</article-title>. <source>BMC Bioinformatics</source> <volume>16</volume>, <fpage>174</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-015-0583-3</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Xiong</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Efficiently Detecting Protein Complexes from Protein Interaction Networks via Alternating Direction Method of Multipliers</article-title>. <source>IEEE/ACM Trans. Comput. Biol. Bioinform.</source> <volume>16</volume>, <fpage>1922</fpage>&#x2013;<lpage>1935</lpage>. <pub-id pub-id-type="doi">10.1109/TCBB.2018.2844256</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>F.-X.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Identification of Hierarchical and Overlapping Functional Modules in PPI Networks</article-title>. <source>IEEE Trans. Nanobioscience</source> <volume>11</volume>, <fpage>386</fpage>&#x2013;<lpage>393</lpage>. <pub-id pub-id-type="doi">10.1109/tnb.2012.2210907</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Keretsu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sarmah</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Weighted Edge Based Clustering to Identify Protein Complexes in Protein-Protein Interaction Networks Incorporating Gene Expression Profile</article-title>. <source>Comput. Biol. Chem.</source> <volume>65</volume>, <fpage>69</fpage>&#x2013;<lpage>79</lpage>. <pub-id pub-id-type="doi">10.1016/j.compbiolchem.2016.10.001</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>King</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Przulj</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Jurisica</surname>
<given-names>I.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Protein Complex Prediction via Cost-Based Clustering</article-title>. <source>Bioinformatics</source> <volume>20</volume>, <fpage>3013</fpage>&#x2013;<lpage>3020</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bth351</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kipf</surname>
<given-names>T. N.</given-names>
</name>
<name>
<surname>Welling</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Semi-supervised Classification with Graph Convolutional Networks</article-title>. <source>arXiv preprint arXiv:1609.02907</source>. </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krogan</surname>
<given-names>N. J.</given-names>
</name>
<name>
<surname>Cagney</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ignatchenko</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2006</year>). <article-title>Global Landscape of Protein Complexes in the Yeast Saccharomyces cerevisiae</article-title>. <source>Nature</source> <volume>440</volume>, <fpage>637</fpage>&#x2013;<lpage>643</lpage>. <pub-id pub-id-type="doi">10.1038/nature04670</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lakizadeh</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Jalili</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Marashi</surname>
<given-names>S.-A.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>CAMWI: Detecting Protein Complexes Using Weighted Clustering Coefficient and Weighted Density</article-title>. <source>Comput. Biol. Chem.</source> <volume>58</volume>, <fpage>231</fpage>&#x2013;<lpage>240</lpage>. <pub-id pub-id-type="doi">10.1016/j.compbiolchem.2015.07.012</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Fujita</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Identification of Dynamic Protein Complexes Based on Fruit Fly Optimization Algorithm</article-title>. <source>Knowledge-Based Syst.</source> <volume>105</volume>, <fpage>270</fpage>&#x2013;<lpage>277</lpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2016.05.019</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>F.-X.</given-names>
</name>
<name>
<surname>Pedrycz</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Topology Potential Based Seed-Growth Method to Identify Protein Complexes on Dynamic PPI Data</article-title>. <source>Inf. Sci.</source> <volume>425</volume>, <fpage>140</fpage>&#x2013;<lpage>153</lpage>. <pub-id pub-id-type="doi">10.1016/j.ins.2017.10.013</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Towards the Identification of Protein Complexes and Functional Modules by Integrating PPI Network and Gene Expression Data</article-title>. <source>BMC Bioinformatics</source> <volume>13</volume>, <fpage>109</fpage>&#x2013;<lpage>115</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-13-109</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Kwoh</surname>
<given-names>C. K.</given-names>
</name>
<name>
<surname>Ng</surname>
<given-names>S. K.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Computational Approaches for Detecting Protein Complexes from Protein Interaction Networks: a Survey</article-title>. <source>BMC Genomics</source> <volume>11 Suppl 1</volume>, <fpage>S3</fpage>&#x2013;<lpage>S19</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-11-S1-S3</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Identifying Protein Complexes with Clear Module Structure Using Pairwise Constraints in Protein Interaction Networks</article-title>. <source>Front. Genet.</source> <volume>12</volume>, <fpage>664786</fpage>. <pub-id pub-id-type="doi">10.3389/fgene.2021.664786</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chua</surname>
<given-names>H. N.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Complex Discovery from Weighted PPI Networks</article-title>. <source>Bioinformatics</source> <volume>25</volume>, <fpage>1891</fpage>&#x2013;<lpage>1897</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp311</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Using Contrast Patterns between True Complexes and Random Subgraphs in PPI Networks to Predict Unknown Protein Complexes</article-title>. <source>Sci. Rep.</source> <volume>6</volume>, <fpage>21223</fpage>. <pub-id pub-id-type="doi">10.1038/srep21223</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Sang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Identifying Protein Complexes Based on Node Embeddings Obtained from Protein-Protein Interaction Networks</article-title>. <source>BMC Bioinformatics</source> <volume>19</volume>, <fpage>332</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-018-2364-2</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>C. Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y. P.</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>C. S.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Identification of Protein Complexes by Integrating Multiple Alignment of Protein Interaction Networks</article-title>. <source>Bioinformatics</source> <volume>33</volume>, <fpage>1681</fpage>&#x2013;<lpage>1688</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btx043</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mei</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>A Framework Combines Supervised Learning and Dense Subgraphs Discovery to Predict Protein Complexes</article-title>. <source>Front. Comput. Sci.</source> <volume>16</volume>, <fpage>1</fpage>&#x2013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1007/s11704-021-0476-8</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Meng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>F.-X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Detecting Protein Complex Based on Hierarchical Compressing Network Embedding</article-title>,&#x201d; in <conf-name>2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)</conf-name>, <fpage>215</fpage>&#x2013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1109/bibm47256.2019.8983423</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mewes</surname>
<given-names>H. W.</given-names>
</name>
<name>
<surname>Amid</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Arnold</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Frishman</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>G&#xfc;ldener</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Mannhaupt</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2004</year>). <article-title>MIPS: Analysis and Annotation of Proteins from Whole Genomes</article-title>. <source>Nucleic Acids Res.</source> <volume>32</volume>, <fpage>D41</fpage>&#x2013;<lpage>D44</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkh092</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nepusz</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Paccanaro</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Detecting Overlapping Protein Complexes in Protein-Protein Interaction Networks</article-title>. <source>Nat. Methods</source> <volume>9</volume>, <fpage>471</fpage>&#x2013;<lpage>472</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.1938</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Omranian</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Angeleska</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Nikoloski</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Pc2p: Parameter-free Network-Based Prediction of Protein Complexes</article-title>. <source>Bioinformatics</source> <volume>37</volume>, <fpage>73</fpage>&#x2013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa1089</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedregosa</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Varoquaux</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Gramfort</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Michel</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Thirion</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Grisel</surname>
<given-names>O.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Scikit-learn: Machine Learning in Python</article-title>. <source>J.&#x20;Machine Learn. Res.</source> <volume>12</volume>, <fpage>2825</fpage>&#x2013;<lpage>2830</lpage>. </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peng</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Identification of Protein Complexes Using Weighted Pagerank-Nibble Algorithm and Core-Attachment Structure</article-title>. <source>IEEE/ACM Trans. Comput. Biol. Bioinform.</source> <volume>12</volume>, <fpage>179</fpage>&#x2013;<lpage>192</lpage>. <pub-id pub-id-type="doi">10.1109/TCBB.2014.2343954</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pourkazemi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Keyvanpour</surname>
<given-names>M. R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Community Detection in Social Network by Using a Multi-Objective Evolutionary Algorithm</article-title>. <source>Intell. Data Anal.</source> <volume>21</volume>, <fpage>385</fpage>&#x2013;<lpage>409</lpage>. <pub-id pub-id-type="doi">10.3233/ida-150429</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Cho</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Wodak</surname>
<given-names>S. J.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Up-to-date Catalogues of Yeast Protein Complexes</article-title>. <source>Nucleic Acids Res.</source> <volume>37</volume>, <fpage>825</fpage>&#x2013;<lpage>831</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkn1005</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Balem</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Faloutsos</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Klein-Seetharaman</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bar-Joseph</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Protein Complex Identification by Supervised Graph Local Clustering</article-title>. <source>Bioinformatics</source> <volume>24</volume>, <fpage>i250</fpage>&#x2013;<lpage>i268</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btn164</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Identifying Essential Proteins Based on Dynamic Protein-Protein Interaction Networks and RNA-Seq Datasets</article-title>. <source>Sci. China Inf. Sci.</source> <volume>59</volume>, <fpage>1</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1007/s11432-016-5583-z</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shi</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Protein Complex Detection with Semi-supervised Learning in Protein Interaction Networks</article-title>. <source>Proteome Sci.</source> <volume>9 Suppl 1</volume>, <fpage>S5</fpage>&#x2013;<lpage>S9</lpage>. <pub-id pub-id-type="doi">10.1186/1477-5956-9-S1-S5</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sikandar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Anwar</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Bajwa</surname>
<given-names>U. I.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Sikandar</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Decision Tree Based Approaches for Detecting Protein Complex in Protein Protein Interaction Network (PPI) via Link and Sequence Analysis</article-title>. <source>IEEE Access</source> <volume>6</volume>, <fpage>22108</fpage>&#x2013;<lpage>22120</lpage>. <pub-id pub-id-type="doi">10.1109/access.2018.2807811</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>How and When Should Interactome-Derived Clusters Be Used to Predict Functional Modules and Protein Function?</article-title> <source>Bioinformatics</source> <volume>25</volume>, <fpage>3143</fpage>&#x2013;<lpage>3150</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp551</pub-id> </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Spirin</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Mirny</surname>
<given-names>L. A.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Protein Complexes and Functional Modules in Molecular Networks</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>100</volume>, <fpage>12123</fpage>&#x2013;<lpage>12128</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.2032324100</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tan</surname>
<given-names>C. S. H.</given-names>
</name>
<name>
<surname>Go</surname>
<given-names>K. D.</given-names>
</name>
<name>
<surname>Bisteau</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Yong</surname>
<given-names>C. H.</given-names>
</name>
<name>
<surname>Prabhu</surname>
<given-names>N.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Thermal Proximity Coaggregation for System-wide Profiling of Protein Complex Dynamics in Cells</article-title>. <source>Science</source> <volume>359</volume>, <fpage>1170</fpage>&#x2013;<lpage>1177</lpage>. <pub-id pub-id-type="doi">10.1126/science.aan0346</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Recent Advances in Clustering Methods for Protein Interaction Networks</article-title>. <source>BMC Genomics</source> <volume>11 Suppl 3</volume>, <fpage>S10</fpage>&#x2013;<lpage>S19</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-11-S3-S10</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Construction and Application of Dynamic Protein Interaction Network Based on Time Course Gene Expression Data</article-title>. <source>Proteomics</source> <volume>13</volume>, <fpage>301</fpage>&#x2013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1002/pmic.201200277</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Seed-Extended Algorithm for Detecting Protein Complexes Based on Density and Modularity with Topological Structure and GO Annotations</article-title>. <source>BMC Genomics</source> <volume>20</volume>, <fpage>637</fpage>. <pub-id pub-id-type="doi">10.1186/s12864-019-5956-y</pub-id> </citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>An Improved Memetic Algorithm for Detecting Protein Complexes in Protein Interaction Networks</article-title>. <source>Front. Genet.</source> <volume>12</volume>, <fpage>794354</fpage>. <pub-id pub-id-type="doi">10.3389/fgene.2021.794354</pub-id> </citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>A Novel Graph Clustering Method with a Greedy Heuristic Search Algorithm for Mining Protein Complexes from Dynamic and Static PPI Networks</article-title>. <source>Inf. Sci.</source> <volume>522</volume>, <fpage>275</fpage>&#x2013;<lpage>298</lpage>. <pub-id pub-id-type="doi">10.1016/j.ins.2020.02.063</pub-id> </citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A New Method for Recognizing Protein Complexes Based on Protein Interaction Networks and GO Terms</article-title>. <source>Front. Genet.</source> <volume>12</volume>, <fpage>792265</fpage>. <pub-id pub-id-type="doi">10.3389/fgene.2021.792265</pub-id> </citation>
</ref>
<ref id="B61">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2011</year>). &#x201c;<article-title>An Edge Based Core-Attachment Method to Detect Protein Complexes in PPI Networks</article-title>,&#x201d; in <conf-name>2011 IEEE International Conference on Systems Biology (ISB)</conf-name>, <fpage>72</fpage>&#x2013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1109/isb.2011.6033123</pub-id> </citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Kwoh</surname>
<given-names>C. K.</given-names>
</name>
<name>
<surname>Ng</surname>
<given-names>S. K.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>A Core-Attachment Based Method to Detect Protein Complexes in PPI Networks</article-title>. <source>BMC Bioinformatics</source> <volume>10</volume>, <fpage>169</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-10-169</pub-id> </citation>
</ref>
<ref id="B63">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xenarios</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Salwinski</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Duan</surname>
<given-names>X. J.</given-names>
</name>
<name>
<surname>Higney</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>S.-M.</given-names>
</name>
<name>
<surname>Eisenberg</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>DIP, the Database of Interacting Proteins: a Research Tool for Studying Cellular Networks of Protein Interactions</article-title>. <source>Nucleic Acids Res.</source> <volume>30</volume>, <fpage>303</fpage>&#x2013;<lpage>305</lpage>. <pub-id pub-id-type="doi">10.1093/nar/30.1.303</pub-id> </citation>
</ref>
<ref id="B64">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X.-F.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>D.-Q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.-X.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Protein Complexes Discovery Based on Protein-Protein Interaction Data via a Regularized Sparse Generative Network Model</article-title>. <source>IEEE/ACM Trans. Comput. Biol. Bioinform.</source> <volume>9</volume>, <fpage>857</fpage>&#x2013;<lpage>870</lpage>. <pub-id pub-id-type="doi">10.1109/tcbb.2012.20</pub-id> </citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>An Effective Approach to Detecting Both Small and Large Complexes from Protein-Protein Interaction Networks</article-title>. <source>BMC bioinformatics</source> <volume>18</volume>, <fpage>419</fpage>&#x2013;<lpage>428</lpage>. <pub-id pub-id-type="doi">10.1186/s12859-017-1820-8</pub-id> </citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Accurately Detecting Protein Complexes by Graph Embedding and Combining Functions with Interactions</article-title>. <source>IEEE/ACM Trans. Comput. Biol. Bioinform.</source> <volume>17</volume>, <fpage>777</fpage>&#x2013;<lpage>787</lpage>. <pub-id pub-id-type="doi">10.1109/TCBB.2019.2897769</pub-id> </citation>
</ref>
<ref id="B67">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Predicting Protein Complex in Protein Interaction Network - a Supervised Learning Based Method</article-title>. <source>BMC Syst. Biol.</source> <volume>8 Suppl 3</volume>, <fpage>S4</fpage>&#x2013;<lpage>S16</lpage>. <pub-id pub-id-type="doi">10.1186/1752-0509-8-S3-S4</pub-id> </citation>
</ref>
<ref id="B68">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Inferring Drug-Disease Associations Based on Known Protein Complexes</article-title>. <source>BMC Med. Genomics</source> <volume>8 Suppl 2</volume>, <fpage>S2</fpage>&#x2013;<lpage>S13</lpage>. <pub-id pub-id-type="doi">10.1186/1755-8794-8-S2-S2</pub-id> </citation>
</ref>
<ref id="B69">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chiu</surname>
<given-names>D. K. Y.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>A Degree-Distribution Based Hierarchical Agglomerative Clustering Algorithm for Protein Complexes Identification</article-title>. <source>Comput. Biol. Chem.</source> <volume>35</volume>, <fpage>298</fpage>&#x2013;<lpage>307</lpage>. <pub-id pub-id-type="doi">10.1016/j.compbiolchem.2011.07.005</pub-id> </citation>
</ref>
<ref id="B70">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zaki</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Efimov</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Berengueres</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Protein Complex Detection Using Interaction Reliability Assessment and Weighted Clustering Coefficient</article-title>. <source>BMC bioinformatics</source> <volume>14</volume>, <fpage>163</fpage>&#x2013;<lpage>169</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-14-163</pub-id> </citation>
</ref>
<ref id="B71">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zaki</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Mohamed</surname>
<given-names>E. A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Identifying Protein Complexes in Protein-Protein Interaction Data Using Graph Convolutional Network</article-title>. <source>IEEE Access</source> <volume>9</volume>, <fpage>123717</fpage>&#x2013;<lpage>123726</lpage>. <pub-id pub-id-type="doi">10.1109/access.2021.3110845</pub-id> </citation>
</ref>
<ref id="B72">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>H. X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Method for Identifying Protein Complexes with the Features of Joint Co-localization and Joint Co-expression in Static PPI Networks</article-title>. <source>Comput. Biol. Med.</source> <volume>111</volume>, <fpage>103333</fpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2019.103333</pub-id> </citation>
</ref>
<ref id="B73">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X. F.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>D. Q.</given-names>
</name>
<name>
<surname>Ou-Yang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Detecting Overlapping Protein Complexes Based on a Generative Model with Functional and Topological Properties</article-title>. <source>BMC bioinformatics</source> <volume>15</volume>, <fpage>186</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-15-186</pub-id> </citation>
</ref>
<ref id="B74">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sang</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A Method for Predicting Protein Complex in Dynamic PPI Networks</article-title>. <source>BMC Bioinformatics</source> <volume>17 Suppl 7</volume>, <fpage>229</fpage>&#x2013;<lpage>543</lpage>. <pub-id pub-id-type="doi">10.1186/s12859-016-1101-y</pub-id> </citation>
</ref>
<ref id="B75">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Firefly Clustering Method for Mining Protein Complexes</article-title>,&#x201d; in <conf-name>International Conference on Swarm Intelligence</conf-name>, <fpage>601</fpage>&#x2013;<lpage>610</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-61824-1_65</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>
