<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Mol. Biosci.</journal-id>
<journal-title>Frontiers in Molecular Biosciences</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Mol. Biosci.</abbrev-journal-title>
<issn pub-type="epub">2296-889X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmolb.2021.626729</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Molecular Biosciences</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Abundance Imparts Evolutionary Constraints of Similar Magnitude on the Buried, Surface, and Disordered Regions of Proteins</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Dubreuil</surname> <given-names>Benjamin</given-names></name>
<uri xlink:href="https://loop.frontiersin.org/people/1203145/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Levy</surname> <given-names>Emmanuel D.</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1134827/overview"/>
</contrib>
</contrib-group>
<aff><institution>Department of Structural Biology, Weizmann Institute of Science</institution>, <addr-line>Rehovot</addr-line>, <country>Israel</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Paolo Marcatili, Technical University of Denmark, Denmark</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Andrew James Doig, The University of Manchester, United Kingdom; Karen N. Allen, Boston University, United States</p></fn>
<corresp id="c001">&#x002A;Correspondence: Emmanuel D. Levy, <email>emmanuel.levy@weizmann.ac.il</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Structural Biology, a section of the journal Frontiers in Molecular Biosciences</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>30</day>
<month>04</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>8</volume>
<elocation-id>626729</elocation-id>
<history>
<date date-type="received">
<day>06</day>
<month>11</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>29</day>
<month>03</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Dubreuil and Levy.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Dubreuil and Levy</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>An understanding of the forces shaping protein conservation is key, both for the fundamental knowledge it represents and to allow for optimal use of evolutionary information in practical applications. Sequence conservation is typically examined at one of two levels. The first is the residue level, where intra-protein differences are analyzed, and the second is the protein level, where inter-protein differences are studied. At the residue level, we know that solvent-accessibility is a prime determinant of conservation. By inverting this logic, we inferred that disordered regions are slightly more solvent-accessible on average than the most exposed surface residues in domains. By integrating abundance information with evolutionary data within and across proteins, we confirmed a previously reported strong surface-core association in the evolution of structured regions, but we found a comparatively weak association between disordered and structured regions. The facts that disordered and structured regions experience different structural constraints and evolve independently provide a unique setup to examine an outstanding question: why is a protein&#x2019;s abundance the main determinant of its sequence conservation? Indeed, any structural or biophysical property linked to the abundance-conservation relationship should increase the relative conservation of regions concerned with that property (e.g., disordered residues with mis-interactions, domain residues with misfolding). Surprisingly, however, we found the conservation of disordered and structured regions to increase in equal proportion with abundance. This observation implies that either abundance-related constraints are structure-independent, or multiple constraints apply to different regions and perfectly balance each other.</p>
</abstract>
<kwd-group>
<kwd>protein abundance</kwd>
<kwd>protein evolution</kwd>
<kwd>protein structure</kwd>
<kwd>misfolding</kwd>
<kwd>intrinsic disorder</kwd>
<kwd>contact number</kwd>
<kwd>misinteraction</kwd>
<kwd>yeast proteome</kwd>
</kwd-group>
<counts>
<fig-count count="5"/>
<table-count count="0"/>
<equation-count count="0"/>
<ref-count count="120"/>
<page-count count="11"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>During the course of evolution, mutations arise throughout genomes and can impact every protein at every site. However, contemplating a multiple sequence alignment of orthologous sequences typically shows widely differing levels of conservation across sites. Additionally, comparing multiple sequence alignments of different orthogroups shows even larger differences: certain groups such as those of ribosomal genes can be well conserved despite hundreds of millions of years of divergence, while others accumulate mutations much faster.</p>
<p>Amino-acid residues within proteins are subject to functional, biophysical, and structural constraints that are interconnected. These constraints result in different degrees of purifying selection along the sequence (i.e., purging of deleterious mutations by natural selection), which yields different levels of positional conservation. We discuss here structural aspects related to these constraints while placing an emphasis on the works of Cyrus Chothia, to whom this issue is dedicated, and refer the reader to several reviews for a comprehensive overview (<xref ref-type="bibr" rid="B62">Liberles et al., 2012</xref>; <xref ref-type="bibr" rid="B94">Sikosek and Chan, 2014</xref>; <xref ref-type="bibr" rid="B28">Echave et al., 2016</xref>; <xref ref-type="bibr" rid="B26">Echave and Wilke, 2017</xref>). Following the characterization of the first few structures of proteins, their comparative analysis made it clear that the burial of non-polar residues, accompanied by van der Waals interactions and hydrogen bonding, was the main contributor to the folding free energy (<xref ref-type="bibr" rid="B10">Chothia, 1974</xref>, <xref ref-type="bibr" rid="B11">1975</xref>, <xref ref-type="bibr" rid="B12">1976</xref>; <xref ref-type="bibr" rid="B71">Miller et al., 1987</xref>). Confirming the &#x201C;hydrophobic bonding&#x201D; intuition of Kauzmann (<xref ref-type="bibr" rid="B48">Kauzmann, 1959</xref>) and relying on calculations of molecular surfaces based on the algorithm of <xref ref-type="bibr" rid="B55">Lee and Richards (1971)</xref>, Chothia estimated that each square &#x00C5;ngstrom of accessible surface removed from contact with water provides a free energy gain of 25 cal mol<sup>&#x2013;1</sup> (<xref ref-type="bibr" rid="B10">Chothia, 1974</xref>, <xref ref-type="bibr" rid="B11">1975</xref>). 
At the same time, he provided universal relationships governing protein folding, e.g., on the proportion of the total accessible surface of a polypeptide chain that becomes buried upon folding (<xref ref-type="bibr" rid="B11">Chothia, 1975</xref>). This simple relationship has a profound meaning with respect to surface-to-volume ratios in folded proteins, notably that longer proteins should fold following a beads-on-a-string model rather than by forming larger beads (<xref ref-type="bibr" rid="B114">Wetlaufer, 1973</xref>) &#x2013; indeed it was soon realized that beads (domains) are fundamental units of protein evolution (<xref ref-type="bibr" rid="B13">Chothia, 1992</xref>; <xref ref-type="bibr" rid="B72">Murzin et al., 1995</xref>; <xref ref-type="bibr" rid="B4">Bateman et al., 2002</xref>; <xref ref-type="bibr" rid="B41">Gough and Chothia, 2002</xref>). On top of hydrophobic bonding energy, a high degree of steric complementarity creates a well-packed protein interior (<xref ref-type="bibr" rid="B11">Chothia, 1975</xref>), in which mutations are incrementally accommodated by small structural changes (<xref ref-type="bibr" rid="B56">Lesk and Chothia, 1980</xref>). Ultimately, as sequences diverge, structures do too, albeit more slowly (<xref ref-type="bibr" rid="B14">Chothia and Lesk, 1986</xref>, <xref ref-type="bibr" rid="B15">1987</xref>). 
Considering that structures are globally maintained during the course of evolution, it is intuitive that buried residues, which contribute to folding and stability more than surface residues (<xref ref-type="bibr" rid="B17">Creighton and Chothia, 1989</xref>; <xref ref-type="bibr" rid="B64">Lim and Sauer, 1989</xref>; <xref ref-type="bibr" rid="B100">Tokuriki et al., 2007</xref>), are more conserved (<xref ref-type="bibr" rid="B51">Koshi and Goldstein, 1995</xref>; <xref ref-type="bibr" rid="B38">Goldman et al., 1998</xref>; <xref ref-type="bibr" rid="B43">Guo et al., 2004</xref>; <xref ref-type="bibr" rid="B7">Bloom et al., 2006</xref>; <xref ref-type="bibr" rid="B87">Sasidharan and Chothia, 2007</xref>; <xref ref-type="bibr" rid="B39">Goldstein, 2008</xref>; <xref ref-type="bibr" rid="B16">Conant and Stadler, 2009</xref>; <xref ref-type="bibr" rid="B32">Franzosa and Xia, 2009</xref>; <xref ref-type="bibr" rid="B62">Liberles et al., 2012</xref>; <xref ref-type="bibr" rid="B118">Yeh et al., 2014</xref>; <xref ref-type="bibr" rid="B27">Echave et al., 2015</xref>; <xref ref-type="bibr" rid="B91">Shahmoradi and Wilke, 2016</xref>; <xref ref-type="bibr" rid="B95">Spielman and Wilke, 2016</xref>; <xref ref-type="bibr" rid="B26">Echave and Wilke, 2017</xref>; <xref ref-type="bibr" rid="B66">Liu et al., 2017</xref>).</p>
<p>We saw that the structure of a protein can help explain why certain positions, notably those buried and in contact with a large number of neighboring residues, are more conserved than others. Protein structure can also help to rationalize why certain proteins, e.g., those with more designable folds, evolve faster than others (<xref ref-type="bibr" rid="B92">Shakhnovich et al., 2005</xref>; <xref ref-type="bibr" rid="B7">Bloom et al., 2006</xref>). Globally, however, structural information only explains a small fraction of the heterogeneity in evolutionary rates seen across different proteins. Several studies have singled out other protein-centric properties associated with this heterogeneity (<xref ref-type="bibr" rid="B119">Zhang and Yang, 2015</xref>), including function (<xref ref-type="bibr" rid="B110">Wall et al., 2005</xref>; <xref ref-type="bibr" rid="B67">Lopez-Bigas et al., 2008</xref>; <xref ref-type="bibr" rid="B116">Xia et al., 2009</xref>), essentiality (<xref ref-type="bibr" rid="B46">Hurst and Smith, 1999</xref>; <xref ref-type="bibr" rid="B45">Hirsh and Fraser, 2001</xref>; <xref ref-type="bibr" rid="B47">Jordan et al., 2002</xref>; <xref ref-type="bibr" rid="B61">Liao et al., 2006</xref>), the number of interaction partners (<xref ref-type="bibr" rid="B34">Fraser et al., 2002</xref>; <xref ref-type="bibr" rid="B6">Bloom and Adami, 2004</xref>; <xref ref-type="bibr" rid="B33">Fraser and Hirsh, 2004</xref>; <xref ref-type="bibr" rid="B44">Hahn and Kern, 2005</xref>; <xref ref-type="bibr" rid="B49">Kim et al., 2006</xref>; <xref ref-type="bibr" rid="B116">Xia et al., 2009</xref>), or cellular abundance (<xref ref-type="bibr" rid="B74">Pal et al., 2001</xref>; <xref ref-type="bibr" rid="B52">Krylov et al., 2003</xref>; <xref ref-type="bibr" rid="B83">Rocha and Danchin, 2004</xref>; <xref ref-type="bibr" rid="B98">Subramanian and Kumar, 2004</xref>; <xref ref-type="bibr" rid="B23">Drummond et al., 2005</xref>; <xref ref-type="bibr" 
rid="B7">Bloom et al., 2006</xref>; <xref ref-type="bibr" rid="B61">Liao et al., 2006</xref>; <xref ref-type="bibr" rid="B80">Popescu et al., 2006</xref>; <xref ref-type="bibr" rid="B75">P&#x00E1;l et al., 2006</xref>; <xref ref-type="bibr" rid="B86">S&#x00E4;llstr&#x00F6;m et al., 2006</xref>; <xref ref-type="bibr" rid="B22">Drummond and Wilke, 2008</xref>; <xref ref-type="bibr" rid="B116">Xia et al., 2009</xref>; <xref ref-type="bibr" rid="B119">Zhang and Yang, 2015</xref>). The latter is, by far, the most significant, in particular among unicellular organisms where there is no complexity added by tissue-specific expression. Several mechanistic interpretations of this abundance-conservation association have been proposed (<xref ref-type="bibr" rid="B23">Drummond et al., 2005</xref>; <xref ref-type="bibr" rid="B22">Drummond and Wilke, 2008</xref>; <xref ref-type="bibr" rid="B8">Cherry, 2010</xref>; <xref ref-type="bibr" rid="B42">Gout et al., 2010</xref>; <xref ref-type="bibr" rid="B79">Plata et al., 2010</xref>; <xref ref-type="bibr" rid="B58">Levy et al., 2012</xref>; <xref ref-type="bibr" rid="B117">Yang et al., 2012</xref>; <xref ref-type="bibr" rid="B76">Park et al., 2013</xref>; <xref ref-type="bibr" rid="B119">Zhang and Yang, 2015</xref>) and remain a matter of active debate (<xref ref-type="bibr" rid="B78">Plata and Vitkup, 2018</xref>; <xref ref-type="bibr" rid="B82">Razban, 2019</xref>). We will scrutinize this relationship further in the results and discussion section, in the context of the results presented.</p>
<p>We have seen how protein structure helped to interpret and rationalize data on evolutionary conservation. Here, we invert this logic to characterize structural properties of disordered regions from data on their evolutionary conservation. First, we compared the evolutionary rate of disordered regions to that of surface residues in the same protein and found that disordered regions are equivalent to super-accessible surface residues. Second, we know that the divergence of surface and core residues is interdependent. In other words, a protein&#x2019;s surface can hardly diverge without mutations arising in its interior as well, and vice-versa. We confirmed this finding by showing that evolutionary rates of surface and interior regions are correlated within proteins (<italic>R</italic> &#x003E; 0.85). In contrast, the evolutionary rates of disordered and domain regions were poorly coupled (<italic>R</italic> &#x223C; 0.25), indicating that disordered regions are, for the most part, structurally independent from domains in the same sequence. Finally, the structural differences and the lack of interdependence between disordered and structured regions support the idea that they can be influenced differently by biophysical and structural constraints. For example, an increased purifying selection for protein stability is expected to impact buried residues more than disordered ones. This idea led us to examine whether abundance impacts the relative conservation between these regions. Surprisingly, however, the relative conservation between different regions appeared independent of abundance.</p>
</sec>
<sec id="S2">
<title>Results and Discussion</title>
<sec id="S2.SS1">
<title>Disordered Regions Are Equivalent to Super-Accessible Surface Residues in Terms of Their Conservation</title>
<p>Among proteins that need to fold into stable structures to function, amino-acid residues buried in the protein interior contribute the most to stability. Consequently, these residues are under stronger purifying selection than surface amino-acid residues, and are, on average, more conserved in the sequence. Two measures of residue burial have been associated with the heterogeneity of conservation in sequences: (i) solvent accessible surface area or ASA (<xref ref-type="bibr" rid="B55">Lee and Richards, 1971</xref>; <xref ref-type="bibr" rid="B93">Shrake and Rupley, 1973</xref>; <xref ref-type="bibr" rid="B38">Goldman et al., 1998</xref>; <xref ref-type="bibr" rid="B7">Bloom et al., 2006</xref>; <xref ref-type="bibr" rid="B65">Lin et al., 2007</xref>; <xref ref-type="bibr" rid="B16">Conant and Stadler, 2009</xref>; <xref ref-type="bibr" rid="B32">Franzosa and Xia, 2009</xref>), which measures the surface or fractional surface of an amino-acid residue that is in contact with bulk water, and (ii) the packing density of an amino-acid residue, which measures the density of its neighbors. Different metrics capture this information, including the contact number and the weighted contact number, with the latter containing longer-range information (<xref ref-type="bibr" rid="B32">Franzosa and Xia, 2009</xref>; <xref ref-type="bibr" rid="B118">Yeh et al., 2014</xref>). While not strictly equivalent, both accessible surface area and packing density correlate strongly (<xref ref-type="bibr" rid="B28">Echave et al., 2016</xref>), and both measures show that the less buried a residue is, the less conserved it is within a protein sequence.</p>
<p>This conservation-structure relationship prompts us to infer structural properties of disordered regions from their pattern of conservation within proteins. We know that disordered regions are devoid of a hydrophobic core and therefore cannot autonomously adopt a stable three-dimensional structure. However, if we consider the spectrum of solvent accessibility and packing density found among folded domains, where would disordered regions position themselves on average? Would they appear much less conserved than even the most solvent-exposed regions? Some disordered regions serve purely as linkers or entropic springs and are expected to show very weak sequence conservation (<xref ref-type="bibr" rid="B25">Dyson and Wright, 2005</xref>; <xref ref-type="bibr" rid="B107">Van der Lee et al., 2014</xref>). At the same time, disordered regions can also form secondary structure elements and bind to partners (<xref ref-type="bibr" rid="B101">Tompa, 2005</xref>; <xref ref-type="bibr" rid="B106">Vacic et al., 2007</xref>; <xref ref-type="bibr" rid="B105">Uversky and Dunker, 2010</xref>; <xref ref-type="bibr" rid="B115">Wright and Dyson, 2015</xref>; <xref ref-type="bibr" rid="B3">Banani et al., 2017</xref>; <xref ref-type="bibr" rid="B20">Dignon et al., 2019</xref>), thereby burying residues and transiently increasing their packing density. For example, p27Kip1 can wrap around the structure of Cdk2 to regulate its function (<xref ref-type="bibr" rid="B85">Russo et al., 1996</xref>; <xref ref-type="bibr" rid="B35">Galea et al., 2008</xref>).</p>
<p>To position disordered regions on the solvent accessibility spectrum observed in structured regions, we compared the evolutionary rate of residues in both region types. Specifically, we selected 3,350 proteins from <italic>Saccharomyces cerevisiae</italic>, which contain at least 20 residues in both structured regions and disordered regions. We inferred residue-level conservation using Rate4Site (<xref ref-type="bibr" rid="B81">Pupko et al., 2002</xref>) on multiple sequence alignments of orthologs from 14 fungal species (see section &#x201C;Materials and Methods&#x201D;). Evolutionary and structural information were mapped along the reference sequence from the multiple alignment as illustrated for STI1, a conserved Hsp90 co-chaperone (<xref ref-type="fig" rid="F1">Figure 1A</xref>). We calculated a ratio per protein <italic>i</italic>, corresponding to the mean evolutionary rate of residues in disordered regions (<italic>R</italic><sub><italic>i</italic></sub><sup><italic>diso</italic></sup>) divided by the mean rate of residues in domains (<italic>R</italic><sub><italic>i</italic></sub><sup><italic>domain</italic></sup>). Overall, considering 2,607 proteins with known orthologs, containing both types of regions, the median ratio (<italic>R</italic><sub><italic>i</italic></sub><sup><italic>diso</italic></sup>/<italic>R</italic><sub><italic>i</italic></sub><sup><italic>domain</italic></sup>) is equal to 2.2 (<xref ref-type="fig" rid="F1">Figure 1B</xref>). If we now consider domains of known structure (i.e., present in PDB, currently &#x223C;670) instead of those predicted, we find a similar median ratio equal to 2.0. For those proteins, we compared the conservation of disordered regions to that of buried and surface residues separately and found ratios equal to 3.1 and 1.4, respectively. 
Thus, in an average protein of this dataset, disordered regions evolve 3.1 and 1.4-fold faster than buried and surface residues, respectively (<xref ref-type="fig" rid="F1">Figure 1B</xref>).</p>
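<p>The per-protein ratio and its summary statistics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the input layout (a list of rate and region-label pairs per protein), the function names <monospace>region_ratio</monospace> and <monospace>median_and_mad</monospace>, and the requirement that site rates are strictly positive are all hypothetical choices made for the example. Only the threshold of 20 residues per region type and the use of the median with the median absolute deviation come from the text and from Figure 1B.</p>
<preformat><![CDATA[
```python
from statistics import mean, median


def region_ratio(residues, numerator="disorder", denominator="domain",
                 min_residues=20):
    """Mean rate in one region divided by the mean rate in another.

    residues: list of (rate, region_label) pairs for one protein
    (hypothetical layout). Rates are assumed to be strictly positive.
    Returns None when either region has fewer than min_residues sites.
    """
    num = [rate for rate, region in residues if region == numerator]
    den = [rate for rate, region in residues if region == denominator]
    if len(num) < min_residues or len(den) < min_residues:
        return None
    return mean(num) / mean(den)


def median_and_mad(values):
    """Median of the values and their median absolute deviation."""
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, mad


def summarize(proteins, **kwargs):
    """Median and MAD of the per-protein ratios, skipping small regions.

    proteins: dict mapping a protein name to its list of residues.
    """
    ratios = [region_ratio(res, **kwargs) for res in proteins.values()]
    return median_and_mad([r for r in ratios if r is not None])
```
]]></preformat>
<p>Calling <monospace>region_ratio</monospace> with <monospace>numerator="disorder"</monospace> and <monospace>denominator="buried"</monospace> or <monospace>"surface"</monospace> would give the analogous ratios for residues of known structure, provided the residues carry those labels.</p>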
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>The evolutionary rate of disordered regions is comparable to that of super-exposed regions in folded proteins. <bold>(A)</bold> Evolutionary information and structural features are mapped onto protein sequences from <italic>S. cerevisiae</italic>. The minimap represents the multiple sequence alignment of orthologous sequences to STI1. The amino acids are colored using CLUSTAL&#x2019;s color scale (<xref ref-type="bibr" rid="B99">Thompson et al., 1994</xref>) depending on residue type and conservation. The zoomed-in panel illustrates residue-level conservation, which we calculated with Rate4Site (<xref ref-type="bibr" rid="B81">Pupko et al., 2002</xref>). We mapped the positions of PFAM (<xref ref-type="bibr" rid="B4">Bateman et al., 2002</xref>) and SUPERFAMILY (<xref ref-type="bibr" rid="B41">Gough and Chothia, 2002</xref>) domains (gray box), and of disordered regions predicted by IUPRED (<xref ref-type="bibr" rid="B21">Doszt&#x00E1;nyi, 2018</xref>) (cyan ribbon). We also mapped structural information available from PDB (<xref ref-type="bibr" rid="B84">Rose et al., 2017</xref>; <xref ref-type="bibr" rid="B2">Armstrong et al., 2019</xref>) and 3DComplex (<xref ref-type="bibr" rid="B60">Levy et al., 2006</xref>) on sequences. For this particular sequence, structural information was partially available based on PDB code 3UQ3 (<xref ref-type="bibr" rid="B88">Schmid et al., 2012</xref>). <bold>(B)</bold> Within proteins, the evolutionary rates of residues in different regions are averaged, and we compare the ratio of these averages. We show the median of ratios with error bars corresponding to the median absolute deviation. Surface and buried residues are defined based on relative ASA of &#x003E;25 and &#x2264;25%, respectively (<xref ref-type="bibr" rid="B57">Levy, 2010</xref>). 
<bold>(C)</bold> We calculate the same ratio as in panel <bold>(B)</bold>, between disordered regions and surface regions, using an increasingly stringent relative ASA cut-off to define surface residues. As we increase the cut-off, the median ratio tends toward 1, which highlights that disordered residues evolve only slightly faster than the most exposed residues at protein surfaces.</p></caption>
<graphic xlink:href="fmolb-08-626729-g001.tif"/>
</fig>
<p>This result is based on a definition of surface that includes residues with &#x003E;25% relative ASA. As higher ASA is associated with lower conservation, we asked whether increasing the cut-off progressively from &#x003E;25 to &#x003E;80% would yield a point where surface residues evolve faster than disordered ones (<xref ref-type="fig" rid="F1">Figure 1C</xref>). We did not reach such a point as the ratio remained above 1 for all values. However, the ratio did converge to a value close to 1, highlighting that in an average protein, disordered residues are almost equivalent in their conservation to the most exposed residues at the surface of structured regions.</p>
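<p>The cut-off scan can be illustrated with the following sketch. It assumes a hypothetical input in which each protein is a pair made of a list of disordered-residue rates and a list of rate and relative-ASA pairs (in percent) for structured residues. The function name <monospace>ratio_vs_cutoff</monospace> and the intermediate cut-off values are illustrative only. The text specifies just the two ends of the range (&#x003E;25 and &#x003E;80%).</p>
<preformat><![CDATA[
```python
from statistics import mean, median


def ratio_vs_cutoff(proteins, cutoffs=(25, 40, 60, 80)):
    """Median disorder/surface rate ratio for each relative-ASA cut-off.

    proteins: list of (disorder_rates, structured) pairs, where
    structured is a list of (rate, relative_asa_percent) pairs
    (hypothetical layout). Rates are assumed to be strictly positive.
    """
    result = {}
    for cutoff in cutoffs:
        ratios = []
        for disorder_rates, structured in proteins:
            # Surface residues are those strictly above the cut-off.
            surface = [rate for rate, asa in structured if asa > cutoff]
            if disorder_rates and surface:
                ratios.append(mean(disorder_rates) / mean(surface))
        # None marks a cut-off at which no protein has surface residues.
        result[cutoff] = median(ratios) if ratios else None
    return result
```
]]></preformat>
<p>With real data, a ratio that decreases toward 1 without crossing it as the cut-off increases would correspond to the trend shown in Figure 1C.</p>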
<p>If we assume that the differential conservation of sites within protein sequences largely reflects different structural constraints, we can infer that disordered regions are, on average, highly solvent-exposed and under weak structural constraints. In sum, our results place disordered regions in the continuum of protein structure, at the end of the solvent-accessibility spectrum. It will be interesting to refine this relationship in the future, for example by comparing additional properties such as hydrophobicity (<xref ref-type="bibr" rid="B53">Kyte and Doolittle, 1982</xref>) or stickiness (<xref ref-type="bibr" rid="B57">Levy, 2010</xref>), by considering where disordered segments fall in the sequence (e.g., N/C-ter and inside domains), or by breaking down disorder into different types (<xref ref-type="bibr" rid="B5">Bellay et al., 2011</xref>).</p>
</sec>
<sec id="S2.SS2">
<title>Conservation of Disorder Versus Domains Is Poorly Correlated Among Low Abundance Proteins and the Correlation Increases With Abundance</title>
<p>Individual residues within a structure contribute to stability together. As a result, we can expect the evolutionary conservation of residues within a structure to be uniform. To examine this idea, we compared the average evolutionary rate of surface and buried amino-acid residues within structures. Importantly, we know that protein abundance imposes global constraints on the conservation of proteins, which may also result in a uniform evolutionary pressure across the sequence, independently of the structure. Thus, we initially focused on low abundance proteins in which such global constraints are minimized. We observed the conservation of surface and buried regions to correlate strongly (<italic>R</italic> &#x003E; 0.83, <xref ref-type="fig" rid="F2">Figure 2A</xref>), which is reminiscent of the surface-core association described previously (<xref ref-type="bibr" rid="B102">T&#x00F3;th-Petr&#x00F3;czy and Tawfik, 2011</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>The correlation in the conservation of disorder vs domain regions is poor among low abundance proteins and increases with abundance. <bold>(A)</bold> The top row shows the average evolutionary rate (ER) of surface residues (<italic>x</italic>-axis) vs buried residues (<italic>y</italic>-axis) per protein, for two classes of abundance (0&#x2013;3 and 3&#x2013;18 ppm or parts per million). The lower row shows the average ER of disordered residues (<italic>x</italic>-axis) vs residues in domains (<italic>y</italic>-axis) per protein, for the same two classes of abundance. A protein falling on the diagonal (dashed line) means that residues in the two regions being compared have equal evolutionary rates (i.e., a ratio of 1). The Spearman rank correlation coefficient (r), the associated <italic>p</italic>-value (<italic>p</italic>, two-sided Spearman&#x2019;s rank correlation test), and the number of proteins (n) within each class of abundance are given above each scatterplot. <bold>(B)</bold> Same as in panel <bold>(A)</bold>, for three classes of abundance (18&#x2013;59, 59&#x2013;352, and 352&#x2013;21,866 ppm or parts per million).</p></caption>
<graphic xlink:href="fmolb-08-626729-g002.tif"/>
</fig>
<p>We next compared the association in evolutionary conservation between disordered regions and domains found in the same protein. In this case, the correlation was reduced greatly (<italic>R</italic> = 0.25), indicating that the structural connectivity and interdependence between disordered regions and domains are globally weak. These results are consistent with those of the previous section, which depict disordered regions as being highly solvent-accessible and structurally independent from domains. However, proteins expressed at higher levels show increasing correlation, from <italic>R</italic> = 0.40 among medium abundance proteins, to <italic>R</italic> = 0.63 in the class of proteins with the highest abundance (<xref ref-type="fig" rid="F2">Figure 2B</xref>, lower row). This apparent coupling in evolutionary rates is unlikely to have a structural origin. Rather, it probably results from global constraints linked to abundance and exerted on the whole protein sequence. This apparent coupling also implies that different regions in a sequence all experience increasingly strong purifying selection with increasing abundance. This observation led us to quantify whether such negative selection increases equally in all regions, or whether some regions become more constrained than others.</p>
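<p>The per-class correlations reported above rely on the Spearman rank correlation, which can be sketched as follows. The code is a plain-Python illustration rather than the implementation used for Figure 2. The tuple layout, the function names, and the treatment of abundance classes as half-open intervals are assumptions. The class boundaries in the usage comment (0&#x2013;3, 3&#x2013;18, 18&#x2013;59, 59&#x2013;352, and 352&#x2013;21,866 ppm) are those given in the caption of Figure 2.</p>
<preformat><![CDATA[
```python
from statistics import mean


def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient, computed as the Pearson correlation of ranks.

    Raises ZeroDivisionError if all values in x or in y are identical.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def correlation_by_class(proteins, bounds, min_size=3):
    """Spearman coefficient of two regional rates within abundance classes.

    proteins: list of (abundance_ppm, rate_region_a, rate_region_b).
    bounds: list of (low, high) classes, treated here as low <= x < high.
    Example bounds: [(0, 3), (3, 18), (18, 59), (59, 352), (352, 21866)].
    Classes with fewer than min_size proteins are skipped.
    """
    out = {}
    for low, high in bounds:
        subset = [(a, b) for ab, a, b in proteins if low <= ab < high]
        if len(subset) >= min_size:
            xs, ys = zip(*subset)
            out[(low, high)] = spearman(xs, ys)
    return out
```
]]></preformat>
<p>This sketch returns the coefficient only. The associated <italic>p</italic>-values would require a separate test that is not shown here.</p>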
</sec>
<sec id="S2.SS3">
<title>Evolutionary Constraints Imparted by Protein Abundance Scale Similarly Among Surface, Buried, and Disordered Regions</title>
<p>We saw that surface residues in a protein evolve twice as fast as buried residues on average. This difference, which has long been recognized, is mainly explained by solvent-accessibility/packing density and reflects that protein structures are more likely to be destabilized by mutations at buried positions than by mutations at the surface (<xref ref-type="bibr" rid="B51">Koshi and Goldstein, 1995</xref>; <xref ref-type="bibr" rid="B38">Goldman et al., 1998</xref>; <xref ref-type="bibr" rid="B43">Guo et al., 2004</xref>; <xref ref-type="bibr" rid="B7">Bloom et al., 2006</xref>; <xref ref-type="bibr" rid="B87">Sasidharan and Chothia, 2007</xref>; <xref ref-type="bibr" rid="B39">Goldstein, 2008</xref>; <xref ref-type="bibr" rid="B16">Conant and Stadler, 2009</xref>; <xref ref-type="bibr" rid="B32">Franzosa and Xia, 2009</xref>; <xref ref-type="bibr" rid="B62">Liberles et al., 2012</xref>; <xref ref-type="bibr" rid="B118">Yeh et al., 2014</xref>; <xref ref-type="bibr" rid="B27">Echave et al., 2015</xref>; <xref ref-type="bibr" rid="B91">Shahmoradi and Wilke, 2016</xref>; <xref ref-type="bibr" rid="B95">Spielman and Wilke, 2016</xref>; <xref ref-type="bibr" rid="B26">Echave and Wilke, 2017</xref>; <xref ref-type="bibr" rid="B66">Liu et al., 2017</xref>). Similarly, residues in disordered regions evolve faster than those in domains. Interestingly, this reflects that surface, buried, and disordered residues experience different structural and biophysical constraints. Thus, we propose to examine whether the ratio of their conservation changes as a function of abundance. For example, observing that buried residues are twice as conserved as surface residues among low abundance proteins, and become four times as conserved among high abundance proteins, would suggest that stability is increasingly constrained with higher abundance.</p>
<p>We analyzed the ratio of conservation (<xref ref-type="fig" rid="F3">Figures 3A</xref>, <xref ref-type="fig" rid="F4">4A</xref>) of surface and buried residues as a function of abundance. The distribution of these ratios showed comparable median values of &#x223C;2. In the highest abundance class, this ratio reached &#x223C;2.2 (<xref ref-type="fig" rid="F3">Figure 3A</xref>), creating a significant albeit weak (<italic>R</italic> = 0.2) correlation (<xref ref-type="fig" rid="F4">Figure 4A</xref>). Overall, the ratio is relatively stable, implying that both regions are constrained to a similar extent with increasing abundance. Alternatively, a relatively constant ratio could be favored by the coupling we observed between interior and surface regions (<xref ref-type="fig" rid="F2">Figure 2</xref>, top row). Accordingly, constraints placed on the protein surface could percolate to interior regions and vice versa (<xref ref-type="bibr" rid="B102">T&#x00F3;th-Petr&#x00F3;czy and Tawfik, 2011</xref>). To control for this effect, we next compared disordered and domain regions, which show minimal structural coupling. We also observed a stable ratio of &#x223C;2 across the same five abundance classes (<xref ref-type="fig" rid="F3">Figure 3B</xref>), and we observed no dependence of the ratio on abundance even at the highest levels (<italic>R</italic> = &#x2212;0.02, <xref ref-type="fig" rid="F4">Figure 4B</xref>). Additionally, focusing on disorder and domain regions increased the size of the dataset, as we were not limited by the availability of atomic-resolution structures, so this observation applies more broadly across the yeast proteome.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>The relative evolutionary rates of different protein regions are steady with abundance. Distribution of the ratio of evolutionary rates between different regions in the sequence (<italic>y</italic>-axis), across five classes of protein abundance (<italic>x</italic>-axis). Each ratio is calculated by dividing the average evolutionary rates of residues found in two regions: <bold>(A)</bold> surface vs. buried, <bold>(B)</bold> disorder vs. domain. The white dashed line highlights the median ratio across bins of abundance. Overlaid box plots show the interquartile range (IQR = 25 to 75% quantiles) with their whiskers extending to 1.58 &#x00D7; IQR. Beyond this interval, the three most extreme outlier values are annotated. The number of proteins contributing to each distribution is given. We also highlight the relative rates for a pair of proteins, one with high and one with low abundance (STI1 and DBF4). These two proteins show comparable structural features, different evolutionary rates (respectively, 0.575 and 1.34 for their full sequence), and similar ratios.</p></caption>
<graphic xlink:href="fmolb-08-626729-g003.tif"/>
</fig>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Evolutionary rates of different regions and their ratio as a function of abundance. <bold>(A)</bold> Evolutionary rates (<italic>y</italic>-axis) as a function of protein abundance (<italic>x</italic>-axis) for surface regions, full-length structures, and buried regions. The ratio of evolutionary rates for surface vs. buried regions is also shown as a function of abundance. Contour lines show the density of points. The median evolutionary rate and median protein abundance are shown by a horizontal and a vertical line, respectively. The Spearman rank correlation coefficient and <italic>p</italic>-value are given with the number of proteins in each dataset. A black line shows the fitted sigmoidal regression for each plot. We highlight two proteins, one with a low and one with a high abundance (DBF4 and STI1). Both have comparable structural features but different evolutionary rates. <bold>(B)</bold> Same representation as in panel <bold>(A)</bold>, now considering disordered versus domain regions.</p></caption>
<graphic xlink:href="fmolb-08-626729-g004.tif"/>
</fig>
<p>By definition, disordered regions and domains should experience distinct structural and biophysical constraints. Thus, the fact that these two regions appear equally constrained with increasing abundance is puzzling and can be interpreted in different ways. One possible explanation is that constraints associated with abundance apply to entire sequences independently of structure. Such constraints could include translational selection (<xref ref-type="bibr" rid="B1">Akashi, 2003</xref>), cost of expression (<xref ref-type="bibr" rid="B19">Dekel and Alon, 2005</xref>; <xref ref-type="bibr" rid="B109">Wagner, 2005</xref>; <xref ref-type="bibr" rid="B8">Cherry, 2010</xref>; <xref ref-type="bibr" rid="B42">Gout et al., 2010</xref>; <xref ref-type="bibr" rid="B79">Plata et al., 2010</xref>), as well as other functional elements and sequence properties that may impact transcription or translation (<xref ref-type="bibr" rid="B96">Stergachis et al., 2013</xref>; <xref ref-type="bibr" rid="B120">Zhou et al., 2016</xref>). Region-specific codon-bias constraints may nevertheless exist as well (<xref ref-type="bibr" rid="B103">Tuller et al., 2010</xref>; <xref ref-type="bibr" rid="B77">Pechmann and Frydman, 2013</xref>). Alternatively or in addition, region-specific structural and biophysical constraints associated with protein concentration could increase in similar proportions with abundance, resulting in a stable ratio. In this case, two primary constraints have been characterized. The first, on protein stability (<xref ref-type="bibr" rid="B90">Serohijos et al., 2012</xref>, <xref ref-type="bibr" rid="B89">2013</xref>), leading to selection against misfolding (<xref ref-type="bibr" rid="B23">Drummond et al., 2005</xref>; <xref ref-type="bibr" rid="B22">Drummond and Wilke, 2008</xref>), would dominate among interior residues. The second, on protein solubility (<xref ref-type="bibr" rid="B50">Knowles et al., 2014</xref>; <xref ref-type="bibr" rid="B36">Garcia-Seisdedos et al., 2017</xref>, <xref ref-type="bibr" rid="B37">2018</xref>; <xref ref-type="bibr" rid="B24">Dubreuil et al., 2019</xref>; <xref ref-type="bibr" rid="B31">Foy et al., 2019</xref>; <xref ref-type="bibr" rid="B68">Macossay-Castillo et al., 2019</xref>; <xref ref-type="bibr" rid="B108">Vecchi et al., 2020</xref>), with selection against promiscuous interactions (<xref ref-type="bibr" rid="B18">Deeds et al., 2007</xref>; <xref ref-type="bibr" rid="B59">Levy et al., 2009</xref>, <xref ref-type="bibr" rid="B58">2012</xref>; <xref ref-type="bibr" rid="B63">Liberles et al., 2011</xref>; <xref ref-type="bibr" rid="B117">Yang et al., 2012</xref>), would dominate among solvent-exposed residues. However, the fact that constraints on different regions scale proportionally with abundance may appear surprising and will need to be explored in future work.</p>
</sec>
</sec>
<sec id="S3">
<title>Conclusion</title>
<p>We analyzed the evolutionary conservation of sites within proteins, and of proteins within proteomes. We found that disordered regions evolve about three-fold faster than buried regions, and 1.4-fold faster than surface regions. Additionally, disordered regions evolve about as fast as the most solvent-exposed surface regions, highlighting that they extend the continuum of protein structure as a &#x201C;super-accessible&#x201D; surface. Unlike regular surface residues, however, disordered regions evolve more independently from domains in the same sequence. This independence allowed us to examine how abundance constrains regions of a sequence that are not structurally connected. Notably, the evolution of disordered regions and domains changed in a similar proportion with abundance: on average, disordered regions evolved twice as fast as domains across the entire range of abundance. Since different regions are subject to different structural and biophysical constraints, we foresee that such comparative analyses of conservation ratios as a function of abundance will help identify mechanisms underlying the abundance-conservation relationship. It is likely that multiple mechanisms are at play (<xref ref-type="bibr" rid="B69">Mehlhoff et al., 2020</xref>) and may be captured by targeted analyses of specific regions and protein subsets.</p>
</sec>
<sec id="S4" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S4.SS1">
<title>Reference Proteome Sequences</title>
<p>The sequences were taken from the reference <italic>S. cerevisiae</italic> proteome maintained by SGD (<xref ref-type="bibr" rid="B9">Cherry et al., 2012</xref>). To facilitate data integration, we also mapped those reference sequences against the UniProtKB complete proteome for <italic>S. cerevisiae</italic> (<xref ref-type="bibr" rid="B97">Stutz et al., 2006</xref>; <xref ref-type="bibr" rid="B104">UniProt Consortium, 2019</xref>).</p>
</sec>
<sec id="S4.SS2">
<title>Crystallographic Structures</title>
<p>We relied on the 3DComplex database (<xref ref-type="bibr" rid="B60">Levy et al., 2006</xref>) to map UniProt sequences onto atomic coordinates of protein structures. For each yeast protein, we retained the structure matching the UniProt sequence with the largest sequence overlap (minimum 20%) and a sequence identity above 90%. Only experimentally determined crystallographic structures with resolutions below 3.0 &#x00C5; were considered.</p>
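<p>The selection criteria above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used with 3DComplex: the Structure record, its field names, the pick_structure function, and the example entries are all hypothetical, and treating the 20% overlap minimum as inclusive is our assumption.</p>

<preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Structure:
    """Hypothetical record describing one candidate structure."""
    pdb_id: str        # placeholder identifier, not a real PDB code
    overlap: float     # fraction of the protein sequence covered (0-1)
    identity: float    # sequence identity to the protein (0-1)
    resolution: float  # crystallographic resolution in angstroms


def pick_structure(candidates, min_overlap=0.20, min_identity=0.90,
                   max_resolution=3.0):
    """Return the eligible structure with the largest overlap, or None."""
    eligible = [
        s for s in candidates
        if s.overlap >= min_overlap        # at least 20% overlap (assumed inclusive)
        and s.identity > min_identity      # identity above 90%
        and s.resolution < max_resolution  # resolution below 3.0 angstroms
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.overlap)


# Hypothetical candidates for a single protein.
candidates = [
    Structure("s1", overlap=0.95, identity=0.85, resolution=2.0),  # identity too low
    Structure("s2", overlap=0.60, identity=0.99, resolution=2.5),
    Structure("s3", overlap=0.80, identity=0.95, resolution=3.5),  # resolution too poor
    Structure("s4", overlap=0.15, identity=1.00, resolution=1.5),  # overlap too small
    Structure("s5", overlap=0.40, identity=0.92, resolution=1.8),
]
best = pick_structure(candidates)
```
]]></preformat>
<p>Here the filters are applied before ranking by overlap, so a structure with larger coverage but insufficient identity or resolution is never selected.</p>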
</sec>
<sec id="S4.SS3">
<title>Cellular Abundance</title>
<p>Protein abundances were obtained from Pax-Db (v4.0, May 2015) (<xref ref-type="bibr" rid="B112">Wang et al., 2012</xref>, <xref ref-type="bibr" rid="B111">2015</xref>), which provides relative abundances for unicellular and multicellular organisms including tissue-specific data. We used the overall abundance inferred from all available data sets (integrated data set).</p>
</sec>
<sec id="S4.SS4">
<title>Orthologs Alignment and Position-Specific Evolutionary Rate</title>
<p>The ortholog alignments were obtained from the original work by <xref ref-type="bibr" rid="B113">Wapinski et al. (2007)</xref>. Briefly, genes sharing significant sequence similarity were clustered into putative orthogroups and their phylogeny was constructed by a modified neighbor-joining procedure based on pre-computed residue similarities and shared synteny scores. This process was repeated and optimized until each orthogroup consisted of genes that shared a single common ancestor. Here, we used 3,798 groups of orthologous proteins along with their multiple sequence alignments encompassing 14 fungal species (<italic>S. cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, Saccharomyces bayanus, Naumovozyma castellii (Saccharomyces castellii), Candida glabrata, Kluyveromyces lactis, Debaryomyces hansenii, Yarrowia lipolytica, Eremothecium gossypii (Ashbya gossypii), Lachancea waltii (Kluyveromyces waltii), Candida albicans, Aspergillus nidulans, Fusarium graminearum, Magnaporthe grisea, Neurospora crassa, Cryptococcus neoformans, Schizosaccharomyces pombe</italic>). Only six orthogroups had one sequence missing, and these missing sequences were replaced by indels. The median pairwise sequence identity within these 3,798 orthogroups is 58.3% (<xref ref-type="fig" rid="F5">Figure 5</xref>).</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Pairwise sequence identity across ortholog pairs. For each orthogroup, we calculated the average percent sequence identity using all ortholog pairs or only pairs that include the <italic>S. cerevisiae</italic> protein. The distributions of these two measures are shown in dark and light blue, respectively. Vertical lines highlight the medians. The number of orthogroups is 3,798.</p></caption>
<graphic xlink:href="fmolb-08-626729-g005.tif"/>
</fig>
<p>All alignments were computed using MUSCLE (<xref ref-type="bibr" rid="B29">Edgar, 2004</xref>) and then concatenated to estimate residue-level evolutionary rate using the software Rate4Site (<xref ref-type="bibr" rid="B81">Pupko et al., 2002</xref>). Additional details on how evolutionary rates were estimated are available in <xref ref-type="bibr" rid="B54">Landry et al. (2009)</xref>.</p>
</sec>
<sec id="S4.SS5">
<title>Intrinsic Disorder Predictions</title>
<p>We predicted disordered regions in the yeast proteome by combining short and long disorder segments predicted by IUPred (<xref ref-type="bibr" rid="B70">M&#x00E9;sz&#x00E1;ros et al., 2009</xref>; <xref ref-type="bibr" rid="B21">Doszt&#x00E1;nyi, 2018</xref>). We considered as disordered the 20% of amino-acid residues with the highest disorder probabilities among all proteins. In all analyses, we required a minimum of 20 residues in a particular region to calculate an average evolutionary rate. When fewer residues were available, the average rate of the region was considered undefined.</p>
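<p>The two rules above (a proteome-wide top-20% cutoff and a 20-residue minimum per region) can be sketched in Python as follows. This is an illustrative sketch only: the function names disorder_cutoff and region_rate, the protein names, and the toy scores are hypothetical, and residues tied with the cutoff score are all counted as disordered, which is a simplification we introduce here.</p>

<preformat><![CDATA[
```python
def disorder_cutoff(scores_by_protein, top_fraction=0.20):
    """Score of the last residue inside the proteome-wide top fraction."""
    all_scores = sorted(
        (s for scores in scores_by_protein.values() for s in scores),
        reverse=True,
    )
    n_top = max(1, round(len(all_scores) * top_fraction))
    return all_scores[n_top - 1]


def region_rate(rates, in_region, min_residues=20):
    """Average rate over flagged residues, or None if too few residues."""
    selected = [r for r, flag in zip(rates, in_region) if flag]
    if len(selected) < min_residues:
        return None  # the regional rate is considered undefined
    return sum(selected) / len(selected)


# Hypothetical per-residue disorder scores for a two-protein "proteome".
scores_by_protein = {
    "protA": [i / 100 for i in range(50)],         # low scores
    "protB": [(50 + i) / 100 for i in range(50)],  # high scores
}
cutoff = disorder_cutoff(scores_by_protein)

# Flag residues at or above the cutoff, then average hypothetical rates.
flags_a = [s >= cutoff for s in scores_by_protein["protA"]]
flags_b = [s >= cutoff for s in scores_by_protein["protB"]]
rate_a = region_rate([1.0] * 50, flags_a)
rate_b = region_rate([1.0] * 50, flags_b)
```
]]></preformat>
<p>Because the cutoff is set across all proteins rather than per protein, a given protein may contribute many disordered residues or none, in which case its disordered-region rate is left undefined.</p>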
</sec>
<sec id="S4.SS6">
<title>Domains Assignment</title>
<p>To assign domains, we aligned profiles from Pfam-A (v27.0, May 2013) (<xref ref-type="bibr" rid="B4">Bateman et al., 2002</xref>; <xref ref-type="bibr" rid="B30">Finn et al., 2014</xref>) and SUPERFAMILY (v1.75, March 2013) (<xref ref-type="bibr" rid="B40">Gough, 2002</xref>; <xref ref-type="bibr" rid="B73">Oates et al., 2015</xref>) to reference proteome sequences, filtering out hits with an <italic>E</italic>-value above 10<sup>&#x2013;3</sup>. Finally, domain residues are those that were identified as part of a hit from either Pfam, SUPERFAMILY, or both.</p>
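<p>The domain assignment rule above amounts to taking the union of residues covered by retained hits, which can be sketched in Python as follows. The function name domain_residues, the hit format (start, end, E-value with 1-based inclusive coordinates), and the example hits are hypothetical; keeping hits whose E-value equals the threshold exactly is also our assumption.</p>

<preformat><![CDATA[
```python
def domain_residues(pfam_hits, superfamily_hits, max_evalue=1e-3):
    """Return the set of residue positions covered by any retained hit.

    Each hit is a (start, end, evalue) tuple with 1-based inclusive
    coordinates; this tuple layout is an assumption made for the sketch.
    """
    covered = set()
    for start, end, evalue in list(pfam_hits) + list(superfamily_hits):
        if evalue <= max_evalue:  # discard hits with E-value above threshold
            covered.update(range(start, end + 1))
    return covered


# Hypothetical hits for one protein.
pfam = [(1, 5, 1e-10), (20, 25, 0.5)]  # second hit fails the E-value filter
superfamily = [(4, 8, 1e-4)]           # overlaps the first Pfam hit
domains = domain_residues(pfam, superfamily)
```
]]></preformat>
<p>Using a set means that residues covered by both databases are counted once, consistent with defining domain residues as those found by either Pfam, SUPERFAMILY, or both.</p>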
</sec>
</sec>
<sec id="S5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author. Data used in this work are available on Figshare in a tabulated format: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.13738657">https://doi.org/10.6084/m9.figshare.13738657</ext-link>.</p>
</sec>
<sec id="S6">
<title>Author Contributions</title>
<p>BD and EL designed the analyses and experiments, analyzed the data, and wrote the manuscript. BD carried out the analyses. Both authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This work was supported by the Israel Science Foundation (grant No. 1452/18), by the European Research Council (ERC) under the European Union&#x2019;s Horizon 2020 research and innovation program (grant agreement No. 819318), by a research grant from A.-M. Boucher, by research grants from the Estelle Funk Foundation, the Estate of Fannie Sherr, the Estate of Albert Delighter, the Merle S. Cahn Foundation, Mildred S. Gosden, the Estate of Elizabeth Wachsman, and the Arnold Bortman Family Foundation.</p>
</fn>
</fn-group>
<ack>
<p>We thank H. Greenblatt for helping with the computer infrastructure and Tal Pupko for his advice.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Akashi</surname> <given-names>H.</given-names></name></person-group> (<year>2003</year>). <article-title>Translational selection and yeast proteome evolution.</article-title> <source><italic>Genetics</italic></source> <volume>164</volume> <fpage>1291</fpage>&#x2013;<lpage>1303</lpage>.</citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Armstrong</surname> <given-names>D. R.</given-names></name> <name><surname>Berrisford</surname> <given-names>J. M.</given-names></name> <name><surname>Conroy</surname> <given-names>M. J.</given-names></name> <name><surname>Gutmanas</surname> <given-names>A.</given-names></name> <name><surname>Anyango</surname> <given-names>S.</given-names></name> <name><surname>Choudhary</surname> <given-names>P.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>PDBe: improved findability of macromolecular structure data in the PDB.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>48</volume> <fpage>D335</fpage>&#x2013;<lpage>D343</lpage>.</citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Banani</surname> <given-names>S. F.</given-names></name> <name><surname>Lee</surname> <given-names>H. O.</given-names></name> <name><surname>Hyman</surname> <given-names>A. A.</given-names></name> <name><surname>Rosen</surname> <given-names>M. K.</given-names></name></person-group> (<year>2017</year>). <article-title>Biomolecular condensates: organizers of cellular biochemistry.</article-title> <source><italic>Nat. Rev. Mol. Cell Biol.</italic></source> <volume>18</volume> <fpage>285</fpage>&#x2013;<lpage>298</lpage>.</citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bateman</surname> <given-names>A.</given-names></name> <name><surname>Birney</surname> <given-names>E.</given-names></name> <name><surname>Cerruti</surname> <given-names>L.</given-names></name> <name><surname>Durbin</surname> <given-names>R.</given-names></name> <name><surname>Etwiller</surname> <given-names>L.</given-names></name> <name><surname>Eddy</surname> <given-names>S. R.</given-names></name><etal/></person-group> (<year>2002</year>). <article-title>The Pfam protein families database.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>30</volume> <fpage>276</fpage>&#x2013;<lpage>280</lpage>.</citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bellay</surname> <given-names>J.</given-names></name> <name><surname>Han</surname> <given-names>S.</given-names></name> <name><surname>Michaut</surname> <given-names>M.</given-names></name> <name><surname>Kim</surname> <given-names>T.</given-names></name> <name><surname>Costanzo</surname> <given-names>M.</given-names></name> <name><surname>Andrews</surname> <given-names>B. J.</given-names></name><etal/></person-group> (<year>2011</year>). <article-title>Bringing order to protein disorder through comparative genomics and genetic interactions.</article-title> <source><italic>Genome Biol.</italic></source> <volume>12</volume>:<fpage>R14</fpage>.</citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bloom</surname> <given-names>J. D.</given-names></name> <name><surname>Adami</surname> <given-names>C.</given-names></name></person-group> (<year>2004</year>). <article-title>Evolutionary rate depends on number of protein-protein interactions independently of gene expression level: response.</article-title> <source><italic>BMC Evol. Biol.</italic></source> <volume>4</volume>:<fpage>14</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2148-4-14</pub-id> <pub-id pub-id-type="pmid">15171796</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bloom</surname> <given-names>J. D.</given-names></name> <name><surname>Drummond</surname> <given-names>D. A.</given-names></name> <name><surname>Arnold</surname> <given-names>F. H.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2006</year>). <article-title>Structural determinants of the rate of protein evolution in yeast.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>23</volume> <fpage>1751</fpage>&#x2013;<lpage>1761</lpage>.</citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cherry</surname> <given-names>J. L.</given-names></name></person-group> (<year>2010</year>). <article-title>Expression level, evolutionary rate, and the cost of expression.</article-title> <source><italic>Genome Biol. Evol.</italic></source> <volume>2</volume> <fpage>757</fpage>&#x2013;<lpage>769</lpage>.</citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cherry</surname> <given-names>J. M.</given-names></name> <name><surname>Hong</surname> <given-names>E. L.</given-names></name> <name><surname>Amundsen</surname> <given-names>C.</given-names></name> <name><surname>Balakrishnan</surname> <given-names>R.</given-names></name> <name><surname>Binkley</surname> <given-names>G.</given-names></name> <name><surname>Chan</surname> <given-names>E. T.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title><italic>Saccharomyces</italic> genome database: the genomics resource of budding yeast.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>40</volume> <fpage>D700</fpage>&#x2013;<lpage>D705</lpage>.</citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1974</year>). <article-title>Hydrophobic bonding and accessible surface area in proteins.</article-title> <source><italic>Nature</italic></source> <volume>248</volume> <fpage>338</fpage>&#x2013;<lpage>339</lpage>.</citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1975</year>). <article-title>Structural invariants in protein folding.</article-title> <source><italic>Nature</italic></source> <volume>254</volume> <fpage>304</fpage>&#x2013;<lpage>308</lpage>.</citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1976</year>). <article-title>The nature of the accessible and buried surfaces in proteins.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>105</volume> <fpage>1</fpage>&#x2013;<lpage>12</lpage>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1992</year>). <article-title>Proteins. One thousand families for the molecular biologist.</article-title> <source><italic>Nature</italic></source> <volume>357</volume> <fpage>543</fpage>&#x2013;<lpage>544</lpage>.</citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chothia</surname> <given-names>C.</given-names></name> <name><surname>Lesk</surname> <given-names>A. M.</given-names></name></person-group> (<year>1986</year>). <article-title>The relation between the divergence of sequence and structure in proteins.</article-title> <source><italic>EMBO J.</italic></source> <volume>5</volume> <fpage>823</fpage>&#x2013;<lpage>826</lpage>.</citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chothia</surname> <given-names>C.</given-names></name> <name><surname>Lesk</surname> <given-names>A. M.</given-names></name></person-group> (<year>1987</year>). <article-title>The evolution of protein structures.</article-title> <source><italic>Cold Spring Harb. Symp. Quant. Biol.</italic></source> <volume>52</volume> <fpage>399</fpage>&#x2013;<lpage>405</lpage>.</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Conant</surname> <given-names>G. C.</given-names></name> <name><surname>Stadler</surname> <given-names>P. F.</given-names></name></person-group> (<year>2009</year>). <article-title>Solvent exposure imparts similar selective pressures across a range of yeast proteins.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>26</volume> <fpage>1155</fpage>&#x2013;<lpage>1161</lpage>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Creighton</surname> <given-names>T. E.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1989</year>). <article-title>Protein structure. Selecting buried residues.</article-title> <source><italic>Nature</italic></source> <volume>339</volume> <fpage>14</fpage>&#x2013;<lpage>15</lpage>.</citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Deeds</surname> <given-names>E. J.</given-names></name> <name><surname>Ashenberg</surname> <given-names>O.</given-names></name> <name><surname>Gerardin</surname> <given-names>J.</given-names></name> <name><surname>Shakhnovich</surname> <given-names>E. I.</given-names></name></person-group> (<year>2007</year>). <article-title>Robust protein&#x2013;protein interactions in crowded cellular environments.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>104</volume> <fpage>14952</fpage>&#x2013;<lpage>14957</lpage>.</citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dekel</surname> <given-names>E.</given-names></name> <name><surname>Alon</surname> <given-names>U.</given-names></name></person-group> (<year>2005</year>). <article-title>Optimality and evolutionary tuning of the expression level of a protein.</article-title> <source><italic>Nature</italic></source> <volume>436</volume> <fpage>588</fpage>&#x2013;<lpage>592</lpage>.</citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dignon</surname> <given-names>G. L.</given-names></name> <name><surname>Zheng</surname> <given-names>W.</given-names></name> <name><surname>Mittal</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>Simulation methods for liquid&#x2013;liquid phase separation of disordered proteins.</article-title> <source><italic>Curr. Opin. Chem. Eng.</italic></source> <volume>23</volume> <fpage>92</fpage>&#x2013;<lpage>98</lpage>.</citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doszt&#x00E1;nyi</surname> <given-names>Z.</given-names></name></person-group> (<year>2018</year>). <article-title>Prediction of protein disorder based on IUPred.</article-title> <source><italic>Protein Sci.</italic></source> <volume>27</volume> <fpage>331</fpage>&#x2013;<lpage>340</lpage>.</citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Drummond</surname> <given-names>D. A.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2008</year>). <article-title>Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution.</article-title> <source><italic>Cell</italic></source> <volume>134</volume> <fpage>341</fpage>&#x2013;<lpage>352</lpage>.</citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Drummond</surname> <given-names>D. A.</given-names></name> <name><surname>Bloom</surname> <given-names>J. D.</given-names></name> <name><surname>Adami</surname> <given-names>C.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name> <name><surname>Arnold</surname> <given-names>F. H.</given-names></name></person-group> (<year>2005</year>). <article-title>Why highly expressed proteins evolve slowly.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>102</volume> <fpage>14338</fpage>&#x2013;<lpage>14343</lpage>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dubreuil</surname> <given-names>B.</given-names></name> <name><surname>Matalon</surname> <given-names>O.</given-names></name> <name><surname>Levy</surname> <given-names>E. D.</given-names></name></person-group> (<year>2019</year>). <article-title>Protein abundance biases the amino acid composition of disordered regions to minimize non-functional interactions.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>431</volume> <fpage>4978</fpage>&#x2013;<lpage>4992</lpage>.</citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dyson</surname> <given-names>H. J.</given-names></name> <name><surname>Wright</surname> <given-names>P. E.</given-names></name></person-group> (<year>2005</year>). <article-title>Intrinsically unstructured proteins and their functions.</article-title> <source><italic>Nat. Rev. Mol. Cell Biol.</italic></source> <volume>6</volume> <fpage>197</fpage>&#x2013;<lpage>208</lpage>.</citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Echave</surname> <given-names>J.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2017</year>). <article-title>Biophysical models of protein evolution: understanding the patterns of evolutionary sequence divergence.</article-title> <source><italic>Annu. Rev. Biophys.</italic></source> <volume>46</volume> <fpage>85</fpage>&#x2013;<lpage>103</lpage>.</citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Echave</surname> <given-names>J.</given-names></name> <name><surname>Jackson</surname> <given-names>E. L.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2015</year>). <article-title>Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites.</article-title> <source><italic>Phys. Biol.</italic></source> <volume>12</volume>:<fpage>025002</fpage>.</citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Echave</surname> <given-names>J.</given-names></name> <name><surname>Spielman</surname> <given-names>S. J.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2016</year>). <article-title>Causes of evolutionary rate variation among protein sites.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>17</volume> <fpage>109</fpage>&#x2013;<lpage>121</lpage>.</citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname> <given-names>R. C.</given-names></name></person-group> (<year>2004</year>). <article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>32</volume> <fpage>1792</fpage>&#x2013;<lpage>1797</lpage>.</citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Finn</surname> <given-names>R. D.</given-names></name> <name><surname>Bateman</surname> <given-names>A.</given-names></name> <name><surname>Clements</surname> <given-names>J.</given-names></name> <name><surname>Coggill</surname> <given-names>P.</given-names></name> <name><surname>Eberhardt</surname> <given-names>R. Y.</given-names></name> <name><surname>Eddy</surname> <given-names>S. R.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Pfam: the protein families database.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>42</volume> <fpage>D222</fpage>&#x2013;<lpage>D230</lpage>.</citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Foy</surname> <given-names>S. G.</given-names></name> <name><surname>Wilson</surname> <given-names>B. A.</given-names></name> <name><surname>Bertram</surname> <given-names>J.</given-names></name> <name><surname>Cordes</surname> <given-names>M. H. J.</given-names></name> <name><surname>Masel</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>A shift in aggregation avoidance strategy marks a long-term direction to protein evolution.</article-title> <source><italic>Genetics</italic></source> <volume>211</volume> <fpage>1345</fpage>&#x2013;<lpage>1355</lpage>.</citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Franzosa</surname> <given-names>E. A.</given-names></name> <name><surname>Xia</surname> <given-names>Y.</given-names></name></person-group> (<year>2009</year>). <article-title>Structural determinants of protein evolution are context-sensitive at the residue level.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>26</volume> <fpage>2387</fpage>&#x2013;<lpage>2395</lpage>.</citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fraser</surname> <given-names>H. B.</given-names></name> <name><surname>Hirsh</surname> <given-names>A. E.</given-names></name></person-group> (<year>2004</year>). <article-title>Evolutionary rate depends on number of protein-protein interactions independently of gene expression level.</article-title> <source><italic>BMC Evol. Biol.</italic></source> <volume>4</volume>:<fpage>13</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2148-4-13</pub-id> <pub-id pub-id-type="pmid">15165289</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fraser</surname> <given-names>H. B.</given-names></name> <name><surname>Hirsh</surname> <given-names>A. E.</given-names></name> <name><surname>Steinmetz</surname> <given-names>L. M.</given-names></name> <name><surname>Scharfe</surname> <given-names>C.</given-names></name> <name><surname>Feldman</surname> <given-names>M. W.</given-names></name></person-group> (<year>2002</year>). <article-title>Evolutionary rate in the protein interaction network.</article-title> <source><italic>Science</italic></source> <volume>296</volume> <fpage>750</fpage>&#x2013;<lpage>752</lpage>.</citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Galea</surname> <given-names>C. A.</given-names></name> <name><surname>Nourse</surname> <given-names>A.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Sivakolundu</surname> <given-names>S. G.</given-names></name> <name><surname>Heller</surname> <given-names>W. T.</given-names></name> <name><surname>Kriwacki</surname> <given-names>R. W.</given-names></name></person-group> (<year>2008</year>). <article-title>Role of intrinsic flexibility in signal transduction mediated by the cell cycle regulator, p27 Kip1.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>376</volume> <fpage>827</fpage>&#x2013;<lpage>838</lpage>.</citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garcia-Seisdedos</surname> <given-names>H.</given-names></name> <name><surname>Empereur-Mot</surname> <given-names>C.</given-names></name> <name><surname>Elad</surname> <given-names>N.</given-names></name> <name><surname>Levy</surname> <given-names>E. D.</given-names></name></person-group> (<year>2017</year>). <article-title>Proteins evolve on the edge of supramolecular self-assembly.</article-title> <source><italic>Nature</italic></source> <volume>548</volume> <fpage>244</fpage>&#x2013;<lpage>247</lpage>.</citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garcia-Seisdedos</surname> <given-names>H.</given-names></name> <name><surname>Villegas</surname> <given-names>J. A.</given-names></name> <name><surname>Levy</surname> <given-names>E. D.</given-names></name></person-group> (<year>2018</year>). <article-title>Infinite assembly of folded proteins in evolution, disease, and engineering.</article-title> <source><italic>Angew. Chem. Int. Ed. Engl.</italic></source> <volume>58</volume> <fpage>5514</fpage>&#x2013;<lpage>5531</lpage>. <pub-id pub-id-type="doi">10.1002/anie.201806092</pub-id> <pub-id pub-id-type="pmid">30133878</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goldman</surname> <given-names>N.</given-names></name> <name><surname>Thorne</surname> <given-names>J. L.</given-names></name> <name><surname>Jones</surname> <given-names>D. T.</given-names></name></person-group> (<year>1998</year>). <article-title>Assessing the impact of secondary structure and solvent accessibility on protein evolution.</article-title> <source><italic>Genetics</italic></source> <volume>149</volume> <fpage>445</fpage>&#x2013;<lpage>458</lpage>.</citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goldstein</surname> <given-names>R. A.</given-names></name></person-group> (<year>2008</year>). <article-title>The structure of protein evolution and the evolution of protein structure.</article-title> <source><italic>Curr. Opin. Struct. Biol.</italic></source> <volume>18</volume> <fpage>170</fpage>&#x2013;<lpage>177</lpage>.</citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gough</surname> <given-names>J.</given-names></name></person-group> (<year>2002</year>). <article-title>The SUPERFAMILY database in structural genomics.</article-title> <source><italic>Acta Crystallogr. D Biol. Crystallogr.</italic></source> <volume>58</volume> <fpage>1897</fpage>&#x2013;<lpage>1900</lpage>.</citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gough</surname> <given-names>J.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>2002</year>). <article-title>SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>30</volume> <fpage>268</fpage>&#x2013;<lpage>272</lpage>.</citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gout</surname> <given-names>J.-F.</given-names></name> <name><surname>Kahn</surname> <given-names>D.</given-names></name> <name><surname>Duret</surname> <given-names>L.</given-names></name></person-group> <collab>Paramecium Post-Genomics Consortium</collab> (<year>2010</year>). <article-title>The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>6</volume>:<fpage>e1000944</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1000944</pub-id> <pub-id pub-id-type="pmid">20485561</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>H. H.</given-names></name> <name><surname>Choe</surname> <given-names>J.</given-names></name> <name><surname>Loeb</surname> <given-names>L. A.</given-names></name></person-group> (<year>2004</year>). <article-title>Protein tolerance to random amino acid change.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>101</volume> <fpage>9205</fpage>&#x2013;<lpage>9210</lpage>.</citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hahn</surname> <given-names>M. W.</given-names></name> <name><surname>Kern</surname> <given-names>A. D.</given-names></name></person-group> (<year>2005</year>). <article-title>Comparative genomics of centrality and essentiality in three eukaryotic protein-interaction networks.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>22</volume> <fpage>803</fpage>&#x2013;<lpage>806</lpage>.</citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hirsh</surname> <given-names>A. E.</given-names></name> <name><surname>Fraser</surname> <given-names>H. B.</given-names></name></person-group> (<year>2001</year>). <article-title>Protein dispensability and rate of evolution.</article-title> <source><italic>Nature</italic></source> <volume>411</volume> <fpage>1046</fpage>&#x2013;<lpage>1049</lpage>.</citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hurst</surname> <given-names>L. D.</given-names></name> <name><surname>Smith</surname> <given-names>N. G.</given-names></name></person-group> (<year>1999</year>). <article-title>Do essential genes evolve slowly?</article-title> <source><italic>Curr. Biol.</italic></source> <volume>9</volume> <fpage>747</fpage>&#x2013;<lpage>750</lpage>.</citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jordan</surname> <given-names>I. K.</given-names></name> <name><surname>Rogozin</surname> <given-names>I. B.</given-names></name> <name><surname>Wolf</surname> <given-names>Y. I.</given-names></name> <name><surname>Koonin</surname> <given-names>E. V.</given-names></name></person-group> (<year>2002</year>). <article-title>Essential genes are more evolutionarily conserved than are nonessential genes in bacteria.</article-title> <source><italic>Genome Res.</italic></source> <volume>12</volume> <fpage>962</fpage>&#x2013;<lpage>968</lpage>.</citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kauzmann</surname> <given-names>W.</given-names></name></person-group> (<year>1959</year>). &#x201C;<article-title>Some factors in the interpretation of protein denaturation</article-title>,&#x201D; in <source><italic>Advances in Protein Chemistry</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Anfinsen</surname> <given-names>C. B.</given-names></name> <name><surname>Anson</surname> <given-names>M. L.</given-names></name> <name><surname>Bailey</surname> <given-names>K.</given-names></name> <name><surname>Edsall</surname> <given-names>J. T.</given-names></name></person-group> (<publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>Academic Press</publisher-name>), <fpage>1</fpage>&#x2013;<lpage>63</lpage>.</citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>P. M.</given-names></name> <name><surname>Lu</surname> <given-names>L. J.</given-names></name> <name><surname>Xia</surname> <given-names>Y.</given-names></name> <name><surname>Gerstein</surname> <given-names>M. B.</given-names></name></person-group> (<year>2006</year>). <article-title>Relating three-dimensional structures to protein networks provides evolutionary insights.</article-title> <source><italic>Science</italic></source> <volume>314</volume> <fpage>1938</fpage>&#x2013;<lpage>1941</lpage>. <pub-id pub-id-type="doi">10.1126/science.1136174</pub-id> <pub-id pub-id-type="pmid">17185604</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Knowles</surname> <given-names>T. P.</given-names></name> <name><surname>Vendruscolo</surname> <given-names>M.</given-names></name> <name><surname>Dobson</surname> <given-names>C. M.</given-names></name></person-group> (<year>2014</year>). <article-title>The amyloid state and its association with protein misfolding diseases.</article-title> <source><italic>Nat. Rev. Mol. Cell Biol.</italic></source> <volume>15</volume> <fpage>384</fpage>&#x2013;<lpage>396</lpage>.</citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koshi</surname> <given-names>J. M.</given-names></name> <name><surname>Goldstein</surname> <given-names>R. A.</given-names></name></person-group> (<year>1995</year>). <article-title>Context-dependent optimal substitution matrices.</article-title> <source><italic>Protein Eng. Des. Sel.</italic></source> <volume>8</volume> <fpage>641</fpage>&#x2013;<lpage>645</lpage>.</citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krylov</surname> <given-names>D. M.</given-names></name> <name><surname>Wolf</surname> <given-names>Y. I.</given-names></name> <name><surname>Rogozin</surname> <given-names>I. B.</given-names></name> <name><surname>Koonin</surname> <given-names>E. V.</given-names></name></person-group> (<year>2003</year>). <article-title>Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution.</article-title> <source><italic>Genome Res.</italic></source> <volume>13</volume> <fpage>2229</fpage>&#x2013;<lpage>2235</lpage>.</citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kyte</surname> <given-names>J.</given-names></name> <name><surname>Doolittle</surname> <given-names>R. F.</given-names></name></person-group> (<year>1982</year>). <article-title>A simple method for displaying the hydropathic character of a protein.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>157</volume> <fpage>105</fpage>&#x2013;<lpage>132</lpage>.</citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Landry</surname> <given-names>C. R.</given-names></name> <name><surname>Levy</surname> <given-names>E. D.</given-names></name> <name><surname>Michnick</surname> <given-names>S. W.</given-names></name></person-group> (<year>2009</year>). <article-title>Weak functional constraints on phosphoproteomes.</article-title> <source><italic>Trends Genet.</italic></source> <volume>25</volume> <fpage>193</fpage>&#x2013;<lpage>197</lpage>. <pub-id pub-id-type="doi">10.1016/j.tig.2009.03.003</pub-id> <pub-id pub-id-type="pmid">19349092</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>B.</given-names></name> <name><surname>Richards</surname> <given-names>F. M.</given-names></name></person-group> (<year>1971</year>). <article-title>The interpretation of protein structures: estimation of static accessibility.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>55</volume> <fpage>379</fpage>&#x2013;<lpage>400</lpage>.</citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lesk</surname> <given-names>A. M.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1980</year>). <article-title>How different amino acid sequences determine similar protein structures: the structure and evolutionary dynamics of the globins.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>136</volume> <fpage>225</fpage>&#x2013;<lpage>270</lpage>.</citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levy</surname> <given-names>E. D.</given-names></name></person-group> (<year>2010</year>). <article-title>A simple definition of structural regions in proteins and its use in analyzing interface evolution.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>403</volume> <fpage>660</fpage>&#x2013;<lpage>670</lpage>.</citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levy</surname> <given-names>E. D.</given-names></name> <name><surname>De</surname> <given-names>S.</given-names></name> <name><surname>Teichmann</surname> <given-names>S. A.</given-names></name></person-group> (<year>2012</year>). <article-title>Cellular crowding imposes global constraints on the chemistry and evolution of proteomes.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>109</volume> <fpage>20461</fpage>&#x2013;<lpage>20466</lpage>.</citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levy</surname> <given-names>E. D.</given-names></name> <name><surname>Landry</surname> <given-names>C. R.</given-names></name> <name><surname>Michnick</surname> <given-names>S. W.</given-names></name></person-group> (<year>2009</year>). <article-title>How perfect can protein interactomes be?</article-title> <source><italic>Sci. Signal.</italic></source> <volume>2</volume>:<fpage>pe11</fpage>.</citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levy</surname> <given-names>E. D.</given-names></name> <name><surname>Pereira-Leal</surname> <given-names>J. B.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name> <name><surname>Teichmann</surname> <given-names>S. A.</given-names></name></person-group> (<year>2006</year>). <article-title>3D complex: a structural classification of protein complexes.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>2</volume>:<fpage>e155</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.0020155</pub-id> <pub-id pub-id-type="pmid">17112313</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>B.-Y.</given-names></name> <name><surname>Scott</surname> <given-names>N. M.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name></person-group> (<year>2006</year>). <article-title>Impacts of gene essentiality, expression pattern, and gene compactness on the evolutionary rate of mammalian proteins.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>23</volume> <fpage>2072</fpage>&#x2013;<lpage>2080</lpage>.</citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liberles</surname> <given-names>D. A.</given-names></name> <name><surname>Teichmann</surname> <given-names>S. A.</given-names></name> <name><surname>Bahar</surname> <given-names>I.</given-names></name> <name><surname>Bastolla</surname> <given-names>U.</given-names></name> <name><surname>Bloom</surname> <given-names>J.</given-names></name> <name><surname>Bornberg-Bauer</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>The interface of protein structure, protein biophysics, and molecular evolution.</article-title> <source><italic>Protein Sci.</italic></source> <volume>21</volume> <fpage>769</fpage>&#x2013;<lpage>785</lpage>.</citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liberles</surname> <given-names>D. A.</given-names></name> <name><surname>Tisdell</surname> <given-names>M. D. M.</given-names></name> <name><surname>Grahnen</surname> <given-names>J. A.</given-names></name></person-group> (<year>2011</year>). <article-title>Binding constraints on the evolution of enzymes and signalling proteins: the important role of negative pleiotropy.</article-title> <source><italic>Proc. Biol. Sci.</italic></source> <volume>278</volume> <fpage>1930</fpage>&#x2013;<lpage>1935</lpage>.</citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lim</surname> <given-names>W. A.</given-names></name> <name><surname>Sauer</surname> <given-names>R. T.</given-names></name></person-group> (<year>1989</year>). <article-title>Alternative packing arrangements in the hydrophobic core of lambda repressor.</article-title> <source><italic>Nature</italic></source> <volume>339</volume> <fpage>31</fpage>&#x2013;<lpage>36</lpage>.</citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>Y.-S.</given-names></name> <name><surname>Hsu</surname> <given-names>W.-L.</given-names></name> <name><surname>Hwang</surname> <given-names>J.-K.</given-names></name> <name><surname>Li</surname> <given-names>W.-H.</given-names></name></person-group> (<year>2007</year>). <article-title>Proportion of solvent-exposed amino acids in a protein and rate of protein evolution.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>24</volume> <fpage>1005</fpage>&#x2013;<lpage>1011</lpage>.</citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>J.-W.</given-names></name> <name><surname>Lin</surname> <given-names>J.-J.</given-names></name> <name><surname>Cheng</surname> <given-names>C.-W.</given-names></name> <name><surname>Lin</surname> <given-names>Y.-F.</given-names></name> <name><surname>Hwang</surname> <given-names>J.-K.</given-names></name> <name><surname>Huang</surname> <given-names>T.-T.</given-names></name></person-group> (<year>2017</year>). <article-title>On the relationship between residue structural environment and sequence conservation in proteins.</article-title> <source><italic>Proteins</italic></source> <volume>85</volume> <fpage>1713</fpage>&#x2013;<lpage>1723</lpage>.</citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lopez-Bigas</surname> <given-names>N.</given-names></name> <name><surname>De</surname> <given-names>S.</given-names></name> <name><surname>Teichmann</surname> <given-names>S. A.</given-names></name></person-group> (<year>2008</year>). <article-title>Functional protein divergence in the evolution of Homo sapiens.</article-title> <source><italic>Genome Biol.</italic></source> <volume>9</volume>:<fpage>R33</fpage>.</citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Macossay-Castillo</surname> <given-names>M.</given-names></name> <name><surname>Marvelli</surname> <given-names>G.</given-names></name> <name><surname>Guharoy</surname> <given-names>M.</given-names></name> <name><surname>Jain</surname> <given-names>A.</given-names></name> <name><surname>Kihara</surname> <given-names>D.</given-names></name> <name><surname>Tompa</surname> <given-names>P.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>The balancing act of intrinsically disordered proteins: enabling functional diversity while minimizing promiscuity.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>431</volume> <fpage>1650</fpage>&#x2013;<lpage>1670</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2019.03.008</pub-id> <pub-id pub-id-type="pmid">30878482</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mehlhoff</surname> <given-names>J. D.</given-names></name> <name><surname>Stearns</surname> <given-names>F. W.</given-names></name> <name><surname>Rohm</surname> <given-names>D.</given-names></name> <name><surname>Wang</surname> <given-names>B.</given-names></name> <name><surname>Tsou</surname> <given-names>E.-Y.</given-names></name> <name><surname>Dutta</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Collateral fitness effects of mutations.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>117</volume> <fpage>11597</fpage>&#x2013;<lpage>11607</lpage>.</citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>M&#x00E9;sz&#x00E1;ros</surname> <given-names>B.</given-names></name> <name><surname>Simon</surname> <given-names>I.</given-names></name> <name><surname>Doszt&#x00E1;nyi</surname> <given-names>Z.</given-names></name></person-group> (<year>2009</year>). <article-title>Prediction of protein binding regions in disordered proteins.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>5</volume>:<fpage>e1000376</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1000376</pub-id> <pub-id pub-id-type="pmid">19412530</pub-id></citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Miller</surname> <given-names>S.</given-names></name> <name><surname>Janin</surname> <given-names>J.</given-names></name> <name><surname>Lesk</surname> <given-names>A. M.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1987</year>). <article-title>Interior and surface of monomeric proteins.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>196</volume> <fpage>641</fpage>&#x2013;<lpage>656</lpage>.</citation></ref>
<ref id="B72"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Murzin</surname> <given-names>A. G.</given-names></name> <name><surname>Brenner</surname> <given-names>S. E.</given-names></name> <name><surname>Hubbard</surname> <given-names>T.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>1995</year>). <article-title>SCOP: a structural classification of proteins database for the investigation of sequences and structures.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>247</volume> <fpage>536</fpage>&#x2013;<lpage>540</lpage>.</citation></ref>
<ref id="B73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oates</surname> <given-names>M. E.</given-names></name> <name><surname>Stahlhacke</surname> <given-names>J.</given-names></name> <name><surname>Vavoulis</surname> <given-names>D. V.</given-names></name></person-group> (<year>2015</year>). <article-title>The SUPERFAMILY 1.75 database in 2014: a doubling of data.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>43</volume> <fpage>D227</fpage>&#x2013;<lpage>D233</lpage>.</citation></ref>
<ref id="B74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>P&#x00E1;l</surname> <given-names>C.</given-names></name> <name><surname>Papp</surname> <given-names>B.</given-names></name> <name><surname>Hurst</surname> <given-names>L. D.</given-names></name></person-group> (<year>2001</year>). <article-title>Highly expressed genes in yeast evolve slowly.</article-title> <source><italic>Genetics</italic></source> <volume>158</volume> <fpage>927</fpage>&#x2013;<lpage>931</lpage>.</citation></ref>
<ref id="B75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>P&#x00E1;l</surname> <given-names>C.</given-names></name> <name><surname>Papp</surname> <given-names>B.</given-names></name> <name><surname>Lercher</surname> <given-names>M. J.</given-names></name></person-group> (<year>2006</year>). <article-title>An integrated view of protein evolution.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>7</volume> <fpage>337</fpage>&#x2013;<lpage>348</lpage>.</citation></ref>
<ref id="B76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Park</surname> <given-names>C.</given-names></name> <name><surname>Chen</surname> <given-names>X.</given-names></name> <name><surname>Yang</surname> <given-names>J.-R.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>Differential requirements for mRNA folding partially explain why highly expressed proteins evolve slowly.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>110</volume> <fpage>E678</fpage>&#x2013;<lpage>E686</lpage>.</citation></ref>
<ref id="B77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pechmann</surname> <given-names>S.</given-names></name> <name><surname>Frydman</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding.</article-title> <source><italic>Nat. Struct. Mol. Biol.</italic></source> <volume>20</volume> <fpage>237</fpage>&#x2013;<lpage>243</lpage>.</citation></ref>
<ref id="B78"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Plata</surname> <given-names>G.</given-names></name> <name><surname>Vitkup</surname> <given-names>D.</given-names></name></person-group> (<year>2018</year>). <article-title>Protein stability and avoidance of toxic misfolding do not explain the sequence constraints of highly expressed proteins.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>35</volume> <fpage>700</fpage>&#x2013;<lpage>703</lpage>.</citation></ref>
<ref id="B79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Plata</surname> <given-names>G.</given-names></name> <name><surname>Gottesman</surname> <given-names>M. E.</given-names></name> <name><surname>Vitkup</surname> <given-names>D.</given-names></name></person-group> (<year>2010</year>). <article-title>The rate of the molecular clock and the cost of gratuitous protein synthesis.</article-title> <source><italic>Genome Biol.</italic></source> <volume>11</volume>:<fpage>R98</fpage>.</citation></ref>
<ref id="B80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Popescu</surname> <given-names>C. E.</given-names></name> <name><surname>Borza</surname> <given-names>T.</given-names></name> <name><surname>Bielawski</surname> <given-names>J. P.</given-names></name> <name><surname>Lee</surname> <given-names>R. W.</given-names></name></person-group> (<year>2006</year>). <article-title>Evolutionary rates and expression level in <italic>Chlamydomonas</italic>.</article-title> <source><italic>Genetics</italic></source> <volume>172</volume> <fpage>1567</fpage>&#x2013;<lpage>1576</lpage>.</citation></ref>
<ref id="B81"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pupko</surname> <given-names>T.</given-names></name> <name><surname>Bell</surname> <given-names>R. E.</given-names></name> <name><surname>Mayrose</surname> <given-names>I.</given-names></name> <name><surname>Glaser</surname> <given-names>F.</given-names></name> <name><surname>Ben-Tal</surname> <given-names>N.</given-names></name></person-group> (<year>2002</year>). <article-title>Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues.</article-title> <source><italic>Bioinformatics</italic></source> <volume>18</volume> <issue>(Suppl. 1)</issue> <fpage>S71</fpage>&#x2013;<lpage>S77</lpage>.</citation></ref>
<ref id="B82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Razban</surname> <given-names>R. M.</given-names></name></person-group> (<year>2019</year>). <article-title>Protein melting temperature cannot fully assess whether protein folding free energy underlies the universal abundance&#x2013;evolutionary rate correlation seen in proteins.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>36</volume> <fpage>1955</fpage>&#x2013;<lpage>1963</lpage>.</citation></ref>
<ref id="B83"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rocha</surname> <given-names>E. P.</given-names></name> <name><surname>Danchin</surname> <given-names>A.</given-names></name></person-group> (<year>2004</year>). <article-title>An analysis of determinants of amino acids substitution rates in bacterial proteins.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>21</volume> <fpage>108</fpage>&#x2013;<lpage>116</lpage>.</citation></ref>
<ref id="B84"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rose</surname> <given-names>P. W.</given-names></name> <name><surname>Prlic</surname> <given-names>A.</given-names></name> <name><surname>Altunkaya</surname> <given-names>A.</given-names></name> <name><surname>Bi</surname> <given-names>C.</given-names></name> <name><surname>Bradley</surname> <given-names>A. R.</given-names></name> <name><surname>Christie</surname> <given-names>C. H.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>The RCSB protein data bank: integrative view of protein, gene and 3D structural information.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>45</volume> <fpage>D271</fpage>&#x2013;<lpage>D281</lpage>.</citation></ref>
<ref id="B85"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Russo</surname> <given-names>A. A.</given-names></name> <name><surname>Jeffrey</surname> <given-names>P. D.</given-names></name> <name><surname>Patten</surname> <given-names>A. K.</given-names></name> <name><surname>Massagu&#x00E9;</surname> <given-names>J.</given-names></name> <name><surname>Pavletich</surname> <given-names>N. P.</given-names></name></person-group> (<year>1996</year>). <article-title>Crystal structure of the p27Kip1 cyclin-dependent-kinase inhibitor bound to the cyclin A-Cdk2 complex.</article-title> <source><italic>Nature</italic></source> <volume>382</volume> <fpage>325</fpage>&#x2013;<lpage>331</lpage>.</citation></ref>
<ref id="B86"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>S&#x00E4;llstr&#x00F6;m</surname> <given-names>B.</given-names></name> <name><surname>Arnaout</surname> <given-names>R. A.</given-names></name> <name><surname>Davids</surname> <given-names>W.</given-names></name> <name><surname>Bjelkmar</surname> <given-names>P.</given-names></name> <name><surname>Andersson</surname> <given-names>S. G. E.</given-names></name></person-group> (<year>2006</year>). <article-title>Protein evolutionary rates correlate with expression independently of synonymous substitutions in <italic>Helicobacter pylori</italic>.</article-title> <source><italic>J. Mol. Evol.</italic></source> <volume>62</volume> <fpage>600</fpage>&#x2013;<lpage>614</lpage>.</citation></ref>
<ref id="B87"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sasidharan</surname> <given-names>R.</given-names></name> <name><surname>Chothia</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>The selection of acceptable protein mutations.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>104</volume> <fpage>10080</fpage>&#x2013;<lpage>10085</lpage>.</citation></ref>
<ref id="B88"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmid</surname> <given-names>A. B.</given-names></name> <name><surname>Lagleder</surname> <given-names>S.</given-names></name> <name><surname>Gr&#x00E4;wert</surname> <given-names>M. A.</given-names></name> <name><surname>R&#x00F6;hl</surname> <given-names>A.</given-names></name> <name><surname>Hagn</surname> <given-names>F.</given-names></name> <name><surname>Wandinger</surname> <given-names>S. K.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>The architecture of functional modules in the Hsp90 co-chaperone Sti1/Hop.</article-title> <source><italic>EMBO J.</italic></source> <volume>31</volume> <fpage>1506</fpage>&#x2013;<lpage>1517</lpage>.</citation></ref>
<ref id="B89"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serohijos</surname> <given-names>A. W. R.</given-names></name> <name><surname>Lee</surname> <given-names>S. Y. R.</given-names></name> <name><surname>Shakhnovich</surname> <given-names>E. I.</given-names></name></person-group> (<year>2013</year>). <article-title>Highly abundant proteins favor more stable 3D structures in yeast.</article-title> <source><italic>Biophys. J.</italic></source> <volume>104</volume> <fpage>L1</fpage>&#x2013;<lpage>L3</lpage>.</citation></ref>
<ref id="B90"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serohijos</surname> <given-names>A. W. R.</given-names></name> <name><surname>Rimas</surname> <given-names>Z.</given-names></name> <name><surname>Shakhnovich</surname> <given-names>E. I.</given-names></name></person-group> (<year>2012</year>). <article-title>Protein biophysics explains why highly abundant proteins evolve slowly.</article-title> <source><italic>Cell Rep.</italic></source> <volume>2</volume> <fpage>249</fpage>&#x2013;<lpage>256</lpage>.</citation></ref>
<ref id="B91"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shahmoradi</surname> <given-names>A.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2016</year>). <article-title>Dissecting the roles of local packing density and longer-range effects in protein sequence evolution.</article-title> <source><italic>Proteins Struct. Funct. Bioinf.</italic></source> <volume>84</volume> <fpage>841</fpage>&#x2013;<lpage>854</lpage>.</citation></ref>
<ref id="B92"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shakhnovich</surname> <given-names>B. E.</given-names></name> <name><surname>Deeds</surname> <given-names>E.</given-names></name> <name><surname>Delisi</surname> <given-names>C.</given-names></name> <name><surname>Shakhnovich</surname> <given-names>E.</given-names></name></person-group> (<year>2005</year>). <article-title>Protein structure and evolutionary history determine sequence space topology.</article-title> <source><italic>Genome Res.</italic></source> <volume>15</volume> <fpage>385</fpage>&#x2013;<lpage>392</lpage>.</citation></ref>
<ref id="B93"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shrake</surname> <given-names>A.</given-names></name> <name><surname>Rupley</surname> <given-names>J. A.</given-names></name></person-group> (<year>1973</year>). <article-title>Environment and exposure to solvent of protein atoms. Lysozyme and insulin.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>79</volume> <fpage>351</fpage>&#x2013;<lpage>371</lpage>.</citation></ref>
<ref id="B94"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sikosek</surname> <given-names>T.</given-names></name> <name><surname>Chan</surname> <given-names>H. S.</given-names></name></person-group> (<year>2014</year>). <article-title>Biophysics of protein evolution and evolutionary protein biophysics.</article-title> <source><italic>J. R. Soc. Interface</italic></source> <volume>11</volume>:<fpage>20140419</fpage>.</citation></ref>
<ref id="B95"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spielman</surname> <given-names>S. J.</given-names></name> <name><surname>Wilke</surname> <given-names>C. O.</given-names></name></person-group> (<year>2016</year>). <article-title>Extensively parameterized mutation&#x2013;selection models reliably capture site-specific selective constraint.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>33</volume> <fpage>2990</fpage>&#x2013;<lpage>3002</lpage>.</citation></ref>
<ref id="B96"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stergachis</surname> <given-names>A. B.</given-names></name> <name><surname>Haugen</surname> <given-names>E.</given-names></name> <name><surname>Shafer</surname> <given-names>A.</given-names></name> <name><surname>Fu</surname> <given-names>W.</given-names></name> <name><surname>Vernot</surname> <given-names>B.</given-names></name> <name><surname>Reynolds</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Exonic transcription factor binding directs codon choice and affects protein evolution.</article-title> <source><italic>Science</italic></source> <volume>342</volume> <fpage>1367</fpage>&#x2013;<lpage>1372</lpage>.</citation></ref>
<ref id="B97"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stutz</surname> <given-names>A.</given-names></name> <name><surname>Bairoch</surname> <given-names>A.</given-names></name> <name><surname>Estreicher</surname> <given-names>A.</given-names></name></person-group> (<year>2006</year>). <article-title>UniProtKB/Swiss-Prot: the protein sequence knowledgebase.</article-title> <source><italic>FEBS J.</italic></source> <volume>273</volume> <fpage>62</fpage>&#x2013;<lpage>62</lpage>.</citation></ref>
<ref id="B98"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Subramanian</surname> <given-names>S.</given-names></name> <name><surname>Kumar</surname> <given-names>S.</given-names></name></person-group> (<year>2004</year>). <article-title>Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome.</article-title> <source><italic>Genetics</italic></source> <volume>168</volume> <fpage>373</fpage>&#x2013;<lpage>381</lpage>.</citation></ref>
<ref id="B99"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thompson</surname> <given-names>J. D.</given-names></name> <name><surname>Higgins</surname> <given-names>D. G.</given-names></name> <name><surname>Gibson</surname> <given-names>T. J.</given-names></name></person-group> (<year>1994</year>). <article-title>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>22</volume> <fpage>4673</fpage>&#x2013;<lpage>4680</lpage>. <pub-id pub-id-type="doi">10.1093/nar/22.22.4673</pub-id> <pub-id pub-id-type="pmid">7984417</pub-id></citation></ref>
<ref id="B100"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tokuriki</surname> <given-names>N.</given-names></name> <name><surname>Stricher</surname> <given-names>F.</given-names></name> <name><surname>Schymkowitz</surname> <given-names>J.</given-names></name> <name><surname>Serrano</surname> <given-names>L.</given-names></name> <name><surname>Tawfik</surname> <given-names>D. S.</given-names></name></person-group> (<year>2007</year>). <article-title>The stability effects of protein mutations appear to be universally distributed.</article-title> <source><italic>J. Mol. Biol.</italic></source> <volume>369</volume> <fpage>1318</fpage>&#x2013;<lpage>1332</lpage>.</citation></ref>
<ref id="B101"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tompa</surname> <given-names>P.</given-names></name></person-group> (<year>2005</year>). <article-title>The interplay between structure and function in intrinsically unstructured proteins.</article-title> <source><italic>FEBS Lett.</italic></source> <volume>579</volume> <fpage>3346</fpage>&#x2013;<lpage>3354</lpage>.</citation></ref>
<ref id="B102"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>T&#x00F3;th-Petr&#x00F3;czy</surname> <given-names>A.</given-names></name> <name><surname>Tawfik</surname> <given-names>D. S.</given-names></name></person-group> (<year>2011</year>). <article-title>Slow protein evolutionary rates are dictated by surface-core association.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>108</volume> <fpage>11151</fpage>&#x2013;<lpage>11156</lpage>.</citation></ref>
<ref id="B103"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tuller</surname> <given-names>T.</given-names></name> <name><surname>Carmi</surname> <given-names>A.</given-names></name> <name><surname>Vestsigian</surname> <given-names>K.</given-names></name> <name><surname>Navon</surname> <given-names>S.</given-names></name> <name><surname>Dorfan</surname> <given-names>Y.</given-names></name> <name><surname>Zaborske</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>An evolutionarily conserved mechanism for controlling the efficiency of protein translation.</article-title> <source><italic>Cell</italic></source> <volume>141</volume> <fpage>344</fpage>&#x2013;<lpage>354</lpage>.</citation></ref>
<ref id="B104"><citation citation-type="journal"><collab>UniProt Consortium</collab> (<year>2019</year>). <article-title>UniProt: a worldwide hub of protein knowledge.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>47</volume> <fpage>D506</fpage>&#x2013;<lpage>D515</lpage>.</citation></ref>
<ref id="B105"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uversky</surname> <given-names>V. N.</given-names></name> <name><surname>Dunker</surname> <given-names>A. K.</given-names></name></person-group> (<year>2010</year>). <article-title>Understanding protein non-folding.</article-title> <source><italic>Biochim. Biophys. Acta</italic></source> <volume>1804</volume> <fpage>1231</fpage>&#x2013;<lpage>1264</lpage>.</citation></ref>
<ref id="B106"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vacic</surname> <given-names>V.</given-names></name> <name><surname>Oldfield</surname> <given-names>C. J.</given-names></name> <name><surname>Mohan</surname> <given-names>A.</given-names></name> <name><surname>Radivojac</surname> <given-names>P.</given-names></name> <name><surname>Cortese</surname> <given-names>M. S.</given-names></name> <name><surname>Uversky</surname> <given-names>V. N.</given-names></name><etal/></person-group> (<year>2007</year>). <article-title>Characterization of molecular recognition features, MoRFs, and their binding partners.</article-title> <source><italic>J. Proteome Res.</italic></source> <volume>6</volume> <fpage>2351</fpage>&#x2013;<lpage>2366</lpage>.</citation></ref>
<ref id="B107"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van der Lee</surname> <given-names>R.</given-names></name> <name><surname>Buljan</surname> <given-names>M.</given-names></name> <name><surname>Lang</surname> <given-names>B.</given-names></name> <name><surname>Weatheritt</surname> <given-names>R. J.</given-names></name> <name><surname>Daughdrill</surname> <given-names>G. W.</given-names></name> <name><surname>Dunker</surname> <given-names>A. K.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Classification of intrinsically disordered regions and proteins.</article-title> <source><italic>Chem. Rev.</italic></source> <volume>114</volume> <fpage>6589</fpage>&#x2013;<lpage>6631</lpage>.</citation></ref>
<ref id="B108"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vecchi</surname> <given-names>G.</given-names></name> <name><surname>Sormanni</surname> <given-names>P.</given-names></name> <name><surname>Mannini</surname> <given-names>B.</given-names></name> <name><surname>Vandelli</surname> <given-names>A.</given-names></name> <name><surname>Tartaglia</surname> <given-names>G. G.</given-names></name> <name><surname>Dobson</surname> <given-names>C. M.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Proteome-wide observation of the phenomenon of life on the edge of solubility.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>117</volume> <fpage>1015</fpage>&#x2013;<lpage>1020</lpage>.</citation></ref>
<ref id="B109"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagner</surname> <given-names>A.</given-names></name></person-group> (<year>2005</year>). <article-title>Energy constraints on the evolution of gene expression.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>22</volume> <fpage>1365</fpage>&#x2013;<lpage>1374</lpage>.</citation></ref>
<ref id="B110"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wall</surname> <given-names>D. P.</given-names></name> <name><surname>Hirsh</surname> <given-names>A. E.</given-names></name> <name><surname>Fraser</surname> <given-names>H. B.</given-names></name> <name><surname>Kumm</surname> <given-names>J.</given-names></name> <name><surname>Giaever</surname> <given-names>G.</given-names></name> <name><surname>Eisen</surname> <given-names>M. B.</given-names></name><etal/></person-group> (<year>2005</year>). <article-title>Functional genomic analysis of the rates of protein evolution.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>102</volume> <fpage>5483</fpage>&#x2013;<lpage>5488</lpage>.</citation></ref>
<ref id="B111"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Herrmann</surname> <given-names>C. J.</given-names></name> <name><surname>Simonovic</surname> <given-names>M.</given-names></name> <name><surname>Szklarczyk</surname> <given-names>D.</given-names></name> <name><surname>von Mering</surname> <given-names>C.</given-names></name></person-group> (<year>2015</year>). <article-title>Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines.</article-title> <source><italic>Proteomics</italic></source> <volume>15</volume> <fpage>3163</fpage>&#x2013;<lpage>3168</lpage>. <pub-id pub-id-type="doi">10.1002/pmic.201400441</pub-id> <pub-id pub-id-type="pmid">25656970</pub-id></citation></ref>
<ref id="B112"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Weiss</surname> <given-names>M.</given-names></name> <name><surname>Simonovic</surname> <given-names>M.</given-names></name> <name><surname>Haertinger</surname> <given-names>G.</given-names></name> <name><surname>Schrimpf</surname> <given-names>S. P.</given-names></name> <name><surname>Hengartner</surname> <given-names>M. O.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>PaxDb, a database of protein abundance averages across all three domains of life.</article-title> <source><italic>Mol. Cell. Proteomics</italic></source> <volume>11</volume> <fpage>492</fpage>&#x2013;<lpage>500</lpage>.</citation></ref>
<ref id="B113"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wapinski</surname> <given-names>I.</given-names></name> <name><surname>Pfeffer</surname> <given-names>A.</given-names></name> <name><surname>Friedman</surname> <given-names>N.</given-names></name> <name><surname>Regev</surname> <given-names>A.</given-names></name></person-group> (<year>2007</year>). <article-title>Natural history and evolutionary principles of gene duplication in fungi.</article-title> <source><italic>Nature</italic></source> <volume>449</volume> <fpage>54</fpage>&#x2013;<lpage>61</lpage>.</citation></ref>
<ref id="B114"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wetlaufer</surname> <given-names>D. B.</given-names></name></person-group> (<year>1973</year>). <article-title>Nucleation, rapid folding, and globular intrachain regions in proteins.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>70</volume> <fpage>697</fpage>&#x2013;<lpage>701</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.70.3.697</pub-id> <pub-id pub-id-type="pmid">4351801</pub-id></citation></ref>
<ref id="B115"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wright</surname> <given-names>P. E.</given-names></name> <name><surname>Dyson</surname> <given-names>H. J.</given-names></name></person-group> (<year>2015</year>). <article-title>Intrinsically disordered proteins in cellular signalling and regulation.</article-title> <source><italic>Nat. Rev. Mol. Cell Biol.</italic></source> <volume>16</volume> <fpage>18</fpage>&#x2013;<lpage>29</lpage>.</citation></ref>
<ref id="B116"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xia</surname> <given-names>Y.</given-names></name> <name><surname>Franzosa</surname> <given-names>E. A.</given-names></name> <name><surname>Gerstein</surname> <given-names>M. B.</given-names></name></person-group> (<year>2009</year>). <article-title>Integrated assessment of genomic correlates of protein evolutionary rate.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>5</volume>:<fpage>e1000413</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1000413</pub-id> <pub-id pub-id-type="pmid">19521505</pub-id></citation></ref>
<ref id="B117"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>J. R.</given-names></name> <name><surname>Liao</surname> <given-names>B. Y.</given-names></name> <name><surname>Zhuang</surname> <given-names>S. M.</given-names></name> <name><surname>Zhang</surname> <given-names>J. Z.</given-names></name></person-group> (<year>2012</year>). <article-title>Protein misinteraction avoidance causes highly expressed proteins to evolve slowly.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>109</volume> <fpage>E831</fpage>&#x2013;<lpage>E840</lpage>.</citation></ref>
<ref id="B118"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yeh</surname> <given-names>S.-W.</given-names></name> <name><surname>Huang</surname> <given-names>T.-T.</given-names></name> <name><surname>Liu</surname> <given-names>J.-W.</given-names></name> <name><surname>Yu</surname> <given-names>S.-H.</given-names></name> <name><surname>Shih</surname> <given-names>C.-H.</given-names></name> <name><surname>Hwang</surname> <given-names>J.-K.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Local packing density is the main structural determinant of the rate of protein sequence evolution at site level.</article-title> <source><italic>Biomed. Res. Int.</italic></source> <volume>2014</volume>:<fpage>572409</fpage>.</citation></ref>
<ref id="B119"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Yang</surname> <given-names>J. R.</given-names></name></person-group> (<year>2015</year>). <article-title>Determinants of the rate of protein sequence evolution.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>16</volume> <fpage>409</fpage>&#x2013;<lpage>420</lpage>.</citation></ref>
<ref id="B120"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>Z.</given-names></name> <name><surname>Dang</surname> <given-names>Y.</given-names></name> <name><surname>Zhou</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>L.</given-names></name> <name><surname>Yu</surname> <given-names>C.-H.</given-names></name> <name><surname>Fu</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Codon usage is an important determinant of gene expression levels largely through its effects on transcription.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>113</volume> <fpage>E6117</fpage>&#x2013;<lpage>E6125</lpage>.</citation></ref>
</ref-list>
</back>
</article>