<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Mol. Biosci.</journal-id>
<journal-title>Frontiers in Molecular Biosciences</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Mol. Biosci.</abbrev-journal-title>
<issn pub-type="epub">2296-889X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">889943</article-id>
<article-id pub-id-type="doi">10.3389/fmolb.2022.889943</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Molecular Biosciences</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Variable and Conserved Regions of Secondary Structure in the &#x3b2;-Trefoil Fold: Structure Versus Function</article-title>
<alt-title alt-title-type="left-running-head">Blaber</alt-title>
<alt-title alt-title-type="right-running-head">Elements of &#x3b2;-Trefoil Architecture</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Blaber</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/100050/overview"/>
</contrib>
</contrib-group>
<aff>
<institution>Department of Biomedical Sciences</institution>, <institution>College of Medicine</institution>, <institution>Florida State University</institution>, <addr-line>Tallahassee</addr-line>, <addr-line>FL</addr-line>, <country>United States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/254675/overview">Delia Picone</ext-link>, University of Naples Federico II, Italy</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1715885/overview">Takeshi Kikuchi</ext-link>, Ritsumeikan University, Japan</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/619905/overview">Serena Leone</ext-link>, Zoological Station Anton Dohrn, Italy</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Michael Blaber, <email>michael.blaber@med.fsu.edu</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Structural Biology, a section of the journal Frontiers in Molecular Biosciences</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>9</volume>
<elocation-id>889943</elocation-id>
<history>
<date date-type="received">
<day>04</day>
<month>03</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>01</day>
<month>04</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Blaber.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Blaber</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>&#x3b2;-trefoil proteins exhibit an approximate C<sub>3</sub> rotational symmetry. An analysis of the secondary structure for members of this diverse superfamily of proteins indicates that it comprises remarkably conserved &#x3b2;-strands and highly divergent turn regions. A fundamental &#x201c;minimal&#x201d; architecture can be identified that is devoid of heterogeneous and extended turn regions, and is conserved among all family members. Conversely, the different functional families of &#x3b2;-trefoils can potentially be identified by their unique turn patterns (or turn &#x201c;signature&#x201d;). Such analyses provide clues as to the evolution of the &#x3b2;-trefoil family, suggesting a folding/stability role for the &#x3b2;-strands and a functional role for turn regions. This viewpoint can also guide the <italic>de novo</italic> design of &#x3b2;-trefoil proteins having novel functionality.</p>
</abstract>
<kwd-group>
<kwd>protein symmetry</kwd>
<kwd>
<italic>de novo</italic> design</kwd>
<kwd>hydrophobic patterning</kwd>
<kwd>ligand</kwd>
<kwd>folding nucleus</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>The &#x3b2;-trefoil is a common protein architecture that comprises 10 different superfamilies and constitutes approximately 1% of the proteome (<xref ref-type="bibr" rid="B3">Andreeva et al., 2013</xref>) (<xref ref-type="table" rid="T1">Table 1</xref>). A notable feature of the &#x3b2;-trefoil is a discernible C<sub>3</sub> rotational symmetry where the repeating &#x201c;trefoil&#x201d; motif is approximately 40&#x2013;50 amino acids in length and contains four anti-parallel &#x3b2;-strands connected by turn/loop regions (<xref ref-type="bibr" rid="B53">Sweet et al., 1974</xref>; <xref ref-type="bibr" rid="B37">McLachlan, 1979</xref>; <xref ref-type="bibr" rid="B39">Murzin et al., 1992</xref>) (<xref ref-type="fig" rid="F1">Figure 1</xref>). &#x3b2;-trefoil proteins encompass diverse ligand-type functionalities, including toxins, protease inhibitors, cytokines, growth factors, agglutinins, lectins, and other types of ligands [SCOP database (<xref ref-type="bibr" rid="B4">Andreeva et al., 2019</xref>)], although no enzymatic functionality is known. These ligand functionalities are associated with specific turn/loop regions that may define certain &#x3b2;-trefoil families (<xref ref-type="bibr" rid="B9">Blow et al., 1974</xref>; <xref ref-type="bibr" rid="B59">Veerapandian et al., 1992</xref>; <xref ref-type="bibr" rid="B42">Notenboom et al., 2002</xref>; <xref ref-type="bibr" rid="B11">Bovi et al., 2012</xref>; <xref ref-type="bibr" rid="B6">Blaber, 2020</xref>).</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>&#x3b2;-trefoil superfamily and structures utilized in characterization of secondary structure heterogeneity. The overlay statistics with Symfoil-4T (RCSB 3O4B) are also provided.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Superfamily</th>
<th align="center">Family</th>
<th align="center">Domain</th>
<th align="center">RCSB</th>
<th align="center">Res. (&#xc5;)</th>
<th align="center">&#x23;C&#x3b1; Ovl</th>
<th align="center">Ovl rmsd (&#xc5;)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="22" align="left">Ricin B-like lectin</td>
<td rowspan="9" align="left">Ricin B-like</td>
<td align="left">&#x3b2;-xylanase</td>
<td align="left">1XYF</td>
<td align="char" char=".">1.90</td>
<td align="center">93</td>
<td align="char" char=".">1.13</td>
</tr>
<tr>
<td align="left">&#x3b2;-galactoside-specific lectin 1</td>
<td align="left">1SZ6</td>
<td align="char" char=".">2.05</td>
<td align="center">96</td>
<td align="char" char=".">1.36</td>
</tr>
<tr>
<td align="left">Hemolytic lectin CEL-III</td>
<td align="left">1VCL</td>
<td align="char" char=".">1.70</td>
<td align="center">94</td>
<td align="char" char=".">1.33</td>
</tr>
<tr>
<td align="left">29-kDa galactose-binding lectin</td>
<td align="left">2ZQO</td>
<td align="char" char=".">1.80</td>
<td align="center">86</td>
<td align="char" char=".">1.31</td>
</tr>
<tr>
<td align="left">Main hemagglutinin component type C</td>
<td align="left">3AH2</td>
<td align="char" char=".">1.70</td>
<td align="center">102</td>
<td align="char" char=".">1.24</td>
</tr>
<tr>
<td align="left">Agglutinin</td>
<td align="left">5D61</td>
<td align="char" char=".">1.60</td>
<td align="center">98</td>
<td align="char" char=".">1.01</td>
</tr>
<tr>
<td align="left">Endo-1,4-&#x3b2;-xylanase A</td>
<td align="left">1KNL</td>
<td align="char" char=".">1.20</td>
<td align="center">90</td>
<td align="char" char=".">1.14</td>
</tr>
<tr>
<td align="left">Cytolethal distending toxin</td>
<td align="left">1SR4</td>
<td align="char" char=".">2.00</td>
<td align="center">104</td>
<td align="char" char=".">1.28</td>
</tr>
<tr>
<td align="left">Abrin-A</td>
<td align="left">1ABR</td>
<td align="char" char=".">2.14</td>
<td align="center">95</td>
<td align="char" char=".">1.22</td>
</tr>
<tr>
<td align="left">Cysteine rich domain</td>
<td align="left">Cysteine rich domain</td>
<td align="left">1FWV</td>
<td align="char" char=".">1.90</td>
<td align="center">88</td>
<td align="char" char=".">1.19</td>
</tr>
<tr>
<td align="left">GlcNAc-alpha-1,4-Gal-releasing endo-&#x3b2;-galactosidase</td>
<td align="left">GlcNAc-alpha-1,4-Gal-releasing endo-&#x3b2;-galactosidase</td>
<td align="left">1UPS</td>
<td align="char" char=".">1.82</td>
<td align="center">104</td>
<td align="char" char=".">1.16</td>
</tr>
<tr>
<td align="left">HylA &#x3b2;-trefoil domain-like</td>
<td align="left">HylA &#x3b2;-trefoil domain-like</td>
<td align="left">1XEZ</td>
<td align="char" char=".">2.30</td>
<td align="center">88</td>
<td align="char" char=".">1.55</td>
</tr>
<tr>
<td rowspan="5" align="left">Kunitz (STI) inhibitors</td>
<td align="left">Chymotrypsin inhibitor 3</td>
<td align="left">1EYL</td>
<td align="char" char=".">1.90</td>
<td align="center">79</td>
<td align="char" char=".">1.41</td>
</tr>
<tr>
<td align="left">Trypsin inhibitor A</td>
<td align="left">1AVW</td>
<td align="char" char=".">1.75</td>
<td align="center">75</td>
<td align="char" char=".">1.37</td>
</tr>
<tr>
<td align="left">Alpha-amylase/subtilisin inhibitor</td>
<td align="left">3BX1</td>
<td align="char" char=".">1.85</td>
<td align="center">80</td>
<td align="char" char=".">1.34</td>
</tr>
<tr>
<td align="left">Kunitz-type serine proteinase inhibitor DrTI</td>
<td align="left">1R8N</td>
<td align="char" char=".">1.75</td>
<td align="center">78</td>
<td align="char" char=".">1.42</td>
</tr>
<tr>
<td align="left">Albumin-1</td>
<td align="left">1WBA</td>
<td align="char" char=".">1.80</td>
<td align="center">74</td>
<td align="char" char=".">1.34</td>
</tr>
<tr>
<td rowspan="3" align="left">Clostridium neurotoxins, C-terminal domain</td>
<td align="left">Botulinum neurotoxin type B</td>
<td align="left">1EPW</td>
<td align="char" char=".">1.90</td>
<td align="center">85</td>
<td align="char" char=".">1.41</td>
</tr>
<tr>
<td align="left">Botulinum neurotoxin type A</td>
<td align="left">5MK6</td>
<td align="char" char=".">1.45</td>
<td align="center">79</td>
<td align="char" char=".">1.19</td>
</tr>
<tr>
<td align="left">Tetanus toxin</td>
<td align="left">1A8D</td>
<td align="char" char=".">1.57</td>
<td align="center">80</td>
<td align="char" char=".">1.25</td>
</tr>
<tr>
<td rowspan="2" align="left">Clitocypin-like</td>
<td align="left">Clitocypin-5</td>
<td align="left">3H6S</td>
<td align="char" char=".">2.22</td>
<td align="center">87</td>
<td align="char" char=".">1.18</td>
</tr>
<tr>
<td align="left">Clitocypin-2</td>
<td align="left">3H6R</td>
<td align="char" char=".">1.95</td>
<td align="center">89</td>
<td align="char" char=".">1.26</td>
</tr>
<tr>
<td rowspan="10" align="left">Cytokine</td>
<td rowspan="7" align="left">Fibroblast growth factors</td>
<td align="left">FGF-1</td>
<td align="left">1RG8</td>
<td align="char" char=".">1.10</td>
<td align="center">115</td>
<td align="char" char=".">1.06</td>
</tr>
<tr>
<td align="left">FGF-2</td>
<td align="left">1BFG</td>
<td align="char" char=".">1.60</td>
<td align="center">113</td>
<td align="char" char=".">0.98</td>
</tr>
<tr>
<td align="left">FGF-4</td>
<td align="left">1IJT</td>
<td align="char" char=".">1.80</td>
<td align="center">115</td>
<td align="char" char=".">1.23</td>
</tr>
<tr>
<td align="left">FGF-8</td>
<td align="left">2FDB</td>
<td align="char" char=".">2.28</td>
<td align="center">110</td>
<td align="char" char=".">1.23</td>
</tr>
<tr>
<td align="left">FGF-9</td>
<td align="left">1IHK</td>
<td align="char" char=".">2.20</td>
<td align="center">113</td>
<td align="char" char=".">1.19</td>
</tr>
<tr>
<td align="left">FGF-12</td>
<td align="left">1Q1U</td>
<td align="char" char=".">1.70</td>
<td align="center">113</td>
<td align="char" char=".">1.39</td>
</tr>
<tr>
<td align="left">FGF-19</td>
<td align="left">1PWA</td>
<td align="char" char=".">1.30</td>
<td align="center">93</td>
<td align="char" char=".">1.18</td>
</tr>
<tr>
<td rowspan="3" align="left">Interleukin-1 (IL-1)</td>
<td align="left">Interleukin-1 &#x3b2;</td>
<td align="left">5R7W</td>
<td align="char" char=".">1.27</td>
<td align="center">95</td>
<td align="char" char=".">1.34</td>
</tr>
<tr>
<td align="left">Interleukin-18</td>
<td align="left">3WO2</td>
<td align="char" char=".">2.33</td>
<td align="center">89</td>
<td align="char" char=".">1.33</td>
</tr>
<tr>
<td align="left">Interleukin-36 receptor agonist protein</td>
<td align="left">1MD6</td>
<td align="char" char=".">1.60</td>
<td align="center">81</td>
<td align="char" char=".">1.30</td>
</tr>
<tr>
<td align="left">Actin-crosslinking proteins</td>
<td align="left">Fascin</td>
<td align="left">Fascin-1</td>
<td align="left">3LLP</td>
<td align="char" char=".">1.80</td>
<td align="center">104</td>
<td align="char" char=".">1.24</td>
</tr>
<tr>
<td align="left">DNA-binding protein LAG-1 (CSL)</td>
<td align="left">DNA-binding protein LAG-1 (CSL)</td>
<td align="left">Lin-12 and Glp-1 phenotype</td>
<td align="left">3BRD</td>
<td align="char" char=".">2.21</td>
<td align="center">83</td>
<td align="char" char=".">1.04</td>
</tr>
<tr>
<td align="left">AbfB domain</td>
<td align="left">AbfB domain</td>
<td align="left">Alpha-L-arabinofuranosidase B</td>
<td align="left">1WD3</td>
<td align="char" char=".">1.75</td>
<td align="center">96</td>
<td align="char" char=".">1.22</td>
</tr>
<tr>
<td align="left">Agglutinin</td>
<td align="left">Agglutinin</td>
<td align="left">Agglutinin</td>
<td align="left">1JLY</td>
<td align="char" char=".">2.20</td>
<td align="center">98</td>
<td align="char" char=".">1.38</td>
</tr>
<tr>
<td rowspan="2" align="left">MIR domain</td>
<td rowspan="2" align="left">MIR domain</td>
<td align="left">Inositol 1,4,5-trisphosphate receptor type 1</td>
<td align="left">1N4K</td>
<td align="char" char=".">2.20</td>
<td align="center">101</td>
<td align="char" char=".">1.12</td>
</tr>
<tr>
<td align="left">Uncharacterized protein (<italic>C. elegans</italic>)</td>
<td align="left">1T9F</td>
<td align="char" char=".">2.00</td>
<td align="center">105</td>
<td align="char" char=".">0.90</td>
</tr>
<tr>
<td rowspan="3" align="left">30&#xa0;K Lipoprotein C-terminal domain-like</td>
<td rowspan="3" align="left">30&#xa0;K Lipoprotein C-terminal domain-like</td>
<td align="left">30&#xa0;K protein 2</td>
<td align="left">4EFP</td>
<td align="char" char=".">1.33</td>
<td align="center">107</td>
<td align="char" char=".">1.12</td>
</tr>
<tr>
<td align="left">Low molecular mass 30&#xa0;kDa lipoprotein 19G1</td>
<td align="left">4IY9</td>
<td align="char" char=".">2.10</td>
<td align="center">107</td>
<td align="char" char=".">1.10</td>
</tr>
<tr>
<td align="left">30&#xa0;K lipoprotein</td>
<td align="left">4PC4</td>
<td align="char" char=".">1.80</td>
<td align="center">104</td>
<td align="char" char=".">1.10</td>
</tr>
<tr>
<td align="left">Proteinase inhibitor 1-like</td>
<td align="left">Proteinase inhibitor 1-like</td>
<td align="left">Serine protease inhibitor 1</td>
<td align="left">3VWC</td>
<td align="char" char=".">1.50</td>
<td align="center">95</td>
<td align="char" char=".">1.22</td>
</tr>
<tr>
<td rowspan="3" align="left">
<italic>de novo</italic> Symmetric</td>
<td rowspan="3" align="left">
<italic>de novo</italic> Symmetric</td>
<td align="left">Symfoil (Symfoil-4T variant)</td>
<td align="left">3O4B</td>
<td align="char" char=".">1.80</td>
<td align="center">126 (Ref)</td>
<td align="char" char=".">N/A (Ref)</td>
</tr>
<tr>
<td align="left">Threefoil</td>
<td align="left">3PG0</td>
<td align="char" char=".">1.62</td>
<td align="center">105</td>
<td align="char" char=".">1.00</td>
</tr>
<tr>
<td align="left">Mitsuba-1</td>
<td align="left">5XG5</td>
<td align="char" char=".">1.54</td>
<td align="center">103</td>
<td align="char" char=".">1.03</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>The primary, secondary, and tertiary structure of the Symfoil (&#x201c;Symfoil-4T&#x201d;) reference &#x3b2;-trefoil protein. Upper panel: A &#x201c;ribbon&#x201d; diagram of the Symfoil protein (RCSB 3O4B). The colored region (blue: &#x3b2;-strand; red: turn) identifies the first of three repeating &#x201c;trefoil&#x201d; motifs in the structure (the other two are colored gray). Middle panel: A two-dimensional representation of the overall &#x3b2;-trefoil architecture, indicating the strand and turn numbering and the number of residues in each type of secondary structure (referencing Symfoil). Lower panel: The primary structure of the Symfoil protein, indicating the secondary structure positions (&#x3b2;-strands underlined and indicated by &#x201c;S&#x201d;, and turns indicated by &#x201c;T&#x201d;).</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g001.tif"/>
</fig>
<p>Symmetry in a subset of common protein folds has been evident from the earliest days of protein structure determination, and has stimulated hypotheses of gene duplication and fusion in their evolutionary emergence from simpler peptide motifs (<xref ref-type="bibr" rid="B20">Eck and Dayhoff, 1966</xref>; <xref ref-type="bibr" rid="B43">Ohno, 1970</xref>; <xref ref-type="bibr" rid="B36">McLachlan, 1972</xref>). Alternative hypotheses for such evolution of the &#x3b2;-trefoil have been proposed, including &#x201c;emergent architecture&#x201d; and &#x201c;conserved architecture&#x201d; models, where the simple peptide motif comprises two anti-parallel &#x3b2;-hairpins known as a &#x201c;trefoil&#x201d; (<xref ref-type="bibr" rid="B38">Mukhopadhyay, 2000</xref>; <xref ref-type="bibr" rid="B47">Ponting and Russell, 2000</xref>; <xref ref-type="bibr" rid="B8">Blaber and Lee, 2012</xref>; <xref ref-type="bibr" rid="B5">Balaji, 2015</xref>). In the emergent architecture model the structural complexity increases with each gene duplication and fusion event, such that the overall &#x3b2;-trefoil architecture only emerges upon a final triplet repeat of the trefoil motif. In the conserved architecture model, the trefoil peptide has the property of oligomerizing as a trimer, thereby generating an intact &#x3b2;-trefoil architecture. A tandem repeat also oligomerizes as a domain-swapped trimer that generates two intact &#x3b2;-trefoils. A triplet repeat of the trefoil motif yields a single polypeptide that folds into a &#x3b2;-trefoil. Experimental studies lend greater support to the conserved architecture model (<xref ref-type="bibr" rid="B31">Lee and Blaber, 2011</xref>; <xref ref-type="bibr" rid="B32">Lee et al., 2011</xref>), indicating that an appropriate trefoil motif peptide can spontaneously oligomerize as a trimer to form an intact &#x3b2;-trefoil. 
Sequence and structure analyses suggest that extant &#x3b2;-trefoil proteins are unlikely to share a common ancestor, but are more likely to have evolved independently from simpler peptide motifs many times, and indeed, this may be a recurring and ongoing evolutionary process (<xref ref-type="bibr" rid="B12">Broom et al., 2012</xref>).</p>
<p>Current knowledge regarding symmetric protein architecture suggests that utilization of symmetry is an efficient and practical strategy for simplifying the <italic>de novo</italic> design problem (<xref ref-type="bibr" rid="B26">Hocker et al., 2004</xref>; <xref ref-type="bibr" rid="B41">Nikkhah et al., 2006</xref>; <xref ref-type="bibr" rid="B65">Yadid and Tawfik, 2007</xref>; <xref ref-type="bibr" rid="B49">Richter et al., 2010</xref>; <xref ref-type="bibr" rid="B30">Kopec and Lupas, 2013</xref>; <xref ref-type="bibr" rid="B60">Voet et al., 2014</xref>; <xref ref-type="bibr" rid="B13">Broom et al., 2015</xref>; <xref ref-type="bibr" rid="B14">Brunette et al., 2015</xref>; <xref ref-type="bibr" rid="B27">Huang et al., 2016</xref>; <xref ref-type="bibr" rid="B56">Terada et al., 2017</xref>; <xref ref-type="bibr" rid="B2">Afanasieva et al., 2019</xref>; <xref ref-type="bibr" rid="B29">Kimura et al., 2020</xref>). Furthermore, it may be practical to divide the design problem into two parts: 1) the initial design of a stable, foldable but functionless &#x201c;scaffold&#x201d;, followed by 2) specific functionalization (<xref ref-type="bibr" rid="B10">Bolon et al., 2002</xref>; <xref ref-type="bibr" rid="B19">Dwyer et al., 2004</xref>; <xref ref-type="bibr" rid="B17">Claren et al., 2009</xref>). In the case of the &#x3b2;-trefoil (and perhaps also the &#x3b2;-propeller architecture), this strategy appears especially appropriate for the design of proteins having novel ligand functionalities. It would therefore be extremely useful to distinguish the structural parameters that dictate a stable, foldable architecture from the parameters that generate specific functionality.</p>
<p>In this report we examine the hypothesis that the structural determinants of stability and folding for the &#x3b2;-trefoil are principally the &#x3b2;-strand secondary structure (and that this is an essentially conserved structural feature in this superfamily), while specific functionality is provided by turn/loop regions (and that this is a divergent and unique feature among functionally distinct &#x3b2;-trefoil proteins). The analysis suggests an efficient <italic>de novo</italic> protein design pathway that leverages symmetric principles of protein architecture.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2-1">
<title>Selection of Reference &#x3b2;-Trefoil Structure</title>
<p>The identification of insertions or deletions of secondary structure within a protein architecture depends upon the reference protein used for such comparison. The reference protein should ideally comprise the essential structural architecture, with no extraneous insertions or deletions beyond the basic folding and stability requirements. In the case of the &#x3b2;-trefoil, where extant naturally evolved proteins exhibit varying degrees of C<sub>3</sub> rotational symmetry, the reference protein would ideally constitute a purely symmetric architecture so that any asymmetric features in an evaluated protein can readily be identified. There are several <italic>de novo</italic> designed &#x3b2;-trefoil proteins having an exact threefold symmetric primary structure, including Threefoil (<xref ref-type="bibr" rid="B13">Broom et al., 2015</xref>), Mitsuba-1 (<xref ref-type="bibr" rid="B56">Terada et al., 2017</xref>), Phifoil (<xref ref-type="bibr" rid="B35">Longo et al., 2014</xref>) and the Symfoil family of proteins (<xref ref-type="bibr" rid="B31">Lee and Blaber, 2011</xref>; <xref ref-type="bibr" rid="B32">Lee et al., 2011</xref>). Threefoil was designed to have carbohydrate binding function and contains specific turn/loop secondary structure for this purpose. Similarly, Mitsuba-1 was designed to have a galactose binding site afforded by specific surface turn/loop secondary structure. In contrast, Symfoil was designed exclusively from the standpoint of optimized folding kinetics and thermodynamics, and is notably devoid of any specific functionality. Using Symfoil (the Symfoil-4T variant) as a reference structure identifies five residue insertions within turns T2, T6 and T10 in Threefoil, and seven residue insertions within the same turns in Mitsuba-1 (<xref ref-type="fig" rid="F2">Figure 2</xref>). 
Thus, the Symfoil protein was considered as the most appropriate reference protein with which to quantify secondary structure heterogeneity among &#x3b2;-trefoil proteins.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>A comparison of secondary structure insertions/deletions for three symmetric designed &#x3b2;-trefoil proteins. The Symfoil, Threefoil, and Mitsuba-1 proteins are three independently <italic>de novo</italic> designed, purely-symmetric &#x3b2;-trefoil proteins. The most compact of these three is Symfoil, primarily due to ligand-binding turns T2, T6, and T10, engineered into both Threefoil and Mitsuba-1.</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g002.tif"/>
</fig>
</sec>
<sec id="s2-2">
<title>Representative &#x3b2;-Trefoil Proteins</title>
<p>The RCSB structural databank (<ext-link ext-link-type="uri" xlink:href="http://www.rcsb.org">www.rcsb.org</ext-link>) was queried for &#x3b2;-trefoil proteins solved to better than 2.5&#xa0;&#xc5; resolution. A total of 45 proteins were identified, representing 10 superfamilies, 17 families, and 45 domains, and with an average resolution of 1.81 &#xb1; 0.31&#xa0;&#xc5; (<xref ref-type="table" rid="T1">Table 1</xref>). Only the <italic>de novo</italic> designed &#x3b2;-trefoil proteins exhibit an exact threefold rotational symmetry; all naturally-evolved &#x3b2;-trefoil proteins exhibit varying degrees of primary, secondary and tertiary structure symmetry.</p>
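<p>As an illustrative check of the summary statistics quoted above, the following minimal Python sketch recomputes the mean resolution and its spread from the resolution column of <xref ref-type="table" rid="T1">Table 1</xref>. It assumes that the &#xb1; value is a sample standard deviation, which the text does not state; the names <monospace>RESOLUTIONS</monospace> and <monospace>summarize</monospace> are introduced here for illustration only.</p>
<preformat><![CDATA[
```python
from statistics import mean, stdev

# Resolution (Angstroms) of the 45 structures, transcribed from Table 1 in row order.
RESOLUTIONS = [
    1.90, 2.05, 1.70, 1.80, 1.70, 1.60, 1.20, 2.00, 2.14,  # Ricin B-like family
    1.90, 1.82, 2.30,                                      # 1FWV, 1UPS, 1XEZ
    1.90, 1.75, 1.85, 1.75, 1.80,                          # Kunitz (STI) inhibitors
    1.90, 1.45, 1.57,                                      # Clostridium neurotoxins
    2.22, 1.95,                                            # Clitocypin-like
    1.10, 1.60, 1.80, 2.28, 2.20, 1.70, 1.30,              # Fibroblast growth factors
    1.27, 2.33, 1.60,                                      # Interleukin-1 family
    1.80, 2.21, 1.75, 2.20,                                # Fascin, LAG-1, AbfB, Agglutinin
    2.20, 2.00,                                            # MIR domain
    1.33, 2.10, 1.80,                                      # 30 K lipoprotein
    1.50,                                                  # Proteinase inhibitor 1-like
    1.80, 1.62, 1.54,                                      # de novo symmetric designs
]


def summarize(values):
    """Return (count, mean, sample standard deviation) for a list of numbers."""
    return len(values), mean(values), stdev(values)


n, avg, sd = summarize(RESOLUTIONS)
print(f"n = {n}, resolution = {avg:.2f} +/- {sd:.2f} A")
```
]]></preformat>
<p>Under the sample standard deviation assumption, the transcribed values reproduce the 45 structures and the 1.81 &#xb1; 0.31&#xa0;&#xc5; figure reported above.</p>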
</sec>
<sec id="s2-3">
<title>Structural Overlay</title>
<p>Structural overlays of individual &#x3b2;-trefoil proteins onto the Symfoil protein coordinates (using the Symfoil-4T variant, RCSB 3O4B) were performed using the Swiss PDB Viewer software (<xref ref-type="bibr" rid="B25">Guex and Peitsch, 1997</xref>) and selecting for C&#x3b1; atoms. An iterative fitting process was used to optimize the overlay. The number of matching C&#x3b1; atoms was noted, as well as the rmsd for the fit (<xref ref-type="table" rid="T1">Table 1</xref>). This overlay was then examined for insertions or deletions in specific secondary structure elements as defined in the Symfoil structure (<xref ref-type="fig" rid="F1">Figure 1</xref>). The percent of C&#x3b1; matches per secondary structure element was also determined.</p>
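<p>The bookkeeping that follows an overlay can be sketched as below. This is not the Swiss PDB Viewer procedure and does not perform the superposition itself; it assumes coordinates that are already superposed and treats a C&#x3b1; pair as matched when the two atoms lie within 1.5&#xa0;&#xc5; of each other, which is one plausible reading of the criterion used here. All function names and coordinates are hypothetical.</p>
<preformat><![CDATA[
```python
import math

CUTOFF = 1.5  # Angstroms; assumed per-atom distance criterion for a C-alpha "match"


def matched_pairs(pairs, cutoff=CUTOFF):
    """Keep (reference, target) pairs whose superposed C-alpha atoms lie within cutoff."""
    return [(r, t) for r, t in pairs if t is not None and math.dist(r, t) < cutoff]


def rmsd(pairs):
    """Root-mean-square deviation over matched C-alpha pairs."""
    return math.sqrt(sum(math.dist(r, t) ** 2 for r, t in pairs) / len(pairs))


def percent_matched(pairs_by_element, cutoff=CUTOFF):
    """Percent of reference C-alpha atoms matched, per secondary structure element."""
    return {
        element: 100.0 * len(matched_pairs(pairs, cutoff)) / len(pairs)
        for element, pairs in pairs_by_element.items()
    }


# Hypothetical, already-superposed coordinates as (reference, target) pairs.
# A target of None marks a reference residue with no aligned partner.
overlay = {
    "S1": [((0.0, 0.0, 0.0), (0.3, 0.0, 0.0)),
           ((3.8, 0.0, 0.0), (3.8, 0.4, 0.0)),
           ((7.6, 0.0, 0.0), (7.6, 0.0, 0.5)),
           ((11.4, 0.0, 0.0), (11.4, 0.0, 0.0))],
    "T1": [((15.2, 0.0, 0.0), (15.2, 2.0, 0.0)),
           ((19.0, 0.0, 0.0), None)],
}

all_matched = [p for pairs in overlay.values() for p in matched_pairs(pairs)]
percent = percent_matched(overlay)
fit_rmsd = rmsd(all_matched)
print(len(all_matched), round(fit_rmsd, 2), percent)
```
]]></preformat>
<p>In this toy input the strand retains all of its C&#x3b1; positions while the turn retains none, which mirrors the kind of contrast between strands and turns that the per-element percentages are intended to expose.</p>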
</sec>
<sec id="s2-4">
<title>Sequence Logo Plots</title>
<p>Sequence logo plots are a graphical representation of an amino acid (or nucleic acid) multiple sequence alignment (<xref ref-type="bibr" rid="B50">Schneider and Stephens, 1990</xref>; <xref ref-type="bibr" rid="B18">Crooks et al., 2004</xref>). Each logo consists of stacks of symbols, one stack for each position in the sequence. The height of symbols within a stack indicates the relative frequency of each amino acid at that position. A sequence logo plot was generated for &#x3b2;-strands S1, S5, and S9 as a group; similarly, S2, S6, and S10 as a group; S3, S7, and S11 as a group; and S4, S8, and S12 as a group (i.e., all sets of C<sub>3</sub> symmetry related strands, <italic>n</italic> &#x3d; 126), for all representative &#x3b2;-trefoil proteins in <xref ref-type="table" rid="T1">Table 1</xref> and using structural overlays as described above. Image generation utilized the web logo server at <ext-link ext-link-type="uri" xlink:href="https://weblogo.berkeley.edu/">https://weblogo.berkeley.edu/</ext-link> with colors based on chemical properties: polar amino acids (G,S,T,Y,C,Q,N) are green, basic (K,R,H) blue, acidic (D,E) red and hydrophobic (A,V,L,I,P,W,F,M) amino acids are black.</p>
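<p>The quantities behind a logo stack can be sketched as follows: the relative frequency of each residue at a position, and the information content of that position in bits. The sketch omits the small-sample correction that logo software may apply, and the aligned strand sequences are hypothetical placeholders rather than data from this study.</p>
<preformat><![CDATA[
```python
import math
from collections import Counter


def column_frequencies(aligned):
    """Relative frequency of each residue at each position of an alignment."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned)
        total = sum(counts.values())
        freqs.append({aa: c / total for aa, c in counts.items()})
    return freqs


def information_content(freq, alphabet_size=20):
    """Stack height in bits: log2(alphabet size) minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in freq.values())
    return math.log2(alphabet_size) - entropy


def symbol_heights(freq, alphabet_size=20):
    """Height of each symbol: its relative frequency times the stack height."""
    total = information_content(freq, alphabet_size)
    return {aa: p * total for aa, p in freq.items()}


# Hypothetical five-residue strands standing in for a symmetry-related set.
strands = ["KVLLY", "RKLIS", "TYLIQ", "EFLLD"]
freqs = column_frequencies(strands)
for position, freq in enumerate(freqs, start=1):
    print(position, round(information_content(freq), 2), symbol_heights(freq))
```
]]></preformat>
<p>A fully conserved position reaches the maximum of log<sub>2</sub>(20) bits, whereas a position split evenly between two residues loses exactly one bit.</p>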
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec id="s3-1">
<title>Secondary Structure Length and Conformational Heterogeneity</title>
<p>An analysis of the secondary structure length heterogeneity for the &#x3b2;-trefoil superfamily of proteins, compared to the Symfoil reference, shows that the heterogeneity is localized almost exclusively to turn secondary structure; indeed, all &#x3b2;-strands show a remarkable absence of relative insertion or deletion (i.e., all &#x3b2;-strands show a marked conservation of length) (<xref ref-type="fig" rid="F3">Figure 3</xref>). Furthermore, the heterogeneity in the turn regions principally involves insertions, as opposed to deletions, compared to the Symfoil reference protein. However, there are two notable exceptions to this general rule at turns T4 and T8, where some &#x3b2;-trefoils have limited deletions of up to three amino acids.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Relative insertions or deletions in secondary structure elements among the &#x3b2;-trefoil superfamily of proteins. The reference protein is the Symfoil protein&#x2014;a <italic>de novo</italic> designed, purely-symmetric, minimalist, and functionless &#x3b2;-trefoil protein (see text).</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g003.tif"/>
</fig>
<p>An analysis of the C&#x3b1; structural conservation for regions of secondary structure in &#x3b2;-trefoil proteins, compared to the Symfoil-4T reference, shows that &#x3b2;-strand regions not only have highly conserved lengths, but that their overall conformation as &#x3b2;-strands is also highly conserved (<xref ref-type="fig" rid="F4">Figure 4</xref>). For the entire superfamily of &#x3b2;-trefoils, &#x3e;90% structural conservation (i.e., &#x3c;1.5&#xa0;&#xc5; rmsd) is present in the symmetry-related sets of &#x3b2;-strands S1/S5/S9, S3/S7/S11, and S4/S8/S12. The S2/S6/S10 set exhibits 76&#x2013;84% C&#x3b1; structural conservation. Among turn secondary structure, turns T4 and T8 (which are symmetry-related) exhibit the least C&#x3b1; structural conservation.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>C&#x3b1; structural conservation (&#x3c;1.5&#xa0;&#xc5;&#xa0;rmsd) within secondary structure elements for the &#x3b2;-trefoil family of proteins. The reference protein is the Symfoil protein (RCSB 3O4B)&#x2014;a <italic>de novo</italic> designed, purely-symmetric, minimalist, and functionless &#x3b2;-trefoil protein (see <xref ref-type="fig" rid="F1">Figure 1</xref>).</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g004.tif"/>
</fig>
<p>The Ricin B-like, Cytokine, and 30&#xa0;K Lipoprotein superfamilies have the greatest number of members, with 22, 10, and 3 members, respectively (<xref ref-type="table" rid="T1">Table 1</xref>). The secondary structure length heterogeneity for these individual superfamilies is shown in <xref ref-type="fig" rid="F5">Figure 5</xref>. This graph suggests that the general turn heterogeneity observed in the overall superfamily graph (<xref ref-type="fig" rid="F3">Figure 3</xref>) is a composite of patterns of turn heterogeneity unique to the individual superfamilies or families. Specifically, the Ricin B-like lectin superfamily exhibits the greatest turn heterogeneity (i.e., extensions) at T2, T3, T4, T6, and T10; the Cytokine superfamily exhibits turn extensions principally at T3, T4, T7, T9, and T11; and the 30&#xa0;K Lipoprotein superfamily exhibits turn extensions principally at T2, T6, and T10. Thus, each superfamily exhibits characteristically different turn heterogeneity (i.e., extensions).</p>
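<p>One way to express such a pattern of turn extensions as a compact &#x201c;signature&#x201d; is sketched below. It assumes that each protein is summarized by its turn length differences relative to the reference, and that a turn belongs to the signature when the mean insertion across members exceeds a threshold. The threshold, the function name, and the input values are all hypothetical and are not taken from the figures.</p>
<preformat><![CDATA[
```python
from statistics import mean


def turn_signature(members, threshold=1.0):
    """List the turns whose mean length difference versus the reference exceeds threshold.

    members: one dict per protein mapping a turn label (e.g. "T2") to the number of
    residues inserted (positive) or deleted (negative) relative to the reference.
    Turns missing from a dict are treated as having no length difference.
    """
    turns = sorted({t for m in members for t in m}, key=lambda t: int(t[1:]))
    return [t for t in turns if mean(m.get(t, 0) for m in members) > threshold]


# Hypothetical length differences for two members of one superfamily.
hypothetical_members = [
    {"T2": 5, "T4": -1, "T6": 4, "T10": 6},
    {"T2": 7, "T4": 0, "T6": 5, "T10": 5},
]

signature = turn_signature(hypothetical_members)
print(signature)
```
]]></preformat>
<p>Comparing the resulting lists between superfamilies gives a simple textual counterpart to the visual comparison of turn extensions.</p>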
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Relative secondary structure insertions or deletions of individual &#x3b2;-trefoil superfamilies. <bold>(A)</bold>: Ricin B-like lectin superfamily (<italic>n</italic> &#x3d; 22). <bold>(B)</bold>: Cytokine superfamily (<italic>n</italic> &#x3d; 10). <bold>(C)</bold>: 30&#xa0;K Lipoprotein superfamily (<italic>n</italic> &#x3d; 3).</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g005.tif"/>
</fig>
</sec>
<sec id="s3-2">
<title>Sequence Logo Plots</title>
<p>The sequence logo plots for the &#x3b2;-strand secondary structure exhibit characteristic patterns of hydrophobic residues (<xref ref-type="fig" rid="F6">Figure 6</xref>). In &#x3b2;-strands S1/S5/S9, position &#x23;4 is principally hydrophobic: Ile and Leu account for 80% of all amino acids at this position, with the other residues being Phe, Tyr, Val and Met. There is some indication of hydrophobic preference at position &#x23;2, with Val and Phe accounting for approximately 40% of residues (and if Tyr is considered hydrophobic, then &#x223c;50% of residues at position &#x23;2 are hydrophobic). In &#x3b2;-strands S2/S6/S10, positions &#x23;3 and &#x23;5 show a clear hydrophobic preference. Leu accounts for &#x223c;50% of residues at position &#x23;3, with the majority of other residues being either Val, Ile, Phe or Trp. At position &#x23;5, Leu, Val, Ile, Ala and Met account for &#x223c;66% of residues. In &#x3b2;-strands S3/S7/S11, Val, Leu and Ile account for &#x223c;75% of residues at position &#x23;2. Ala, Leu, Val and Ile account for &#x223c;50% of residues at position &#x23;4, with Gly another major residue at this position. In &#x3b2;-strands S4/S8/S12 there is a remarkable &#x223c;70% preference for the aromatic residues Trp or Phe at position &#x23;2 (with Leu, Val and Ile comprising the majority of the remainder). Hydrophobic residues are also preferred at position &#x23;4, with Ile, Leu, Phe, and Val comprising &#x223c;60% of residues. Thus, in all &#x3b2;-strands there is a hydrophobic (P)/hydrophilic (H) pattern of H-P-H-P-H. Binary patterning of hydrophobic/hydrophilic amino acids is a key determinant of protein secondary structure, with an alternating hydrophobic/hydrophilic pattern favoring the formation of amphipathic &#x3b2;-strand secondary structure (<xref ref-type="bibr" rid="B62">West and Hecht, 1995</xref>; <xref ref-type="bibr" rid="B64">Xiong et al., 1995</xref>). 
These hydrophobic residues within the H-P-H-P-H patterning of the &#x3b2;-trefoil &#x3b2;-strands contribute to a highly-cooperative core packing group in the &#x3b2;-trefoil structure (<xref ref-type="bibr" rid="B7">Blaber, 2021</xref>).</p>
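<p>The per-position tallies that underlie a sequence logo, and the binary pattern derived from them, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure used for Figure 6: the hydrophobic residue set, the 50% threshold, the function names, and the four five-residue toy strands are all hypothetical. The labels follow the convention used in the text above (P for hydrophobic, H for hydrophilic).</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from collections import Counter

# Assumed hydrophobic classification (one-letter codes); Tyr is deliberately left out.
HYDROPHOBIC = set("AVILMFW")


def position_frequencies(strands):
    """Per-position residue frequencies for aligned, equal-length strand sequences."""
    length = len(strands[0])
    if any(len(s) != length for s in strands):
        raise ValueError("strands must be aligned to equal length")
    freqs = []
    for column in zip(*strands):
        counts = Counter(column)
        freqs.append({aa: n / len(column) for aa, n in counts.items()})
    return freqs


def binary_pattern(strands, threshold=0.5):
    """Label each position 'P' (hydrophobic) or 'H' (hydrophilic), as in the text."""
    labels = []
    for freq in position_frequencies(strands):
        hydrophobic_fraction = sum(f for aa, f in freq.items() if aa in HYDROPHOBIC)
        labels.append("P" if hydrophobic_fraction >= threshold else "H")
    return "-".join(labels)


# Invented five-residue strands, not drawn from any protein in Table 1.
toy_strands = ["KLTIE", "RVSLD", "TFKIN", "EVRLS"]
print(binary_pattern(toy_strands))  # H-P-H-P-H
```
]]></preformat>
<p>Grouping symmetry-related strands before counting, as is done for the logo plots, simply means pooling the S1, S5, and S9 sequences (and likewise for the other sets) into a single list before calling the function.</p>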
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Sequence logo plots for the &#x3b2;-strand secondary structure in the &#x3b2;-trefoil superfamily. Equivalent &#x3b2;-strands are grouped by the C<sub>3</sub> rotational symmetry of the &#x3b2;-trefoil for all members of the superfamily in <xref ref-type="table" rid="T1">Table 1</xref>. Thus, strands S2, S6, and S10 are grouped together in this analysis, and similarly for the other symmetry-related &#x3b2;-strands (therefore, <italic>n</italic> &#x3d; 126 at each position). The single letter amino acid code is utilized, and the height indicates the relative prevalence of a particular amino acid at each position. The amino acids are colored according to chemical properties (see text); however, hydrophobic is indicated by black.</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g006.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<sec id="s4-1">
<title>Is Symfoil-4T a &#x201c;Minimal&#x201d; &#x3b2;-Trefoil?</title>
<p>Among the <italic>de novo</italic> designed symmetric &#x3b2;-trefoil proteins Symfoil is the most compact, principally due to the absence of specific functional surface turns/loops. Analyses of structural variations (i.e., insertions or deletions) of other &#x3b2;-trefoil proteins indicate that the vast majority of structural heterogeneity is associated with insertions in surface turn/loop regions in comparison to Symfoil. However, there is evidence of some &#x3b2;-trefoil proteins having relative truncations in the T4 and T8 regions (<xref ref-type="fig" rid="F3">Figures 3</xref>, <xref ref-type="fig" rid="F5">5A</xref>). Specifically, 1FWV, 1ABR, 1KNL, 2ZQO, 1SZ6, 1XYF, and 1XEZ (all members of the Ricin B-like lectin superfamily, <xref ref-type="table" rid="T1">Table 1</xref>) have three amino acid deletions in both the T4 and T8 regions. These deletions effectively eliminate the hydrophobic residue at the &#x23;2 position in the S5 and S9 &#x3b2;-strands (which participate in the cooperative central core); thus, these truncations of the T4 and T8 turns may result in a less stable, or less cooperatively-folding, protein. The Symfoil protein therefore represents a &#x201c;minimal&#x201d; or &#x201c;essential&#x201d; &#x3b2;-trefoil architecture&#x2014;one that is highly-conserved in the family of &#x3b2;-trefoil proteins&#x2014;and is therefore a useful reference structure by which to characterize secondary structure heterogeneity in &#x3b2;-trefoil proteins.</p>
</sec>
<sec id="s4-2">
<title>Is There a Segregation of &#x3b2;-Strand and Turn Secondary Structure as Regards Protein Structure and Function?</title>
<p>The highly-conserved &#x3b2;-strands, and highly-divergent turn/loop regions, when comparing members of the &#x3b2;-trefoil superfamily, strongly suggest that functionality has its principal basis in turn/loop structure. For example, the specific heparin-binding functionality of FGF-1 (Cytokine superfamily) has been localized principally to an extension within the T11 region (<xref ref-type="bibr" rid="B15">Brych et al., 2004</xref>) while interaction with FGF receptor involves the T1, T4, and T8 regions (<xref ref-type="bibr" rid="B44">Olsen et al., 2004</xref>). Lectin functionality in the shellfish lectin MytiLec-1 and <italic>M. oreades</italic> mushroom lectin is localized to regions T2, T6, and T10 (<xref ref-type="bibr" rid="B13">Broom et al., 2015</xref>; <xref ref-type="bibr" rid="B56">Terada et al., 2017</xref>). The inhibitory function of Kunitz (STI) protease inhibitors is due to active site binding of an extended T4 loop region (<xref ref-type="bibr" rid="B51">Song and Suh, 1998</xref>). Ricin B-like lectin interactions involve the T2/T3 and T10/T11 regions (<xref ref-type="bibr" rid="B52">Suzuki et al., 2009</xref>). The Pmt2-MIR domain (superfamily MIR domain) interaction with tetraethylene glycol ligand involves regions T4 and T7 (<xref ref-type="bibr" rid="B16">Chiapparino et al., 2020</xref>). The interaction between LAG-1 (CSL) DNA-binding protein and DNA ligand principally involves the T1 region (<xref ref-type="bibr" rid="B24">Friedmann et al., 2008</xref>). The interaction between Agglutinin and T-disaccharide involves the T6 and T10 regions (<xref ref-type="bibr" rid="B58">Transue et al., 1997</xref>). The interaction between clitocypin and cathepsin V involves the T1 and T3 regions (<xref ref-type="bibr" rid="B48">Renko et al., 2010</xref>). 
This representative summary of binding interactions provides strong support for a primary assignment of functionality to specific and structurally-heterogeneous turn/loop regions in &#x3b2;-trefoil proteins.</p>
</sec>
<sec id="s4-3">
<title>Can Turn Structure Provide Evidence of Evolutionary Gene Duplication/Fusion Processes?</title>
<p>Symmetric relationships among turn/loop structures in &#x3b2;-trefoils are most apparent within the symmetry-related set of T2/T6/T10 turn positions. There are &#x3b2;-trefoil proteins having relative insertions of <italic>n</italic> &#x3d; &#x2b;1 (1UPS), <italic>n</italic> &#x3d; &#x2b;5 (3PG0), <italic>n</italic> &#x3d; &#x2b;6 (4IY9), <italic>n</italic> &#x3d; &#x2b;7 (5XG5), and <italic>n</italic> &#x3d; &#x2b;8 (1T9F) amino acids, relative to the Symfoil (i.e., 3O4B) reference structure. Additionally, similar examples exist having no relative insertions (i.e., <italic>n</italic> &#x3d; 0; 1Q1U/1IHK/2FDB/1IJT/1BFG/1RG8) as well as <italic>n</italic> &#x3d; &#x2212;1 deletions (1WD3) (<xref ref-type="fig" rid="F7">Figure 7</xref>). The most parsimonious explanation for such structural conservation of these symmetry-related turns is for duplication/fusion events to occur subsequent to trefoil motif structural evolution. This implies the likelihood of multiple independent instances of the evolution of &#x3b2;-trefoil proteins from simpler (i.e., trefoil-fold) motifs, and supports the evolutionary hypothesis put forth by Meiering (<xref ref-type="bibr" rid="B12">Broom et al., 2012</xref>) that the emergence of &#x3b2;-trefoil proteins is a recurring and ongoing evolutionary mechanism.</p>
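<p>The symmetry test implied here, namely whether all members of a symmetry-related turn set share the same insertion or deletion relative to the reference, can be expressed as a short Python check. This is a hedged sketch only: the function name, the dictionary layout, and the toy profile are hypothetical, and the values are invented rather than measured from any of the structures listed above.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Symmetry-related turn sets discussed in the text.
SYMMETRY_SETS = {
    "T2/T6/T10": ("T2", "T6", "T10"),
    "T4/T8": ("T4", "T8"),
}


def symmetric_turn_sets(relative_lengths):
    """Return {set name: shared value} for each set whose turns all have
    the same length relative to the reference (negative = deletion)."""
    shared = {}
    for name, turns in SYMMETRY_SETS.items():
        values = {relative_lengths[t] for t in turns}
        if len(values) == 1:
            shared[name] = values.pop()
    return shared


# Invented profile: +5 residues at each of T2, T6, T10; T4 and T8 differ.
toy_profile = {"T2": 5, "T6": 5, "T10": 5, "T4": -3, "T8": 0}
print(symmetric_turn_sets(toy_profile))  # {'T2/T6/T10': 5}
```
]]></preformat>
<p>A protein for which the T2/T6/T10 entry is reported carries the same length change at all three positions, which is the pattern interpreted in the text as evidence that the turn diverged before the duplication/fusion events.</p>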
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Examples of &#x3b2;-trefoil proteins having distinct C<sub>3</sub> symmetry at the T2/T6/T10 turn region. The turn length in reference to the Symfoil protein is &#x2212;1 (1WD3), 0 (2FDB), &#x2b;1 (1UPS), &#x2b;5 (3PG0), &#x2b;6 (4IY9), and &#x2b;8 (1T9F). The view is down the C<sub>3</sub> axis of rotational symmetry. Such symmetric relationships in turn structure suggest that divergence of this turn structure occurred prior to the duplication/fusion/truncation events leading to the extant &#x3b2;-trefoil architecture.</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g007.tif"/>
</fig>
<p>In the simplest example of duplication and fusion of individual trefoil-motifs leading ultimately to formation of a &#x3b2;-trefoil protein, the junction of gene fusion is the T4 turn region (<xref ref-type="bibr" rid="B47">Ponting and Russell, 2000</xref>; <xref ref-type="bibr" rid="B31">Lee and Blaber, 2011</xref>; <xref ref-type="bibr" rid="B32">Lee et al., 2011</xref>). Thus, the &#x3b2;-trefoil architecture contains two symmetry-related turns T4 and T8, with the &#x201c;third&#x201d; member of this symmetrically-related set defined by the adjacent (but discontinuous) N- and C-termini (see <xref ref-type="fig" rid="F1">Figure 1</xref>). As with the T2/T6/T10 turns, a number of &#x3b2;-trefoil proteins exhibit a unique structural symmetry when comparing the T4 and T8 turns (e.g., 1FWV, 1ABR, 1KNL, 2ZQO, 1SZ6, 1XYF, 1XEZ; as described above). This implies that this turn formed prior to the duplication and fusion event that yielded the mature &#x3b2;-trefoil architecture. However, this results in a structural conundrum. The existence of a T4 region results from the fusion of two trefoil motifs. Two such turns (i.e., T4 and T8) would be generated by a subsequent tandem duplication of such a construct; however, this would yield a total of four sequential trefoil motifs. The apparent solution to the presence of an &#x201c;extra&#x201d; trefoil motif is for the latter fusion to include a truncation event affecting one trefoil motif (<xref ref-type="bibr" rid="B28">Jeltsch, 1999</xref>; <xref ref-type="bibr" rid="B46">Peisajovich et al., 2006</xref>; <xref ref-type="bibr" rid="B34">Longo et al., 2013</xref>).</p>
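<p>The motif bookkeeping in this argument can be followed with a small Python sketch that treats a construct as an ordered list of trefoil motifs and counts the inter-motif junctions (the fused turns). It is a toy model under stated assumptions: the function names are hypothetical, and truncation is arbitrarily applied to the C-terminal motif because the text does not specify which motif is lost.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def fuse(first, second):
    """Join two constructs end to end (gene fusion)."""
    return list(first) + list(second)


def tandem_duplicate(construct):
    """Duplicate a construct in tandem."""
    return list(construct) + list(construct)


def truncate(construct, n_motifs=1):
    """Remove n_motifs from the C-terminal end (an assumed choice of end)."""
    if n_motifs < 1 or n_motifs >= len(construct):
        raise ValueError("must remove at least one motif and keep at least one")
    return list(construct)[:-n_motifs]


def junction_count(construct):
    """Number of fused turns between consecutive motifs."""
    return len(construct) - 1


motif = ["trefoil"]
dimer = fuse(motif, motif)          # 2 motifs, 1 junction (the T4-like turn)
tetramer = tandem_duplicate(dimer)  # 4 motifs, 3 junctions
trimer = truncate(tetramer)         # 3 motifs, 2 junctions (T4 and T8)

print(len(dimer), len(tetramer), len(trimer))  # 2 4 3
print(junction_count(trimer))                  # 2
```
]]></preformat>
<p>The final construct has three motifs and two fused turns, consistent with a &#x3b2;-trefoil that contains T4 and T8 while the third symmetry-related position is occupied by the N- and C-termini.</p>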
</sec>
<sec id="s4-4">
<title>Turns and the Folding Nucleus</title>
<p>In addition to providing a potential functional role, turns also serve to connect adjacent &#x3b2;-strand secondary structure (forming a &#x3b2;-hairpin), minimizing the entropic penalty of association, and thereby influencing stability and folding (<xref ref-type="bibr" rid="B40">Nagi et al., 1999</xref>; <xref ref-type="bibr" rid="B57">Thompson and Eisenberg, 1999</xref>; <xref ref-type="bibr" rid="B33">Lindberg et al., 2006</xref>). The reaction coordinate of cooperative protein folding typically describes a highly-polarized transition state or folding nucleus (<xref ref-type="bibr" rid="B1">Abkevich et al., 1994</xref>; <xref ref-type="bibr" rid="B61">Went and Jackson, 2005</xref>; <xref ref-type="bibr" rid="B21">Fa&#xed;sca, 2009</xref>). Establishment of this folding nucleus is the rate-limiting step in folding; once formed, the nucleus serves to rapidly condense formation of the overall native structure. An isolated 42-mer trefoil motif (i.e., &#x201c;Monofoil&#x201d;) derived from the Symfoil protein spontaneously oligomerizes to yield an intact &#x3b2;-trefoil architecture (<xref ref-type="bibr" rid="B31">Lee and Blaber, 2011</xref>; <xref ref-type="bibr" rid="B32">Lee et al., 2011</xref>); thus, a serviceable folding nucleus resides within each repeating motif in the Symfoil protein (<xref ref-type="bibr" rid="B6">Blaber, 2020</xref>; <xref ref-type="bibr" rid="B45">Parker et al., 2021</xref>). However, phi-value analysis (<xref ref-type="bibr" rid="B23">Fersht and Sato, 2004</xref>) indicates that the effective folding nuclei in the Symfoil protein and in the related fibroblast growth factor-1 &#x3b2;-trefoil protein, while not identical, are both centrally-located and more expansive than an individual trefoil motif (<xref ref-type="bibr" rid="B35">Longo et al., 2014</xref>; <xref ref-type="bibr" rid="B63">Xia et al., 2016</xref>). 
This more expansive central definition includes turns T4 and T8, which are novel turn structures generated by the fusion of trefoil motif repeats. These novel turns are postulated to promote local &#x3b2;-hairpin interactions, thereby generating a more efficient folding nucleus compared to an isolated trefoil motif. However, destabilizing mutations targeting the folding nucleus region of Symfoil indicate that the C<sub>3</sub> symmetry provides for alternative folding nuclei in other regions of the protein able to salvage foldability (<xref ref-type="bibr" rid="B34">Longo et al., 2013</xref>; <xref ref-type="bibr" rid="B55">Tenorio et al., 2020</xref>). The survey of turn region lengths in the &#x3b2;-trefoil superfamily indicates that the central region comprises turns having generally the shortest lengths (<xref ref-type="fig" rid="F3">Figure 3</xref>). Thus, central turns may be somewhat &#x201c;privileged&#x201d; regions of secondary structure where considerations of efficient folding nucleus formation impact the optimal turn length and sequence design.</p>
</sec>
<sec id="s4-5">
<title>Implications and Suitability of &#x3b2;-Trefoil Proteins for <italic>de novo</italic> Design</title>
<p>The secondary structure elements of the fundamental &#x3b2;-trefoil are limited to &#x3b2;-strand and reverse turn, and thus describe a comparatively simple protein architecture. Knowledge essential for the <italic>de novo</italic> design of &#x3b2;-trefoil proteins is extensive: 1) The &#x3b2;-strand secondary structure is the key determinant of the conserved basic architecture for this protein superfamily; 2) Conserved &#x3b2;-strand characteristics have been elucidated as regards length and hydrophobic patterning; and 3) The role of &#x3b2;-strand hydrophobic residues in cooperative core-packing interactions has been well-characterized. In this regard, it is interesting to note the different independent solutions for the set of hydrophobic core-packing residues (referencing <xref ref-type="fig" rid="F6">Figure 6</xref>) utilized by the <italic>de novo</italic> designed symmetric &#x3b2;-trefoil proteins Symfoil [3O4B; generated through top-down symmetric deconstruction of FGF-1 (<xref ref-type="bibr" rid="B31">Lee and Blaber, 2011</xref>; <xref ref-type="bibr" rid="B32">Lee et al., 2011</xref>)], Phifoil [4OW4; generated by folding nucleus symmetric expansion of FGF-1 (<xref ref-type="bibr" rid="B35">Longo et al., 2014</xref>)], Threefoil [3PG0; generated by consensus sequence of a carbohydrate-binding ricin sequence (<xref ref-type="bibr" rid="B12">Broom et al., 2012</xref>)], and Mitsuba-1 [5XG5; generated by computational sequence constraint of the shellfish lectin MytiLec-1 (<xref ref-type="bibr" rid="B56">Terada et al., 2017</xref>)]. The sequence logo plot for this set of core-packing residues (<xref ref-type="fig" rid="F8">Figure 8</xref>) suggests that, as long as the appropriate hydrophobic patterning and compatible van der Waals interactions are satisfied, a variety of alternative core-packing arrangements are permissible, thereby indicating a lowered threshold for successful design.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Sequence logo plot of the set of symmetric core-packing residues (see <xref ref-type="fig" rid="F6">Figure 6</xref>) present in <italic>de novo</italic> designed symmetric &#x3b2;-trefoil proteins Symfoil-4T (3O4B), Phifoil (4OW4), Threefoil (3PG0), and Mitsuba-1 (5XG5). The positions within a single trefoil motif are shown, but these are replicated exactly for the other two trefoil motifs in each protein. Position &#x23;4 in S1, and position &#x23;3 in S2, have the highest neighbor contacts among the set of core residues (<xref ref-type="bibr" rid="B7">Blaber, 2021</xref>).</p>
</caption>
<graphic xlink:href="fmolb-09-889943-g008.tif"/>
</fig>
<p>The general attributes of the folding nucleus for Symfoil have been identified, and the potential for redundant folding nuclei demonstrated. Evolutionary considerations indicate highly-permissive design pathways of foldability involving diverse fusion/truncation of trefoil motifs. Turn regions have been identified as the key regions of structural variability, and are the principal determinants of ligand functionality characteristic of this superfamily. As connectors of adjacent &#x3b2;-strand secondary structure, turn regions also influence the entropic penalty for the assembly of local &#x3b2;-hairpin structure, and this plays an important role in the formation of the folding nucleus.</p>
<p>Protein design must simultaneously solve at least three different problems: 1) protein foldability (i.e., folding kinetics requirements), 2) protein stability (i.e., thermodynamic requirements), and 3) the accommodation of specific function (with potential structural dynamics requirements). Analysis of the &#x3b2;-trefoil architecture suggests that it is readily amenable to a two-step design process, with the initial step focusing upon the design of a foldable, stable &#x201c;scaffold&#x201d; (and many avenues appear possible), followed by a second step of functional mutation. The present analysis indicates that the first step involves &#x3b2;-strand secondary structure and key hydrophobic patterning design (building upon current extensive knowledge in this area). The C<sub>3</sub> symmetry substantially reduces the combinatorial search of appropriate primary structure solutions. The second step focuses upon turn/loop regions and their mutation to generate desired functionality (the &#x3b2;-trefoil architecture being perhaps best suited to ligand functionality). This second step is less well characterized and therefore open to expansive and novel opportunities. The C<sub>3</sub> symmetry provides for monovalent or multivalent ligand binding opportunities. In an alternative approach, if specific loop regions are associated with unique functional properties, and the &#x3b2;-strands serve as structural elements, then diverse chimeras with novel combined structure/function attributes might be constructed using computational approaches (<xref ref-type="bibr" rid="B22">Ferruz et al., 2021</xref>). Overall, the &#x3b2;-trefoil architecture has many attractive features for <italic>de novo</italic> protein design, applied especially to ligand functionality. 
The adoption of heparin-binding functionality into a benign &#x3b2;-trefoil scaffold using the principles described herein has recently been demonstrated (<xref ref-type="bibr" rid="B54">Tenorio et al., Forthcoming 2022</xref>).</p>
</sec>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>MB is responsible for planning, data analysis, and writing of this report.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This work was supported by research grant RF02551 from Trefoil Therapeutics Inc. Additional support was provided by the Department of Biomedical Sciences, FSU College of Medicine.</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>MB is a cofounder and has equity ownership in Trefoil Therapeutics Inc.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>The work is dedicated to Brian Matthews on the occasion of his retirement. A truly outstanding scientist and mentor who led naturally and effectively by example.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abkevich</surname>
<given-names>V. I.</given-names>
</name>
<name>
<surname>Gutin</surname>
<given-names>A. M.</given-names>
</name>
<name>
<surname>Shakhnovich</surname>
<given-names>E. I.</given-names>
</name>
</person-group> (<year>1994</year>). <article-title>Specific Nucleus as the Transition State for Protein Folding: Evidence from the Lattice Model</article-title>. <source>Biochemistry</source> <volume>33</volume> (<issue>33</issue>), <fpage>10026</fpage>&#x2013;<lpage>10036</lpage>. <pub-id pub-id-type="doi">10.1021/bi00199a029</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Afanasieva</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Chaudhuri</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Hertle</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Ursinus</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Alva</surname>
<given-names>V.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Structural Diversity of Oligomeric &#x3b2;-propellers with Different Numbers of Identical Blades</article-title>. <source>eLife</source> <volume>8</volume>, <fpage>e49853</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.49853</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andreeva</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Howorth</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Chothia</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Kulesha</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Murzin</surname>
<given-names>A. G.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>SCOP2 Prototype: a New Approach to Protein Structure Mining</article-title>. <source>Nucleic Acids Res.</source> <volume>42</volume> (<issue>D1</issue>), <fpage>D310</fpage>&#x2013;<lpage>D314</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkt1242</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andreeva</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kulesha</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Gough</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Murzin</surname>
<given-names>A. G.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>The SCOP Database in 2020: Expanded Classification of Representative Family and Superfamily Domains of Known Protein Structures</article-title>. <source>Nucleic Acids Res.</source> <volume>48</volume> (<issue>D1</issue>), <fpage>D376</fpage>&#x2013;<lpage>D382</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkz1064</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Balaji</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Internal Symmetry in Protein Structures: Prevalence, Functional Relevance and Evolution</article-title>. <source>Curr. Opin. Struct. Biol.</source> <volume>32</volume>, <fpage>156</fpage>&#x2013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1016/j.sbi.2015.05.004</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Conserved Buried Water Molecules Enable the &#x3b2;&#x2010;trefoil Architecture</article-title>. <source>Protein Sci.</source> <volume>29</volume> (<issue>8</issue>), <fpage>1794</fpage>&#x2013;<lpage>1802</lpage>. <pub-id pub-id-type="doi">10.1002/pro.3899</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Cooperative Hydrophobic Core Interactions in the &#x3b2;&#x2010;trefoil Architecture</article-title>. <source>Protein Sci.</source> <volume>30</volume> (<issue>5</issue>), <fpage>956</fpage>&#x2013;<lpage>965</lpage>. <pub-id pub-id-type="doi">10.1002/pro.4059</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Designing Proteins from Simple Motifs: Opportunities in Top-Down Symmetric Deconstruction</article-title>. <source>Curr. Opin. Struct. Biol.</source> <volume>22</volume> (<issue>4</issue>), <fpage>442</fpage>&#x2013;<lpage>450</lpage>. <pub-id pub-id-type="doi">10.1016/j.sbi.2012.05.008</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blow</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Janin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sweet</surname>
<given-names>R. M.</given-names>
</name>
</person-group> (<year>1974</year>). <article-title>Mode of Action of Soybean Trypsin Inhibitor (Kunitz) as a Model for Specific Protein-Protein Interactions</article-title>. <source>Nature</source> <volume>249</volume>, <fpage>54</fpage>&#x2013;<lpage>57</lpage>. <pub-id pub-id-type="doi">10.1038/249054a0</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bolon</surname>
<given-names>D. N.</given-names>
</name>
<name>
<surname>Voigt</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Mayo</surname>
<given-names>S. L.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>De Novo design of Biocatalysts</article-title>. <source>Curr. Opin. Chem. Biol.</source> <volume>6</volume> (<issue>2</issue>), <fpage>125</fpage>&#x2013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.1016/S1367-5931(02)00303-4</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bovi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Cenci</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Perduca</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Capaldi</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Carrizo</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Civiero</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2012</year>). <article-title>BEL &#x3b2;-trefoil: A Novel Lectin with Antineoplastic Properties in king Bolete (Boletus Edulis) Mushrooms</article-title>. <source>Glycobiology</source> <volume>23</volume> (<issue>5</issue>), <fpage>578</fpage>&#x2013;<lpage>592</lpage>. <pub-id pub-id-type="doi">10.1093/glycob/cws164</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Broom</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Doxey</surname>
<given-names>A. C.</given-names>
</name>
<name>
<surname>Lobsanov</surname>
<given-names>Y. D.</given-names>
</name>
<name>
<surname>Berthin</surname>
<given-names>L. G.</given-names>
</name>
<name>
<surname>Rose</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Howell</surname>
<given-names>P. L.</given-names>
</name>
<etal/>
</person-group> (<year>2012</year>). <article-title>Modular Evolution and the Origins of Symmetry: Reconstruction of a Three-fold Symmetric Globular Protein</article-title>. <source>Structure</source> <volume>20</volume>, <fpage>161</fpage>&#x2013;<lpage>171</lpage>. <pub-id pub-id-type="doi">10.1016/j.str.2011.10.021</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Broom</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Rafalia</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Trainor</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Col&#xf3;n</surname>
<given-names>W.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Designed Protein Reveals Structural Determinants of Extreme Kinetic Stability</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>112</volume>, <fpage>14605</fpage>&#x2013;<lpage>14610</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1510748112</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brunette</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Parmeggiani</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>P.-S.</given-names>
</name>
<name>
<surname>Bhabha</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Ekiert</surname>
<given-names>D. C.</given-names>
</name>
<name>
<surname>Tsutakawa</surname>
<given-names>S. E.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Exploring the Repeat Protein Universe through Computational Protein Design</article-title>. <source>Nature</source> <volume>528</volume>, <fpage>580</fpage>&#x2013;<lpage>584</lpage>. <pub-id pub-id-type="doi">10.1038/nature16162</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brych</surname>
<given-names>S. R.</given-names>
</name>
<name>
<surname>Dubey</surname>
<given-names>V. K.</given-names>
</name>
<name>
<surname>Bienkiewicz</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Logan</surname>
<given-names>T. M.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Symmetric Primary and Tertiary Structure Mutations within a Symmetric Superfold: a Solution, Not a Constraint, to Achieve a Foldable Polypeptide</article-title>. <source>J. Mol. Biol.</source> <volume>344</volume> (<issue>3</issue>), <fpage>769</fpage>&#x2013;<lpage>780</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2004.09.060</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chiapparino</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Grbavac</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Jonker</surname>
<given-names>H. R.</given-names>
</name>
<name>
<surname>Hackmann</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Mortensen</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zatorska</surname>
<given-names>E.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Functional Implications of MIR Domains in Protein O-Mannosylation</article-title>. <source>eLife</source> <volume>9</volume>, <fpage>e61189</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.61189</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Claren</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Malisi</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>H&#xf6;cker</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Sterner</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Establishing Wild-type Levels of Catalytic Activity on Natural and Artificial (&#x3b2;&#x3b1;)<sub>8</sub>-barrel Protein Scaffolds</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>106</volume> (<issue>10</issue>), <fpage>3704</fpage>&#x2013;<lpage>3709</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0810342106</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crooks</surname>
<given-names>G. E.</given-names>
</name>
<name>
<surname>Wolfe</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Brenner</surname>
<given-names>S. E.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Measurements of Protein Sequence-Structure Correlations</article-title>. <source>Proteins</source> <volume>57</volume> (<issue>4</issue>), <fpage>804</fpage>&#x2013;<lpage>810</lpage>. <pub-id pub-id-type="doi">10.1002/prot.20262</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dwyer</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Looger</surname>
<given-names>L. L.</given-names>
</name>
<name>
<surname>Hellinga</surname>
<given-names>H. W.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Computational Design of a Biologically Active Enzyme</article-title>. <source>Science</source> <volume>304</volume> (<issue>5679</issue>), <fpage>1967</fpage>&#x2013;<lpage>1971</lpage>. <pub-id pub-id-type="doi">10.1126/science.1098432</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eck</surname>
<given-names>R. V.</given-names>
</name>
<name>
<surname>Dayhoff</surname>
<given-names>M. O.</given-names>
</name>
</person-group> (<year>1966</year>). <article-title>Evolution of the Structure of Ferredoxin Based on Living Relics of Primitive Amino Acid Sequences</article-title>. <source>Science</source> <volume>152</volume> (<issue>3720</issue>), <fpage>363</fpage>&#x2013;<lpage>366</lpage>. <pub-id pub-id-type="doi">10.1126/science.152.3720.363</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fa&#xed;sca</surname>
<given-names>P. F. N.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>The Nucleation Mechanism of Protein Folding: a Survey of Computer Simulation Studies</article-title>. <source>J. Phys. Condens. Matter</source> <volume>21</volume> (<issue>37</issue>), <fpage>373102</fpage>. <pub-id pub-id-type="doi">10.1088/0953-8984/21/37/373102</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ferruz</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Noske</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>H&#xf6;cker</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Protlego: A Python Package for the Analysis and Design of Chimeric Proteins</article-title>. <source>Bioinformatics</source> <volume>37</volume> (<issue>19</issue>), <fpage>3182</fpage>&#x2013;<lpage>3189</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btab253</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fersht</surname>
<given-names>A. R.</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>&#x3a6;-Value Analysis and the Nature of Protein-Folding Transition States</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>101</volume>, <fpage>7976</fpage>&#x2013;<lpage>7981</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0402684101</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friedmann</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>J. J.</given-names>
</name>
<name>
<surname>Kovall</surname>
<given-names>R. A.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>RAM-induced Allostery Facilitates Assembly of a Notch Pathway Active Transcription Complex</article-title>. <source>J. Biol. Chem.</source> <volume>283</volume> (<issue>21</issue>), <fpage>14781</fpage>&#x2013;<lpage>14791</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M709501200</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guex</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Peitsch</surname>
<given-names>M. C.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>SWISS-MODEL and the Swiss-Pdb Viewer: An Environment for Comparative Protein Modeling</article-title>. <source>Electrophoresis</source> <volume>18</volume>, <fpage>2714</fpage>&#x2013;<lpage>2723</lpage>. <pub-id pub-id-type="doi">10.1002/elps.1150181505</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>H&#xf6;cker</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Claren</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sterner</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Mimicking Enzyme Evolution by Generating New (&#x3b2;&#x3b1;)<sub>8</sub>-barrels from (&#x3b2;&#x3b1;)<sub>4</sub>-Half-Barrels</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>101</volume>, <fpage>16448</fpage>&#x2013;<lpage>16453</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0405832101</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>P.-S.</given-names>
</name>
<name>
<surname>Feldmeier</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Parmeggiani</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Fernandez Velasco</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>H&#xf6;cker</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>De Novo Design of a Four-fold Symmetric TIM-Barrel Protein with Atomic-Level Accuracy</article-title>. <source>Nat. Chem. Biol.</source> <volume>12</volume> (<issue>1</issue>), <fpage>29</fpage>&#x2013;<lpage>34</lpage>. <pub-id pub-id-type="doi">10.1038/nchembio.1966</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jeltsch</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Circular Permutations in the Molecular Evolution of DNA Methyltransferases</article-title>. <source>J. Mol. Evol.</source> <volume>49</volume>, <fpage>161</fpage>&#x2013;<lpage>164</lpage>. <pub-id pub-id-type="doi">10.1007/pl00006529</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kimura</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Aumpuchin</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Hamaue</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Shimomura</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kikuchi</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Analyses of the Folding Sites of Irregular &#x3b2;-trefoil Fold Proteins through Sequence-Based Techniques and G&#x14d;-Model Simulations</article-title>. <source>BMC Mol. Cell Biol.</source> <volume>21</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1186/s12860-020-00271-4</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kopec</surname>
<given-names>K. O.</given-names>
</name>
<name>
<surname>Lupas</surname>
<given-names>A. N.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>&#x3b2;-Propeller Blades as Ancestral Peptides in Protein Evolution</article-title>. <source>PLoS ONE</source> <volume>8</volume> (<issue>10</issue>), <fpage>e77074</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0077074</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Experimental Support for the Evolution of Symmetric Protein Architecture from a Simple Peptide Motif</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>108</volume>, <fpage>126</fpage>&#x2013;<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1015032108</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>S. I.</given-names>
</name>
<name>
<surname>Dubey</surname>
<given-names>V. K.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>A Polypeptide "Building Block" for the &#x3b2;-Trefoil Fold Identified by "Top-Down Symmetric Deconstruction"</article-title>. <source>J. Mol. Biol.</source> <volume>407</volume>, <fpage>744</fpage>&#x2013;<lpage>763</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2011.02.002</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lindberg</surname>
<given-names>M. O.</given-names>
</name>
<name>
<surname>Haglund</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Hubner</surname>
<given-names>I. A.</given-names>
</name>
<name>
<surname>Shakhnovich</surname>
<given-names>E. I.</given-names>
</name>
<name>
<surname>Oliveberg</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Identification of the Minimal Protein-Folding Nucleus through Loop-Entropy Perturbations</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>103</volume> (<issue>11</issue>), <fpage>4083</fpage>&#x2013;<lpage>4088</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0508863103</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Longo</surname>
<given-names>L. M.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Tenorio</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Alternative Folding Nuclei Definitions Facilitate the Evolution of a Symmetric Protein Fold from a Smaller Peptide Motif</article-title>. <source>Structure</source> <volume>21</volume>, <fpage>2042</fpage>&#x2013;<lpage>2050</lpage>. <pub-id pub-id-type="doi">10.1016/j.str.2013.09.003</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Longo</surname>
<given-names>L. M.</given-names>
</name>
<name>
<surname>Kumru</surname>
<given-names>O. S.</given-names>
</name>
<name>
<surname>Middaugh</surname>
<given-names>C. R.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Evolution and Design of Protein Structure by Folding Nucleus Symmetric Expansion</article-title>. <source>Structure</source> <volume>22</volume>, <fpage>1377</fpage>&#x2013;<lpage>1384</lpage>. <pub-id pub-id-type="doi">10.1016/j.str.2014.08.008</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McLachlan</surname>
<given-names>A. D.</given-names>
</name>
</person-group> (<year>1972</year>). <article-title>Repeating Sequences and Gene Duplication in Proteins</article-title>. <source>J. Mol. Biol.</source> <volume>64</volume> (<issue>2</issue>), <fpage>417</fpage>&#x2013;<lpage>437</lpage>. <pub-id pub-id-type="doi">10.1016/0022-2836(72)90508-6</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McLachlan</surname>
<given-names>A. D.</given-names>
</name>
</person-group> (<year>1979</year>). <article-title>Three-fold Structural Pattern in the Soybean Trypsin Inhibitor (Kunitz)</article-title>. <source>J. Mol. Biol.</source> <volume>133</volume>, <fpage>557</fpage>&#x2013;<lpage>563</lpage>. <pub-id pub-id-type="doi">10.1016/0022-2836(79)90408-x</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mukhopadhyay</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>The Molecular Evolutionary History of a Winged Bean &#x3b1;-Chymotrypsin Inhibitor and Modeling of its Mutations through Structural Analyses</article-title>. <source>J. Mol. Evol.</source> <volume>50</volume>, <fpage>214</fpage>&#x2013;<lpage>223</lpage>. <pub-id pub-id-type="doi">10.1007/s002399910024</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Murzin</surname>
<given-names>A. G.</given-names>
</name>
<name>
<surname>Lesk</surname>
<given-names>A. M.</given-names>
</name>
<name>
<surname>Chothia</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1992</year>). <article-title>&#x3b2;-Trefoil Fold. Patterns of Structure and Sequence in the Kunitz Inhibitors Interleukins-1&#x3b2; and 1&#x3b1; and Fibroblast Growth Factors</article-title>. <source>J. Mol. Biol.</source> <volume>223</volume>, <fpage>531</fpage>&#x2013;<lpage>543</lpage>. <pub-id pub-id-type="doi">10.1016/0022-2836(92)90668-a</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nagi</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>K. S.</given-names>
</name>
<name>
<surname>Regan</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Using Loop Length Variants to Dissect the Folding Pathway of a Four-helix-bundle Protein</article-title>. <source>J. Mol. Biol.</source> <volume>286</volume> (<issue>1</issue>), <fpage>257</fpage>&#x2013;<lpage>265</lpage>. <pub-id pub-id-type="doi">10.1006/jmbi.1998.2474</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nikkhah</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jawad-Alami</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Demydchuk</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ribbons</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Paoli</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Engineering of &#x3b2;-propeller Protein Scaffolds by Multiple Gene Duplication and Fusion of an Idealized WD Repeat</article-title>. <source>Biomol. Eng.</source> <volume>23</volume>, <fpage>185</fpage>&#x2013;<lpage>194</lpage>. <pub-id pub-id-type="doi">10.1016/j.bioeng.2006.02.002</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Notenboom</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Boraston</surname>
<given-names>A. B.</given-names>
</name>
<name>
<surname>Williams</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Kilburn</surname>
<given-names>D. G.</given-names>
</name>
<name>
<surname>Rose</surname>
<given-names>D. R.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>High-Resolution Crystal Structures of the Lectin-like Xylan Binding Domain from Streptomyces Lividans Xylanase 10A with Bound Substrates Reveal a Novel Mode of Xylan Binding</article-title>. <source>Biochemistry</source> <volume>41</volume> (<issue>13</issue>), <fpage>4246</fpage>&#x2013;<lpage>4254</lpage>. <pub-id pub-id-type="doi">10.1021/bi015865j</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ohno</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>1970</year>). <source>Evolution by Gene Duplication</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Allen &#x26; Unwin</publisher-name>. </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Olsen</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Ibrahimi</surname>
<given-names>O. A.</given-names>
</name>
<name>
<surname>Raucci</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Eliseenkova</surname>
<given-names>A. V.</given-names>
</name>
<name>
<surname>Yayon</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2004</year>). <article-title>Insights into the Molecular Basis for Fibroblast Growth Factor Receptor Autoinhibition and Ligand-Binding Promiscuity</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>101</volume>, <fpage>935</fpage>&#x2013;<lpage>940</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0307287101</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Parker</surname>
<given-names>J. B.</given-names>
</name>
<name>
<surname>Tenorio</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>The Ubiquitous Buried Water in the Beta-Trefoil Architecture Contributes to the Folding Nucleus and &#x223c;20% of the Folding Enthalpy</article-title>. <source>Protein Sci.</source> <volume>30</volume> (<issue>11</issue>), <fpage>2287</fpage>&#x2013;<lpage>2297</lpage>. <pub-id pub-id-type="doi">10.1002/pro.4192</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peisajovich</surname>
<given-names>S. G.</given-names>
</name>
<name>
<surname>Rockah</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Tawfik</surname>
<given-names>D. S.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Evolution of New Protein Topologies through Multistep Gene Rearrangements</article-title>. <source>Nat. Genet.</source> <volume>38</volume>, <fpage>168</fpage>&#x2013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1038/ng1717</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ponting</surname>
<given-names>C. P.</given-names>
</name>
<name>
<surname>Russell</surname>
<given-names>R. B.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Identification of Distant Homologues of Fibroblast Growth Factors Suggests a Common Ancestor for All &#x3b2;-trefoil Proteins</article-title>. <source>J. Mol. Biol.</source> <volume>302</volume>, <fpage>1041</fpage>&#x2013;<lpage>1047</lpage>. <pub-id pub-id-type="doi">10.1006/jmbi.2000.4087</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Renko</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Saboti&#x10d;</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Miheli&#x10d;</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Brzin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kos</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Turk</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Versatile Loops in Mycocypins Inhibit Three Protease Families</article-title>. <source>J. Biol. Chem.</source> <volume>285</volume> (<issue>1</issue>), <fpage>308</fpage>&#x2013;<lpage>316</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M109.043331</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Richter</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bosnali</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Carstensen</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Seitz</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Durchschlag</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Blanquart</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2010</year>). <article-title>Computational and Experimental Evidence for the Evolution of a (&#x3b2;&#x3b1;)<sub>8</sub>-Barrel Protein from an Ancestral Quarter-Barrel Stabilised by Disulfide Bonds</article-title>. <source>J. Mol. Biol.</source> <volume>398</volume>, <fpage>763</fpage>&#x2013;<lpage>773</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2010.03.057</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schneider</surname>
<given-names>T. D.</given-names>
</name>
<name>
<surname>Stephens</surname>
<given-names>R. M.</given-names>
</name>
</person-group> (<year>1990</year>). <article-title>Sequence Logos: a New Way to Display Consensus Sequences</article-title>. <source>Nucl. Acids Res.</source> <volume>18</volume> (<issue>20</issue>), <fpage>6097</fpage>&#x2013;<lpage>6100</lpage>. <pub-id pub-id-type="doi">10.1093/nar/18.20.6097</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>H. K.</given-names>
</name>
<name>
<surname>Suh</surname>
<given-names>S. W.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Kunitz-type Soybean Trypsin Inhibitor Revisited: Refined Structure of its Complex with Porcine Trypsin Reveals an Insight into the Interaction between a Homologous Inhibitor from Erythrina Caffra and Tissue-type Plasminogen Activator</article-title>. <source>J. Mol. Biol.</source> <volume>275</volume>, <fpage>347</fpage>&#x2013;<lpage>363</lpage>. <pub-id pub-id-type="doi">10.1006/jmbi.1997.1469</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Suzuki</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Kuno</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Hasegawa</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Hirabayashi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kasai</surname>
<given-names>K.-i.</given-names>
</name>
<name>
<surname>Momma</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2009</year>). <article-title>Sugar-complex Structures of the C-Half Domain of the Galactose-Binding Lectin EW29 from the Earthworm Lumbricus Terrestris</article-title>. <source>Acta Crystallogr. D Biol. Cryst.</source> <volume>65</volume>, <fpage>49</fpage>&#x2013;<lpage>57</lpage>. <pub-id pub-id-type="doi">10.1107/S0907444908037451</pub-id> </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sweet</surname>
<given-names>R. M.</given-names>
</name>
<name>
<surname>Wright</surname>
<given-names>H. T.</given-names>
</name>
<name>
<surname>Janin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chothia</surname>
<given-names>C. H.</given-names>
</name>
<name>
<surname>Blow</surname>
<given-names>D. M.</given-names>
</name>
</person-group> (<year>1974</year>). <article-title>Crystal Structure of the Complex of Porcine Trypsin with Soybean Trypsin Inhibitor (Kunitz) at 2.6 Angstrom Resolution</article-title>. <source>Biochemistry</source> <volume>13</volume>, <fpage>4212</fpage>&#x2013;<lpage>4228</lpage>. <pub-id pub-id-type="doi">10.1021/bi00717a024</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tenorio</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Parker</surname>
<given-names>J. B.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>Forthcoming 2022</year>). <article-title>Functionalization of a Symmetric Protein Scaffold: Redundant Folding Nuclei and Alternative Oligomeric Folding Pathways</article-title>. <source>Protein Sci.</source> <volume>31</volume>. <pub-id pub-id-type="doi">10.1002/pro.4301</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tenorio</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Parker</surname>
<given-names>J. B.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Oligomerization of a Symmetric &#x3b2;&#x2010;trefoil Protein in Response to Folding Nucleus Perturbation</article-title>. <source>Protein Sci.</source> <volume>29</volume> (<issue>7</issue>), <fpage>1629</fpage>&#x2013;<lpage>1640</lpage>. <pub-id pub-id-type="doi">10.1002/pro.3877</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Terada</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Voet</surname>
<given-names>A. R. D.</given-names>
</name>
<name>
<surname>Noguchi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Kamata</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Ohki</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Addy</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>Computational Design of a Symmetrical &#x3b2;-trefoil Lectin with Cancer Cell Binding Activity</article-title>. <source>Sci. Rep.</source> <volume>7</volume> (<issue>1</issue>), <fpage>5943</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-017-06332-7</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thompson</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Eisenberg</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Transproteomic Evidence of a Loop-Deletion Mechanism for Enhancing Protein Thermostability</article-title>. <source>J. Mol. Biol.</source> <volume>290</volume>, <fpage>595</fpage>&#x2013;<lpage>604</lpage>. <pub-id pub-id-type="doi">10.1006/jmbi.1999.2889</pub-id> </citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Transue</surname>
<given-names>T. R.</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Mo</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Goldstein</surname>
<given-names>I. J.</given-names>
</name>
<name>
<surname>Saper</surname>
<given-names>M. A.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Structure of Benzyl T-Antigen Disaccharide Bound to Amaranthus Caudatus Agglutinin</article-title>. <source>Nat. Struct. Mol. Biol.</source> <volume>4</volume> (<issue>10</issue>), <fpage>779</fpage>&#x2013;<lpage>783</lpage>. <pub-id pub-id-type="doi">10.1038/nsb1097-779</pub-id> </citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Veerapandian</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Gilliland</surname>
<given-names>G. L.</given-names>
</name>
<name>
<surname>Raag</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Svensson</surname>
<given-names>A. L.</given-names>
</name>
<name>
<surname>Masui</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Hirai</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>1992</year>). <article-title>Functional Implications of Interleukin-1&#x3b2; Based on the Three-Dimensional Structure</article-title>. <source>Proteins</source> <volume>12</volume>, <fpage>10</fpage>&#x2013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1002/prot.340120103</pub-id> </citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Voet</surname>
<given-names>A. R. D.</given-names>
</name>
<name>
<surname>Noguchi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Addy</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Simoncini</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Terada</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Unzai</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>Computational Design of a Self-Assembling Symmetrical &#x3b2;-propeller Protein</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>111</volume>, <fpage>15102</fpage>&#x2013;<lpage>15107</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1412768111</pub-id> </citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Went</surname>
<given-names>H. M.</given-names>
</name>
<name>
<surname>Jackson</surname>
<given-names>S. E.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Ubiquitin Folds through a Highly Polarized Transition State</article-title>. <source>Protein Eng. Des. Selec.</source> <volume>18</volume> (<issue>5</issue>), <fpage>229</fpage>&#x2013;<lpage>237</lpage>. <pub-id pub-id-type="doi">10.1093/protein/gzi025</pub-id> </citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>West</surname>
<given-names>M. W.</given-names>
</name>
<name>
<surname>Hecht</surname>
<given-names>M. H.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Binary Patterning of Polar and Nonpolar Amino Acids in the Sequences and Structures of Native Proteins</article-title>. <source>Protein Sci.</source> <volume>4</volume> (<issue>10</issue>), <fpage>2032</fpage>&#x2013;<lpage>2039</lpage>. <pub-id pub-id-type="doi">10.1002/pro.5560041008</pub-id> </citation>
</ref>
<ref id="B63">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xia</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Longo</surname>
<given-names>L. M.</given-names>
</name>
<name>
<surname>Sutherland</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Blaber</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Evolution of a Protein Folding Nucleus</article-title>. <source>Protein Sci.</source> <volume>25</volume> (<issue>7</issue>), <fpage>1227</fpage>&#x2013;<lpage>1240</lpage>. <pub-id pub-id-type="doi">10.1002/pro.2848</pub-id> </citation>
</ref>
<ref id="B64">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xiong</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Buckwalter</surname>
<given-names>B. L.</given-names>
</name>
<name>
<surname>Shieh</surname>
<given-names>H. M.</given-names>
</name>
<name>
<surname>Hecht</surname>
<given-names>M. H.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Periodicity of Polar and Nonpolar Amino Acids Is the Major Determinant of Secondary Structure in Self-Assembling Oligomeric Peptides</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>92</volume> (<issue>14</issue>), <fpage>6349</fpage>&#x2013;<lpage>6353</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.92.14.6349</pub-id> </citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yadid</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Tawfik</surname>
<given-names>D. S.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Reconstruction of Functional &#x3b2;-Propeller Lectins via Homo-Oligomeric Assembly of Shorter Fragments</article-title>. <source>J. Mol. Biol.</source> <volume>365</volume>, <fpage>10</fpage>&#x2013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2006.09.055</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>