<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Mol. Biosci.</journal-id>
<journal-title>Frontiers in Molecular Biosciences</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Mol. Biosci.</abbrev-journal-title>
<issn pub-type="epub">2296-889X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">855511</article-id>
<article-id pub-id-type="doi">10.3389/fmolb.2022.855511</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Molecular Biosciences</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Structural Basis for the Propagation of Homing Endonuclease-Associated Inteins</article-title>
<alt-title alt-title-type="left-running-head">Beyer and Iwa&#xef;</alt-title>
<alt-title alt-title-type="right-running-head">Structures of <italic>Pho</italic> and <italic>Tli</italic> VMA Inteins</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Beyer</surname>
<given-names>Hannes M.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1637530/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Iwa&#xef;</surname>
<given-names>Hideo</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/107655/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Institute of Biotechnology</institution>, <institution>University of Helsinki</institution>, <addr-line>Helsinki</addr-line>, <country>Finland</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Institute of Synthetic Biology</institution>, <institution>Heinrich-Heine-University D&#xfc;sseldorf</institution>, <addr-line>D&#xfc;sseldorf</addr-line>, <country>Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1281653/overview">Christopher Lennon</ext-link>, Murray State University, United&#x20;States</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1653136/overview">Barry Stoddard</ext-link>, Fred Hutchinson Cancer Research Center, United&#x20;States</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1539192/overview">Ryuta Mizutani</ext-link>, Tokai University, Japan</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Hideo Iwa&#xef;, <email>hideo.iwai@helsinki.fi</email> or <email>iwai@alumni.ethz.ch</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Cellular Biochemistry, a section of the journal Frontiers in Molecular Biosciences</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>16</day>
<month>03</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>9</volume>
<elocation-id>855511</elocation-id>
<history>
<date date-type="received">
<day>26</day>
<month>01</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>02</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Beyer and Iwa&#xef;.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Beyer and Iwa&#xef;</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Inteins catalyze their removal from a host protein through protein splicing. Inteins that contain an additional site-specific endonuclease domain display genetic mobility via a process termed &#x201c;homing&#x201d; and thereby act as selfish DNA elements. We elucidated the crystal structures of two archaeal inteins associated with an active or inactive homing endonuclease domain. This analysis illustrated structural diversity in the accessory domains (ACDs) associated with the homing endonuclease domain. To augment homing endonucleases with highly specific DNA cleaving activity using the intein scaffold, we engineered the ACDs and characterized their homing site recognition. Protein engineering of the ACDs in the inteins illuminated a possible strategy for how inteins could avoid their extinction but spread via the acquisition of a diverse accessory domain.</p>
</abstract>
<kwd-group>
<kwd>intein structures</kwd>
<kwd>horizontal gene transfer</kwd>
<kwd>protein splicing</kwd>
<kwd>DNA recognition</kwd>
<kwd>intein</kwd>
<kwd>meganuclease</kwd>
<kwd>homing endonuclease</kwd>
</kwd-group>
<contract-num rid="cn001">137995 277335</contract-num>
<contract-num rid="cn002">NNF17OC0025402 NNF17OC0027550</contract-num>
<contract-sponsor id="cn001">Academy of Finland<named-content content-type="fundref-id">10.13039/501100002341</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">Novo Nordisk Foundation Center for Basic Metabolic Research<named-content content-type="fundref-id">10.13039/501100011747</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Protein-splicing intervening sequences often include a homing endonuclease (HEN) domain, which is embedded within inteins containing the Hedgehog/INTein (HINT) domain (<xref ref-type="bibr" rid="B44">Perler, 1998</xref>). The HINT domain catalyzes the protein splicing reaction, whereas HEN domains often function independently of the HINT domain (<xref ref-type="fig" rid="F1">Figure&#x20;1</xref>) (<xref ref-type="bibr" rid="B20">Derbyshire et&#x20;al., 1997</xref>). Inteins are generally considered selfish genetic elements, frequently invading conserved coding sequences across many unicellular host organisms. In this scenario, inteins make use of homing endonuclease domains for efficient invasion by directing sequence insertion via horizontal gene transfer (HGT) initiated by DNA-strand breaks in intein-less host alleles (<xref ref-type="fig" rid="F1">Figure&#x20;1B</xref>) (<xref ref-type="bibr" rid="B8">Barzel et&#x20;al., 2011</xref>). HENs themselves are selfish genetic elements that exist free-standing (without intein or intron) or associated with inteins or introns, e.g., in group I introns (<xref ref-type="bibr" rid="B23">Dujon et&#x20;al., 1989</xref>; <xref ref-type="bibr" rid="B19">Derbyshire and Belfort, 1998</xref>; <xref ref-type="bibr" rid="B13">Burt and Koufopanou, 2004</xref>). However, being an integral component of inteins enables HENs to invade coding sequences, which are usually more conserved than noncoding regions such as introns (<xref ref-type="bibr" rid="B8">Barzel et&#x20;al., 2011</xref>). This association with the HINT domain is made possible by the unique autocatalytic protein splicing activity of inteins, which leads to self-removal from the host protein and ligation of the flanking protein sequences (<xref ref-type="fig" rid="F1">Figure&#x20;1A</xref>). 
Through the association between the HINT and HEN, the latter benefits from a conserved homing environment while inteins take advantage of rapid dissemination across alleles in a given genome or population (<xref ref-type="bibr" rid="B37">Liu, 2000</xref>; <xref ref-type="bibr" rid="B13">Burt and Koufopanou, 2004</xref>; <xref ref-type="bibr" rid="B8">Barzel et&#x20;al., 2011</xref>). Many HENs within inteins belong to the LAGLIDADG family, the most diverse family, with an extensive phylogenetic distribution (<xref ref-type="bibr" rid="B18">Dalgaard et&#x20;al., 1997</xref>). LAGLIDADG homing endonucleases (LHEs) recognize pseudo-palindromic or asymmetric target DNA sites (homing sites) of about 14&#x2013;40&#xa0;bp and contain conserved LAGLIDADG motifs (<xref ref-type="bibr" rid="B17">Chevalier and Stoddard, 2001</xref>). The relatively long recognition sequence presumably ensures high cleavage specificity, thereby reducing possible toxic effects on the host. Importantly, in contrast to most endonucleases, LHEs tolerate a certain degree of sequence variation within their homing site, an essential property for maintaining their propagation despite evolutionary drift (<xref ref-type="bibr" rid="B5">Argast et&#x20;al., 1998</xref>).</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Schematic mechanisms of homing endonuclease (HEN) and Hedgehog/INTein (HINT) domains in inteins. <bold>(A)</bold> The HINT domain catalyzes self-excision of the intein (here consisting of HINT, HEN, and an accessory domain (ACD)) while covalently ligating the flanking extein sequences of the host protein during the protein <italic>cis-</italic>splicing reaction. <bold>(B)</bold> The nested HEN domain of the intein promotes gene conversion by introducing DNA double-strand breaks at the homing site into a vacant host allele followed by invasion via horizontal gene transfer (HGT) and fixation into the organism or population. Saturation of occupied alleles may cause HEN degeneration and loss.</p>
</caption>
<graphic xlink:href="fmolb-09-855511-g001.tif"/>
</fig>
<p>In recent years, inteins have become increasingly important as a robust protein engineering platform thanks to their peptide-bond-forming catalytic activity (<xref ref-type="fig" rid="F1">Figure&#x20;1A</xref>) (<xref ref-type="bibr" rid="B53">Volkmann and Iwa&#xef;, 2010</xref>; <xref ref-type="bibr" rid="B55">Wood and Camarero, 2014</xref>). In particular, natural mini- and split inteins lacking HEN domains, as well as the feasibility of splitting inteins into halves, have fostered this development (<xref ref-type="bibr" rid="B49">Southworth et&#x20;al., 1998</xref>; <xref ref-type="bibr" rid="B4">Aranko et&#x20;al., 2014</xref>). HEN-free mini-inteins are prevalent and have presumably emerged from HEN-associated inteins that have lost their HEN domain through size reduction (<xref ref-type="bibr" rid="B8">Barzel et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B54">Aranko et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B41">Novikova et&#x20;al., 2016</xref>). According to the homing cycle model, HEN-less inteins may emerge after an intein has invaded and occupied all vacant alleles of a host population (&#x201c;Fixation,&#x201d; <xref ref-type="fig" rid="F1">Figure&#x20;1B</xref>) (<xref ref-type="bibr" rid="B13">Burt and Koufopanou, 2004</xref>). After fixation, the HEN suffers from target-site depletion and degenerates because HEN-associated inteins do not provide any benefit to the host organisms: the HEN activity is required only for invasion, whereas the protein-splicing activity is constantly selected for by the production of active host proteins (<xref ref-type="bibr" rid="B13">Burt and Koufopanou, 2004</xref>; <xref ref-type="bibr" rid="B8">Barzel et&#x20;al., 2011</xref>). Thus, degenerative mutations accumulate, eventually resulting in the loss of the HEN, thereby creating a mini-intein (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). 
To avoid the loss of HENs in inteins, some HEN domains might have developed a mutualism with HINT (<xref ref-type="bibr" rid="B8">Barzel et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). Such a mutualism is proposed even though HEN and HINT were long thought to be functionally independent, as seen with mini-inteins that lack HENs (<xref ref-type="bibr" rid="B20">Derbyshire et&#x20;al., 1997</xref>). Artificially deleting HEN domains in several inteins impaired their protein splicing activity, suggesting that HEN domains, regardless of their nuclease activities, could assist in the protein splicing reaction of HINT. This domain interplay might thereby provide the selective pressure that ensures the persistence of the HEN domain in inteins (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). Thus, structural and functional studies of HEN-associated inteins could shed light on the evolutionary history of individual inteins and contribute to the development of novel reagents as genomic and protein engineering&#x20;tools.</p>
<p>In this study, we elucidated crystal structures of HEN-associated archaeal inteins inserted at the same insertion site (VMA-b), which is located within the A subunit P-loop of the vacuolar-type ATP synthase (VMA) from <italic>Thermococcus litoralis</italic> (<italic>Tli</italic>) and <italic>Pyrococcus horikoshii</italic> (<italic>Pho</italic>). The two three-dimensional structures highlighted a modular architecture consisting of HINT, HEN, and an accessory domain (ACD). The structures of the ACDs are diverse, even among the known three-dimensional structures of HEN-containing inteins. We further investigated the structural role of the ACD in DNA recognition of inteins by engineering the ACDs. These results suggest that the ACDs modulate DNA cleavage by the HEN-associated inteins. We speculate that acquiring a diverse ACD in HEN-associated inteins could be a general strategy to avoid their eventual extinction by promoting further spread into more distant insertion&#x20;sites.</p>
</sec>
<sec sec-type="results" id="s2">
<title>Results</title>
<sec id="s2-1">
<title>Crystal Structures of <italic>P. horikoshii VMA</italic> and <italic>T. litoralis VMA</italic> Inteins</title>
<p>To understand the molecular evolution of inteins, we are interested in elucidating three-dimensional structures of various inteins with a presumable HEN domain. The first intein was identified as an intervening sequence within the yeast vacuolar membrane ATPase (VMA), subunit A (<xref ref-type="bibr" rid="B30">Hirata et&#x20;al., 1990</xref>). The majority of inteins among eukaryotes reside at the highly conserved insertion site within the vacuolar ATPase (VMA-a insertion site) (<xref ref-type="bibr" rid="B51">Swithers et&#x20;al., 2009</xref>). The extensively investigated VMA intein from <italic>Saccharomyces cerevisiae</italic> (<italic>Sce</italic>VMA), also called PI-<italic>Sce</italic>I, defines a prototypical intein possessing homing endonuclease activity and acts as a rare-cutting DNA endonuclease (<xref ref-type="bibr" rid="B27">Grindl et&#x20;al., 1998</xref>). Whereas yeast inteins are inserted at the highly conserved insertion site (VMA-a site), archaeal inteins commonly target a region approximately 20 residues downstream of the VMA-a insertion site (VMA-b insertion site), located at the P-loop motif of ATPases (<xref ref-type="bibr" rid="B51">Swithers et&#x20;al., 2009</xref>). The VMA intein from <italic>P. horikoshii</italic> (<italic>Pho</italic>VMA) consists of 376&#xa0;amino acids, which is considerably smaller than canonical HEN-associated inteins, e.g., <italic>Sce</italic>VMA consisting of 454 residues, but more similar to the size of the TFIIB intein from <italic>Methanococcus jannaschii</italic> (<italic>Mja</italic>TFIIB, 335 residues). The structure of the <italic>Mja</italic>TFIIB intein could previously not be determined together with the HEN domain by protein crystallography (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). 
Inteins share conserved amino acid sequence stretches designated as Blocks A&#x2013;H (<xref ref-type="bibr" rid="B47">Pietrokovski, 1994</xref>; <xref ref-type="bibr" rid="B45">Perler et&#x20;al., 1997</xref>) (<xref ref-type="fig" rid="F2">Figure&#x20;2A</xref>). Blocks C, D, E, and H denote the HEN domain, out of which Blocks C and E represent the eponymous conserved LAGLIDADG helices bearing the acidic catalytic residues (<xref ref-type="bibr" rid="B45">Perler et&#x20;al., 1997</xref>). The sequence alignment of the archaeal inteins also suggests that <italic>Pho</italic>VMA intein lacks homing endonuclease activity due to the absence of the active site residues in Blocks C and E (<xref ref-type="fig" rid="F2">Figure&#x20;2B</xref>). We were successful in producing the <italic>Pho</italic>VMA intein and obtaining diffracting crystals. We solved the crystal structure of the <italic>Pho</italic>VMA intein at 2.5&#xa0;&#xc5; resolution (<xref ref-type="fig" rid="F2">Figure&#x20;2C</xref>; <xref ref-type="sec" rid="s11">Supplementary Table S1</xref>). The crystal structure of <italic>Pho</italic>VMA intein revealed the typical HINT domain of thermophilic inteins, which contains a &#x3b2;-strand insertion, and the HEN domain structure (<xref ref-type="fig" rid="F2">Figure&#x20;2C</xref>) (<xref ref-type="bibr" rid="B4">Aranko et&#x20;al., 2014</xref>). As expected from the primary structure, the <italic>Pho</italic>VMA intein lacks the presumed HEN active site residues in both usually conserved LAGLIDADG helices (Blocks C and E). It shows a partial truncation in Block E, hypothetically indicating progressive HEN degeneration (<xref ref-type="fig" rid="F2">Figures 2A&#x2013;C</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Structures of degenerated and active VMA inteins. <bold>(A)</bold> General domain organization and conservation in inteins. The HEN domain resides within the intein while the intein resides within a host protein (N- and C-exteins). Conserved sequence Blocks A&#x2013;H are indicated. Host protein, black; intein, gray; HEN, yellow. <bold>(B)</bold> Sequence comparison around Blocks C and E corresponding to the active site-carrying LAGLIDADG helices of the HEN domains. Comparison of the intein orthologs of <italic>Pyrococcus horikoshii</italic> (<italic>Pho</italic>VMA), <italic>Thermococcus litoralis</italic> (<italic>Tli</italic>VMA), <italic>Pyrococcus furiosus</italic> (<italic>Pfu</italic>VMA), and <italic>Pyrococcus abyssi</italic> (<italic>Pab</italic>VMA). The positions of the catalytic aspartates in Blocks C and E are highlighted in red. <bold>(C)</bold> Crystal structure of <italic>Pho</italic>VMA intein. <bold>(D)</bold> Crystal structure of <italic>Tli</italic>VMA intein. For <bold>(C,D)</bold>, the locations of the active sites are highlighted by red circles. The close-ups of the active sites are shown to the left with electron density maps at 1.0 &#x3c3; contour level. <bold>(E)</bold> The three previously reported crystal structures of PI-<italic>Pfu</italic>I (PDB: 1dq3) (<xref ref-type="bibr" rid="B32">Ichiyanagi et&#x20;al., 2000</xref>), PI-<italic>Tko</italic>II (PDB: 2cw7) (<xref ref-type="bibr" rid="B39">Matsumura et&#x20;al., 2006</xref>), and PI-<italic>Sce</italic>I (PDB: 1lws) (<xref ref-type="bibr" rid="B40">Moure et&#x20;al., 2002</xref>). In <bold>(C&#x2013;E)</bold>, HINT, HEN, and ACD domains are colored in gray, yellow, and blue, respectively. PI-<italic>Tko</italic>II contains an additional domain IV indicated in orange. Int<sub>N</sub> and Int<sub>C</sub> indicate the N- and C-terminal parts of the HINT domain, which are separated by the HEN. 
The domain arrangement is schematically illustrated below each structure. <italic>N</italic> and <italic>C</italic> denote the termini.</p>
</caption>
<graphic xlink:href="fmolb-09-855511-g002.tif"/>
</fig>
<p>The length of inteins varies considerably from 123 to &#x3e;1,000 residues due to various insertions such as HENs and sequence deletions (<xref ref-type="bibr" rid="B26">Green et&#x20;al., 2018</xref>). Large intein sequences generally indicate the presence of an active or inactive nested HEN. Therefore, we were interested in elucidating the structures of other VMA inteins inserted at the same VMA-b site to reveal a possible structural basis directing inteins of diverse sizes to the same target insertion site within host genomes. Among VMA inteins inserted at the VMA-b insertion site, we could obtain crystals of the VMA intein from <italic>Thermococcus litoralis</italic> (<italic>Tli</italic>). The <italic>Tli</italic>VMA intein comprises 429 residues and is larger than the <italic>Pho</italic>VMA intein (376 residues) but similar to the size of PI-<italic>Sce</italic>I (454 residues). To prevent self-cleavage during protein production, we expressed both inteins in <italic>E.&#x20;coli</italic> with an alanine substitution of the catalytic cysteine 1 (Cys1). We used the N-terminal small ubiquitin-like modifier (SUMO) fusion to facilitate protein purification of <italic>Pho</italic>VMA intein (<xref ref-type="bibr" rid="B28">Guerrero et&#x20;al., 2015</xref>). However, <italic>Tli</italic>VMA intein required an N-terminal MBP fusion in addition to SUMO (H<sub>6</sub>-MBP-SUMO-<italic>Tli</italic>VMA intein) for successful soluble expression (<xref ref-type="bibr" rid="B28">Guerrero et&#x20;al., 2015</xref>). Unlike <italic>Pho</italic>VMA intein with the presumably degenerated HEN domain, <italic>Tli</italic>VMA intein also required a high salt buffer composition, compensating for the lack of nucleic acids to maintain solubility after proteolytic removal of the fusion&#x20;tag.</p>
<p>We solved the structure of <italic>Tli</italic>VMA intein at 1.6&#xa0;&#xc5; resolution (<xref ref-type="fig" rid="F2">Figure&#x20;2D</xref>, <xref ref-type="sec" rid="s11">Supplementary Table S1</xref>). The crystal structure of the <italic>Tli</italic>VMA intein revealed an overall structure very similar to that of the <italic>Pho</italic>VMA intein, including the three-domain architecture known from the three other reported HEN-containing inteins (<xref ref-type="fig" rid="F2">Figures 2C&#x2013;E</xref>) (<xref ref-type="bibr" rid="B22">Duan et&#x20;al., 1997</xref>; <xref ref-type="bibr" rid="B32">Ichiyanagi et&#x20;al., 2000</xref>; <xref ref-type="bibr" rid="B40">Moure et&#x20;al., 2002</xref>; <xref ref-type="bibr" rid="B39">Matsumura et&#x20;al., 2006</xref>). The Hedgehog/Intein domains (HINT, gray) are composed of the N- and C-terminal fragments (Int<sub>N</sub> and Int<sub>C</sub>) with the HEN domains (yellow) inserted into the common intein split-site located between the two pseudo-two-fold symmetrical units forming a horseshoe-like fold common to all HINT domains (<xref ref-type="bibr" rid="B25">Eryilmaz et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). The HINT domain of the <italic>Tli</italic>VMA intein also contains the &#x3b2;-strand extension found among thermophilic inteins (<xref ref-type="fig" rid="F2">Figure&#x20;2C</xref>) (<xref ref-type="bibr" rid="B4">Aranko et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B9">Beyer et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B29">Hiltunen et&#x20;al., 2021</xref>).</p>
</sec>
<sec id="s2-2">
<title>The Differences Between <italic>P. horikoshii VMA</italic> and <italic>T. litoralis VMA</italic> Inteins</title>
<p>Not surprisingly, the HINT domains of <italic>Pho</italic>VMA and <italic>Tli</italic>VMA inteins show a virtually identical three-dimensional structure with 86% sequence identity (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>; <xref ref-type="sec" rid="s11">Supplementary Table S4</xref>). The HINT domains connect to the first of the two LAGLIDADG helices of the HEN domains via unstructured loops of 28&#x2013;32 residues, located distant from the DNA-binding interfaces. We have identified &#x201c;accessory domains&#x201d; (ACD, shown in blue) residing between the HEN domains and the C-terminal part of the HINT domains, where we observed the most notable differences. The striking contrast between the two inteins is the divergence of their ACDs, which show the least structural homology between the two molecules (33% pairwise sequence identity, <xref ref-type="fig" rid="F2">Figures 2C,D</xref>; <xref ref-type="sec" rid="s11">Supplementary Figure S3</xref>). The 53-residue difference in length between the <italic>Pho</italic>VMA and <italic>Tli</italic>VMA inteins can be mainly attributed to the difference in the ACDs. Even though ACDs at the intersections between HINT and HEN domains were identified in other reported HEN-associated inteins, their biological functions remain unclear (<xref ref-type="fig" rid="F2">Figure&#x20;2E</xref>) (<xref ref-type="bibr" rid="B22">Duan et&#x20;al., 1997</xref>; <xref ref-type="bibr" rid="B32">Ichiyanagi et&#x20;al., 2000</xref>; <xref ref-type="bibr" rid="B40">Moure et&#x20;al., 2002</xref>; <xref ref-type="bibr" rid="B39">Matsumura et&#x20;al., 2006</xref>).</p>
<p>As for the nested HEN domains, the deletion in Block E in the <italic>Pho</italic>VMA intein (<xref ref-type="fig" rid="F2">Figure&#x20;2B</xref>) causes a truncation of the second LAGLIDADG helix by one turn, thereby removing one of the catalytic aspartate residues (<xref ref-type="fig" rid="F2">Figures 2B&#x2013;D</xref>). Another obvious consequence of the degeneration in the <italic>Pho</italic>VMA intein appeared in the structure of the DNA-binding interfaces of the HEN mediated by two stretches of &#x3b2;-sheets, each originating from one copy of the two-fold pseudo symmetric LAGLIDADG elements (<xref ref-type="sec" rid="s11">Supplementary Figures S1A,B</xref>) (<xref ref-type="bibr" rid="B40">Moure et&#x20;al., 2002</xref>). The electrostatic surface potential of the HEN domains is very different between the two VMA inteins, which is in line with their binding to DNA fragments (see below).</p>
<p>Based on the three-dimensional structures, we deleted the HEN domain (residues 123-335 for <italic>Pho</italic>VMA intein and 123-388 for <italic>Tli</italic>VMA intein) and connected the remaining parts with an &#x201c;NG&#x201d; linker sequence, resulting in 165-residue <italic>cis</italic>-splicing <italic>Pho</italic>VMA<sup>&#x394;HEN</sup> and <italic>Tli</italic>VMA<sup>&#x394;HEN</sup> inteins. We modeled the structures of the two deletion variants with the deep-learning-based RoseTTAFold software (<xref ref-type="sec" rid="s11">Supplementary Figures S1C,D</xref>) (<xref ref-type="bibr" rid="B7">Baek et&#x20;al., 2021</xref>). The two predicted structures are nearly identical, with high confidence scores and an r.m.s.d. of 0.7&#xa0;&#xc5; for 165&#xa0;C&#x3b1;&#x20;atoms.</p>
<p>The <italic>Pho</italic>VMA<sup>&#x394;HEN</sup> intein still retained the protein splicing activity, indicating that the HINT domain of <italic>Pho</italic>VMA intein is functionally independent of the nested HEN domain without having developed a mutualism (<xref ref-type="sec" rid="s11">Supplementary Figure S1C</xref>) (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). However, the protein splicing activity of the <italic>Tli</italic>VMA<sup>&#x394;HEN</sup> intein largely diminished, presumably because of the mutualism developed between the HINT and HEN domains (<xref ref-type="sec" rid="s11">Supplementary Figure S1D</xref>) (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). Even though the two three-dimensional structures are predicted to be almost identical to the original HINT domain (<xref ref-type="sec" rid="s11">Supplementary Figures S1C,D</xref>), the HEN domain of <italic>Tli</italic>VMA intein likely contributes to the protein splicing activity, as it has also been suggested for <italic>Mvu</italic>TFIIB intein (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>).</p>
</sec>
<sec id="s2-3">
<title>The HEN Domain of <italic>Pho</italic>VMA Intein Has Degenerated, and Its Activity Can Be Rescued</title>
<p>The primary structures and the three-dimensional crystal structures of the VMA inteins suggest that the HEN activity of <italic>Pho</italic>VMA intein has most likely degenerated during evolution and is inactive due to the lack of active site aspartate residues. However, <italic>Tli</italic>VMA intein probably carries a catalytically active HEN domain capable of binding to DNA and introducing DNA double-strand breaks (<xref ref-type="fig" rid="F2">Figures&#x20;2A,B</xref>).</p>
<p>To experimentally validate these assumptions, we performed <italic>in&#x20;vitro</italic> DNA-binding and cleavage studies. First, we generated DNA substrates containing the theoretical homing sites of the inteins, that is, the coding DNA sequence of the <italic>Tli</italic> and <italic>Pho vma</italic> genes without the intein coding region (<xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>). These reconstructed intein-less DNA fragments should resemble the allelic situation before invasion by the inteins (<xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>). We generated 750-bp linear double-strand DNA fragments asymmetrically harboring the reconstituted homing site by amplifying their respective sequences from the genomic DNA by PCR and tested the cleavage of the DNA fragments by the inteins. Indeed, we observed that <italic>Tli</italic>VMA intein cleaved its reconstituted homing site and bound DNA with strong affinity, as indicated by the mobility shift of the substrate and product DNA fragments in an electrophoretic mobility shift assay (EMSA) (<xref ref-type="fig" rid="F3">Figure&#x20;3B</xref>). In contrast, <italic>Pho</italic>VMA intein was neither able to process its theoretical homing site, nor did it show any detectable DNA binding affinity (<xref ref-type="fig" rid="F3">Figure&#x20;3C</xref>). Thus, the DNA substrates with the reconstituted homing sites validated our assumptions derived from the structures of <italic>Tli</italic> and <italic>Pho</italic> VMA inteins.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>DNA binding and cleavage of the theoretical homing sites by <italic>Tli</italic>VMA and <italic>Pho</italic>VMA intein variants. <bold>(A)</bold> PCR-based construction of linear DNA substrates from <italic>Tli</italic> and <italic>Pho</italic> genomic DNA and expected cleavage pattern. The homing site was reconstituted by deleting the intein coding sequence from the <italic>vma</italic> gene while adjusting the size to 750&#xa0;bp asymmetrically harboring the homing sites to generate the expected 250- and 500-bp cleavage products. <bold>(B)</bold> DNA binding and cleavage of the 750-bp <italic>Tli</italic> DNA substrate by incubation with increasing concentrations of <italic>Tli</italic>VMA intein at 80&#xb0;C for 2&#xa0;h. The electrostatic surface potential is shown below with a view of the DNA-binding interface (isoelectric point, 10.23). Positive, blue; neutral, white; negative, red. <bold>(C)</bold> Experiment as in <bold>(B)</bold> but using the 750-bp <italic>Pho</italic> DNA substrate and <italic>Pho</italic>VMA intein. The isoelectric point is 5.45. The electrostatic surface potential model was generated using an alternative coordinate file without gaps in the HEN domain. <bold>(D)</bold> Activity test of the reactivated <italic>Pho</italic>VMA<sub>Act</sub> intein. Experiment as in <bold>(C)</bold> but using <italic>Pho</italic>VMA<sub>Act</sub> intein. The reconstitution of active site regions in the <italic>Pho</italic>VMA intein by grafting the sequence from Block C and E regions of the <italic>Tli</italic>VMA intein is illustrated below. In panels <bold>(B&#x2013;D)</bold>, S, substrate; P1, 500-bp product; P2, 250-bp product; P2&#x2032;, &#x223c;200-bp product; P3, &#x223c;50-bp product. M stands for the DNA ladder; &#x201c;stop&#x201d; indicates the addition of SDS-containing stop solution after incubation. 
<bold>(E)</bold> Distance of homing site (<italic>HS</italic>) and alternative site (<italic>AS</italic>) in the <italic>T. litoralis</italic> DNA substrate. Arrowheads indicate the positions where strand cleavage occurs. <bold>(F)</bold> The alignment of <italic>Tli</italic>VMA intein homing (<italic>HS</italic>) and alternative (<italic>AS</italic>, reversed) sites. A region of 27&#xa0;bp with high sequence similarity between the <italic>HS</italic> and the reversed <italic>AS</italic> is indicated by a box. The central four base pairs where cleavage occurs are highlighted in red (<italic>HS</italic>) and blue (<italic>AS</italic>).</p>
</caption>
<graphic xlink:href="fmolb-09-855511-g003.tif"/>
</fig>
<p>We attributed the catalytic inactivity of the HEN in <italic>Pho</italic>VMA intein to the loss of presumed active-site residues. The differences in the electrostatic surface potential distributions of the HEN domains between <italic>Tli</italic> and <italic>Pho</italic> VMA inteins might further account for the weaker DNA-binding affinity of <italic>Pho</italic>VMA compared with <italic>Tli</italic>VMA intein (<xref ref-type="fig" rid="F3">Figures 3B,C</xref>). Because the architecture of the HEN domain in <italic>Pho</italic>VMA intein was retained intact despite degenerative mutations and deletions, we wondered whether the nuclease activity of <italic>Pho</italic>VMA intein could be restored by protein engineering to reverse the evolutionary process. To test this hypothesis, we engineered the inactive <italic>Pho</italic>VMA intein by grafting the active-site regions in the LAGLIDADG helices from the sequence of the <italic>Tli</italic>VMA intein (<xref ref-type="fig" rid="F2">Figures 2B</xref>, <xref ref-type="fig" rid="F3">3D</xref>). Indeed, the engineered <italic>Pho</italic>VMA intein (<italic>Pho</italic>VMA<sub>Act</sub> intein) with the restored catalytic residues cleaved the DNA substrate containing the reconstituted homing site, albeit less efficiently and without attaining the complete substrate digestion observed with <italic>Tli</italic>VMA intein (<italic>Tli</italic>VMA intein, <xref ref-type="fig" rid="F3">Figure&#x20;3B</xref>; <italic>Pho</italic>VMA<sub>Act</sub> intein, <xref ref-type="fig" rid="F3">Figure&#x20;3D</xref>).</p>
</sec>
<sec id="s2-4">
<title>
<italic>Tli</italic>VMA and <italic>Pho</italic>VMA<sub>Act</sub> Inteins Differ in Homing Site Recognition</title>
<p>We designed and generated the DNA substrates for the DNA cleavage assay from <italic>Tli</italic> and <italic>Pho</italic> genomic DNA using PCR. Removing the intein coding sequences from the <italic>vma</italic> genes restored the theoretical homing site within a linear DNA of 750&#xa0;bp containing the 250- and 500-bp fragments of the genomic sequences upstream and downstream of the reconstituted homing site, respectively (<xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>). The DNA cleavage at the homing site by the VMA inteins should produce 250- and 500-bp products. While the engineered <italic>Pho</italic>VMA<sub>Act</sub> intein produced the expected two fragments (<xref ref-type="fig" rid="F3">Figure&#x20;3D</xref>), <italic>Tli</italic>VMA intein exhibited an unexpected pattern of the products (<xref ref-type="fig" rid="F3">Figure&#x20;3B</xref>). The disappearance of the DNA fragments at higher concentrations of <italic>Tli</italic>VMA intein without SDS-treated denaturation is presumably due to the strong affinity to the DNA molecule (&#x201c;end-holding&#x201d;). Interestingly, besides the expected 500-bp fragment, a product of &#x223c;200&#xa0;bp and a third one shorter than 75&#xa0;bp appeared with <italic>Tli</italic>VMA intein. The analysis of the cleavage products by DNA sequencing revealed that <italic>Pho</italic>VMA<sub>Act</sub> intein cleaved precisely at the expected homing site (<xref ref-type="sec" rid="s11">Supplementary Figure&#x20;S2A</xref>).</p>
<p>In contrast, <italic>Tli</italic>VMA intein cleaved at two different sites. One site was indeed the theoretical homing site (<italic>HS</italic>), with the central four base pairs of the sequence 5&#x2032;-AAAA-3&#x2032;, while the other, alternative site (<italic>AS</italic>) is located 52&#xa0;bp upstream of the reconstituted homing site and contains the central sequence 5&#x2032;-TCTT-3&#x2032; (<xref ref-type="sec" rid="s11">Supplementary Figure S2B</xref>). We assume that recognition and cleavage of the <italic>AS</italic> by <italic>Tli</italic>VMA intein occur on the opposite strand of the <italic>HS</italic>. The corresponding sequence on the opposite strand is 5&#x2032;-AAGA-3&#x2032;, which is reminiscent of the reconstituted homing site of <italic>Pho</italic>VMA<sub>Act</sub> intein (<xref ref-type="fig" rid="F3">Figures 3E,F</xref>; <xref ref-type="sec" rid="s11">Supplementary Figure S2C</xref>) and differs by a single substitution from the central four base pairs of the homing site of <italic>Tli</italic>VMA intein. Indeed, the alignment of the homing site sequence against the reverse strand of the alternative site revealed a striking 63% identity over a 30-bp region surrounding the two cleavage sites (<xref ref-type="fig" rid="F3">Figure&#x20;3F</xref>). Overall, the DNA substrates reconstituted from <italic>Tli</italic> and <italic>Pho</italic> genomic DNA have sufficient similarity to assume that both contain the <italic>AS</italic> next to the <italic>HS</italic> (<xref ref-type="sec" rid="s11">Supplementary Figure S2C</xref>). However, <italic>Pho</italic>VMA<sub>Act</sub> intein exclusively processed the homing site (<italic>HS</italic>), leaving the <italic>AS</italic> unaffected (<xref ref-type="fig" rid="F3">Figure&#x20;3D</xref>). We conclude that the activated <italic>Pho</italic>VMA<sub>Act</sub> intein is more specific in recognizing the reconstituted homing site (<italic>HS</italic>) despite its lower affinity.</p>
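<p>The fragment sizes and the strand relationship described above can be checked with a short calculation. The following Python sketch is illustrative only: it assumes idealized blunt cut positions (the actual cleavage is staggered around the central four base pairs, so observed sizes differ by a few nucleotides), and the function and variable names are hypothetical rather than taken from this study.</p>
<preformat>
```python
def fragment_sizes(length, cut_positions):
    """Return fragment lengths (bp) of a linear DNA cut at the given positions."""
    bounds = [0] + sorted(cut_positions) + [length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


SUBSTRATE_BP = 750  # reconstituted linear substrate
HS = 250            # homing site: 250 bp upstream, 500 bp downstream
AS = HS - 52        # alternative site, 52 bp upstream of the homing site

hs_only = fragment_sizes(SUBSTRATE_BP, [HS])        # cleavage at HS only
hs_and_as = fragment_sizes(SUBSTRATE_BP, [HS, AS])  # cleavage at both sites

# Central four base pairs: HS is AAAA, AS is TCTT on the same strand.
as_core_opposite = reverse_complement("TCTT")
mismatches = sum(1 for x, y in zip("AAAA", as_core_opposite) if x != y)

print(hs_only, hs_and_as, as_core_opposite, mismatches)
```
</preformat>
<p>Under these assumptions, cleavage at the homing site alone gives the 250- and 500-bp products, whereas cleavage at both sites gives fragments of 198, 52, and 500&#xa0;bp, in line with the &#x223c;200-bp and &#x223c;50-bp products observed in addition to the 500-bp product.</p>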
</sec>
<sec id="s2-5">
<title>The <italic>Tli</italic>VMA Intein Accessory Domain Lowers DNA Cleavage Specificity</title>
<p>The long DNA sequences recognized by homing endonucleases (HENs) have made HENs attractive targets for protein engineering because their high specificity could facilitate various <italic>in&#x20;vitro</italic> and <italic>in vivo</italic> genomic applications (<xref ref-type="bibr" rid="B50">Stoddard, 2011</xref>). However, the number of HENs recognizing different DNA sequences that could be used for broad applications is small. Although dozens of intein structures have been deposited in the Protein Data Bank (PDB), only three of those contain a nested HEN domain. Moreover, only the structure of PI-<italic>Sce</italic>I from <italic>Saccharomyces cerevisiae</italic> has been elucidated as a DNA/intein complex (<xref ref-type="bibr" rid="B22">Duan et&#x20;al., 1997</xref>; <xref ref-type="bibr" rid="B40">Moure et&#x20;al., 2002</xref>). The limited structural information on HEN-associated inteins hinders our understanding of inteins as site-specific DNA endonucleases, impeding their further development as genetic engineering tools by protein engineering. Other reported HEN-associated intein structures are archaeal inteins from <italic>Thermococcus kodakaraensis</italic> (PI-<italic>Tko</italic>II) (<xref ref-type="bibr" rid="B39">Matsumura et&#x20;al., 2006</xref>) and <italic>Pyrococcus furiosus</italic> (PI-<italic>Pfu</italic>I) (<xref ref-type="bibr" rid="B32">Ichiyanagi et&#x20;al., 2000</xref>). Just like the VMA inteins from <italic>T. litoralis</italic> and <italic>P. horikoshii</italic>, these inteins have an accessory domain (ACD) in addition to HINT and HEN domains (<xref ref-type="fig" rid="F2">Figure&#x20;2D</xref>). Furthermore, in the case of PI-<italic>Tko</italic>II, an additional domain, termed domain IV, was reported (<xref ref-type="bibr" rid="B39">Matsumura et&#x20;al., 2006</xref>).</p>
<p>It is believed that ACDs in HEN-associated inteins might generally contribute to interactions with DNA. For PI-<italic>Sce</italic>I, where the ACD is referred to as the DNA recognition region (DRR), this role has been demonstrated, although the location of the ACD in PI-<italic>Sce</italic>I is different from other reported HEN-associated intein structures (<xref ref-type="fig" rid="F2">Figures 2C&#x2013;E</xref>; <xref ref-type="sec" rid="s11">Supplementary Figure S3</xref>). The ACD (DRR) can be seen as an insertion into the HINT domain rather than a connection of HEN and HINT domains (<xref ref-type="bibr" rid="B40">Moure et&#x20;al., 2002</xref>) (<xref ref-type="fig" rid="F2">Figure&#x20;2E</xref>; <xref ref-type="sec" rid="s11">Supplementary Figure S3</xref>). However, HENs also exist free-standing without being embedded in inteins or introns. They are known to be among the most sequence-specific endonucleases due to their relatively long sequence recognition motif (<xref ref-type="bibr" rid="B17">Chevalier and Stoddard, 2001</xref>). Some of these HENs do not contain ACDs. Thus, it remains elusive why some HEN-associated inteins require ACDs and cannot achieve sufficient DNA sequence specificity through their intrinsic DNA recognition capability.</p>
<p>To investigate the structural and functional roles of ACDs in HEN-associated inteins, we decided to delete the ACD region from the <italic>Tli</italic>VMA intein based on the three-dimensional structure. We could also validate our deletion design by determining the crystal structure of the deletion variant, termed <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein (<xref ref-type="fig" rid="F4">Figure&#x20;4A</xref>; <xref ref-type="sec" rid="s11">Supplementary Table S1</xref>). The crystal structure of <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein confirmed that the deletion of the ACD did not influence the HINT and HEN domain folds, nor their relative orientation toward each other (<xref ref-type="fig" rid="F4">Figure&#x20;4A</xref>). In the DNA cleavage and binding assays, <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein similarly cleaved the same substrate as the wild-type <italic>Tli</italic>VMA intein did, albeit with reduced DNA-binding (<xref ref-type="fig" rid="F3">Figures 3B</xref>, <xref ref-type="fig" rid="F4">4B</xref>). To our surprise, the deletion of the ACD from the <italic>Tli</italic>VMA intein changed the cleavage profile. The substrate cleavage profile by <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein resembled that of the <italic>Pho</italic>VMA<sub>Act</sub> intein, producing two main products as opposed to three products generated by <italic>Tli</italic>VMA intein (<xref ref-type="fig" rid="F4">Figure&#x20;4C</xref>). The DNA sequencing chromatogram of the smaller cleavage product generated by <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> supported that <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein did not cleave the alternative site as observed for the wild-type <italic>Tli</italic>VMA intein, similar to the digestion pattern of the <italic>Pho</italic>VMA<sub>Act</sub> intein (<xref ref-type="fig" rid="F4">Figure&#x20;4F</xref>). 
Moreover, we found that <italic>Tli</italic>VMA intein cleaved the reconstituted DNA substrate from the <italic>Pyrococcus horikoshii</italic> (<italic>Pho</italic>) genome at the alternative cleavage site (<xref ref-type="fig" rid="F4">Figure&#x20;4E</xref>).</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Deletion and grafting of the ACD in the <italic>Tli</italic>VMA intein. <bold>(A)</bold> The crystal structures of <italic>Tli</italic>VMA intein (left) and <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein (right) without ACD, connecting the HEN domain and the C-terminal part of the HINT domain. HEN, ACD, and HINT are colored in yellow, blue, and gray, respectively. <bold>(B)</bold> DNA-binding and cleavages of the reconstituted 750-bp DNA fragment with the homing site from <italic>Thermococcus litoralis</italic> genome (<italic>Tli</italic>) by <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein. The cleavages were analyzed on agarose gels after the incubation with increasing concentrations at 80&#xb0;C for 2&#xa0;h. <bold>(C)</bold> Comparison of DNA cleavages of the reconstituted 750-bp <italic>Tli</italic> DNA fragment by <italic>Tli</italic>VMA, <italic>Tli</italic>VMA<sup>&#x394;ACD</sup>, and <italic>Pho</italic>VMA<sub>Act</sub> inteins. <bold>(D)</bold> Comparison of the DNA cleavages of the 2037&#xa0;bp DNA fragment, including the <italic>Tli</italic>VMA intein coding sequence at the homing site by the three inteins as in C. This fragment lacked the reconstituted homing site (<italic>HS</italic>) due to the <italic>Tli</italic>VMA intein coding sequence, hence only possessed the alternative site (<italic>AS</italic>). <bold>(E)</bold> DNA cleavages of the 750-bp <italic>Pho</italic> DNA fragment by the reactivated <italic>Pho</italic>VMA<sub>Act</sub> intein with the ACD from <italic>Tli</italic>VMA intein (<italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub>) and the comparison with <italic>Tli</italic>VMA and <italic>Pho</italic>VMA<sub>Act</sub> inteins. In <bold>(B&#x2013;E)</bold>, M stands for the DNA ladder, &#x201c;stop&#x201d; indicates the addition of SDS-containing stop solution after incubation. 
The migration heights of the 750-bp substrate (S) and the 500-bp (P1) and 250-bp (P2) products are indicated in <bold>(B)</bold>; P2&#x2032; (&#x223c;200&#xa0;bp) and P3 (&#x223c;50&#xa0;bp) are shown in <bold>(C,D)</bold>. <bold>(F)</bold> The Sanger sequencing chromatogram of the 250-bp cleavage product generated by the <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein lacking the ACD.</p>
</caption>
<graphic xlink:href="fmolb-09-855511-g004.tif"/>
</fig>
<p>Similarly, <italic>Pho</italic>VMA<sub>Act</sub> intein was able to digest the homing site within the <italic>Tli</italic> genome, albeit less efficiently than its cognate homing site (<xref ref-type="fig" rid="F4">Figure&#x20;4C</xref>; <xref ref-type="sec" rid="s11">Supplementary Figure S4</xref>). This cross-activity between <italic>Tli</italic>VMA and <italic>Pho</italic>VMA<sub>Act</sub> inteins is presumably due to the close homology between the two substrate sequences created from the <italic>Pho</italic> and <italic>Tli</italic> genomes (<xref ref-type="sec" rid="s11">Supplementary Figures S2C, S6A</xref>). The ACD deletion results suggest that ACDs could play critical roles in modulating the cleavage specificity of HEN-associated inteins, making them more promiscuous by adding the capability to recognize an alternative cleavage&#x20;site.</p>
<p>Next, we were interested in how the ACD in <italic>Tli</italic>VMA intein influences the DNA recognition specificity. We considered two possible scenarios: direct recognition of the alternative site sequence mediated by the ACD, or indirect recognition via a cooperative binding effect. In the latter scenario, binding of the intein to the homing site could guide the recognition of the alternative site, which lies only 52&#xa0;bp from the homing site, through a cooperative domain interaction with a second intein molecule involving the ACD. Using a DNA substrate containing only the alternative site (<italic>AS</italic>), in which the intein coding sequence remained inserted into the homing site, we found that only <italic>Tli</italic>VMA intein bearing the ACD was capable of digesting the DNA substrate, whereas <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein was not (<xref ref-type="fig" rid="F4">Figure&#x20;4D</xref>). These results revealed that cleavage of the alternative site (<italic>AS</italic>) by the <italic>Tli</italic>VMA intein is independent of the homing site (<italic>HS</italic>) but depends on the presence of the ACD. We also performed DNA-binding tests using the isolated ACD of the <italic>Tli</italic>VMA intein and found that the ACD seemingly does not contribute to the overall DNA affinity (<xref ref-type="sec" rid="s11">Supplementary Figure&#x20;S5</xref>).</p>
<p>To further validate the role of the ACD in <italic>Tli</italic>VMA intein as a modulator of the DNA recognition responsible for the alternative site, we tested whether grafting the ACD from <italic>Tli</italic>VMA intein into <italic>Pho</italic>VMA<sub>Act</sub> intein would confer alternative site recognition. We thus created <italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub> intein carrying the grafted ACD from <italic>Tli</italic>VMA intein (ACD (<italic>Tli</italic>)). Indeed, <italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub> could process both the homing and alternative sites of the DNA substrate generated from the <italic>Pho</italic> genome, reminiscent of the cleavage profile produced by the <italic>Tli</italic>VMA intein (<xref ref-type="fig" rid="F4">Figure&#x20;4E</xref>). Furthermore, the swapping of the ACD rendered <italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub> intein more efficient in processing the DNA substrate without altering the apparent overall DNA affinity (<xref ref-type="fig" rid="F3">Figure&#x20;3D</xref>; <xref ref-type="sec" rid="s11">Supplementary Figure S6B</xref>). The weaker activity of <italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub> intein also allowed resolving a preference for the homing site over the alternative site, as the latter required a higher enzyme concentration (<xref ref-type="sec" rid="s11">Supplementary Figure&#x20;S6B</xref>).</p>
<p>The crystal structures of <italic>Tli</italic> and <italic>Pho</italic>VMA inteins, inserted at the same VMA-b insertion site of their host proteins, revealed a notable structural difference in their ACDs, largely deviating from each other (<xref ref-type="fig" rid="F2">Figures 2C</xref>, <xref ref-type="fig" rid="F5">5A</xref>,<xref ref-type="fig" rid="F5">B</xref>). The structural difference prompted us to investigate the functional role of ACDs. Our results demonstrated that the ACD in <italic>Tli</italic>VMA intein induced a second cleavage site in addition to the theoretical homing site (<xref ref-type="fig" rid="F4">Figure&#x20;4C</xref>). Interestingly, engineering the reactivated <italic>Pho</italic>VMA intein by grafting the ACD from <italic>Tli</italic>VMA intein triggered cleavage at the alternative site (<italic>AS</italic>) adjacent to the homing site (<italic>HS</italic>), suggesting that the ACD is responsible for the cleavage at the <italic>AS</italic> (<xref ref-type="fig" rid="F4">Figure&#x20;4E</xref>). The ACD of <italic>Tli</italic>VMA intein strongly resembles the helix-turn-helix motif common for many DNA binding proteins, such as transcriptional regulator proteins (<xref ref-type="fig" rid="F5">Figures 5A,B</xref>; <xref ref-type="sec" rid="s11">Supplementary Table S5</xref>) (<xref ref-type="bibr" rid="B3">Anderson et&#x20;al., 1981</xref>; <xref ref-type="bibr" rid="B12">Brennan and Matthews, 1989</xref>). The homology to DNA binding proteins suggests that ACDs mediate contacts with the DNA substrate.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Effect of swapping ACDs in the VMA inteins on DNA recognition. <bold>(A)</bold> Primary structure comparison of ACDs in <italic>Tli</italic> and <italic>Pho</italic> VMA inteins. Regions with high similarity are highlighted with red rectangles. <bold>(B)</bold> The ACD structures of <italic>Tli</italic> and <italic>Pho</italic> VMA inteins and a comparison with the bacteriophage 434 repressor (434R, PDB: 2or1) (<xref ref-type="bibr" rid="B2">Aggarwal et&#x20;al., 1988</xref>) and the ACDs of PI-<italic>Tko</italic>II (PDB: 2cw7), PI-<italic>Pfu</italic>I (PDB: 1dq3), and PI-<italic>Sce</italic>I (PDB: 1lws). The regions highlighted with red rectangles in <bold>(B)</bold> are colored in red. <bold>(C)</bold> Schematic illustration of the swapping experiments using different ACDs. The ACD in the <italic>Tli</italic>VMA intein was replaced with the respective ACDs from the VMA intein of <italic>P. furiosus</italic> (<italic>Pfu</italic>), <italic>P. abyssi</italic> (<italic>Pab</italic>), or the 434-bacteriophage repressor domain. <bold>(D)</bold> DNA cleavages of &#x03BB;-phage DNA using <italic>Tli</italic> and <italic>Pho</italic> VMA intein variants carrying different ACDs. Cleavage of &#x3bb;-phage DNA was tested by overnight incubation with the indicated intein variants at 80&#xb0;C. Agarose gel analysis of the digestion reactions. 
Lane 1, &#x3bb;-phage DNA (48k&#xa0;bp) without intein; lane 2, the wild-type <italic>Tli</italic>VMA intein; lane 3, <italic>Tli</italic>VMA intein with deletion of ACD (<italic>Tli</italic>VMA<sup>&#x394;ACD</sup>); lane 4, <italic>Tli</italic>VMA intein with phage 434 repressor as ACD (<italic>Tli</italic>VMA<sub>434</sub>); lane 5, reactivated <italic>Pho</italic>VMA<sub>Act</sub> intein; lane 6, the reactivated <italic>Pho</italic>VMA intein with ACD from <italic>Tli</italic>VMA intein (<italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub>); lane 7, <italic>Tli</italic>VMA with ACD from <italic>Pfu</italic>VMA (<italic>Tli</italic>VMA<sub>ACD(<italic>Pfu</italic>)</sub>); lane 8, <italic>Tli</italic>VMA intein with ACD from <italic>Pab</italic>VMA (<italic>Tli</italic>VMA<sub>ACD(<italic>Pab</italic>)</sub>). M stands for the DNA size ladder.</p>
</caption>
<graphic xlink:href="fmolb-09-855511-g005.tif"/>
</fig>
<p>Next, we wanted to test whether <italic>Tli</italic>VMA intein is a promiscuous endonuclease and could cut unrelated substrates. We, therefore, tested digestion of &#x3bb;-phage DNA by incubating overnight with <italic>Tli</italic>VMA intein or <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein lacking the ACD (<xref ref-type="fig" rid="F5">Figures 5C,D</xref>). To our surprise, we identified multiple cleavages in line with our observations using the model DNA substrates generated from <italic>T. litoralis</italic> genomic DNA. Furthermore, similar to our model DNA substrate, deletion of the ACD in <italic>Tli</italic>VMA intein indeed reduced cleavage of &#x3bb;-phage DNA, supporting our hypothesis that the ACD renders the intein endonuclease more promiscuous. In contrast, the activated <italic>Pho</italic>VMA<sub>Act</sub> intein with the endogenous ACD and the activated <italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub> with the grafted ACD from <italic>Tli</italic>VMA intein (ACD (<italic>Tli</italic>)) did not produce any detectable &#x3bb;-phage DNA cleavage, presumably due to the much lower affinity to the DNA substrate (<xref ref-type="fig" rid="F3">Figures 3D</xref>, <xref ref-type="fig" rid="F4">4E</xref>,&#x20;<xref ref-type="fig" rid="F5">5D</xref>).</p>
<p>We wondered how ACDs from other homologous inteins and an unrelated DNA-binding domain of the bacteriophage 434 repressor (434R) would affect cleavages of &#x3bb;-phage DNA by <italic>Tli</italic>VMA intein (<xref ref-type="bibr" rid="B2">Aggarwal et&#x20;al., 1988</xref>). We engineered <italic>Tli</italic>VMA intein by replacing its ACD with the 434R domain and found that the engineered <italic>Tli</italic>VMA intein showed decreased &#x3bb;-phage DNA processing. However, we could still detect some cleavage (<xref ref-type="fig" rid="F5">Figure&#x20;5D</xref>). Replacing the ACD in <italic>Tli</italic>VMA intein with an ACD from more closely related inteins, such as the VMA inteins from <italic>P. furiosus</italic> (<italic>Pfu</italic>VMA) and <italic>P. abyssi</italic> (<italic>Pab</italic>VMA), had a milder effect on the cleavage of &#x3bb;-phage DNA (<xref ref-type="fig" rid="F5">Figure&#x20;5D</xref>). Whereas the <italic>Tli</italic>VMA intein variant carrying the ACD from <italic>Pfu</italic>VMA intein (<italic>Tli</italic>VMA<sub>ACD(<italic>Pfu</italic>)</sub>) produced a restriction pattern very similar to that of the wild-type <italic>Tli</italic>VMA intein, the variant with the ACD from <italic>Pab</italic>VMA intein (<italic>Tli</italic>VMA<sub>ACD(<italic>Pab</italic>)</sub>) exhibited a less similar pattern (<xref ref-type="fig" rid="F5">Figure&#x20;5D</xref>). The difference in the digestion profiles might arise from the fact that the ACD from <italic>Pfu</italic>VMA intein has eight residue changes, while the ACD from <italic>Pab</italic>VMA intein has 12 residue changes relative to the 55-residue region of the ACD in the <italic>Tli</italic>VMA intein. 
Interestingly, replacing the ACD in the <italic>Tli</italic>VMA intein with the unrelated DNA-binding domain of phage 434R nearly abolished the cleavage of &#x3bb;-phage DNA by the <italic>Tli</italic>VMA intein (<italic>Tli</italic>VMA<sub>434</sub>), suggesting that grafting of 434R might disrupt the functional structure completely or create steric hindrances due to a suboptimal engineering design.</p>
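<p>As a rough illustration, the residue changes mentioned above can be converted into approximate sequence identities. The following Python sketch assumes that all reported changes are substitutions within the 55-residue region, without insertions or deletions, which is not stated in the text; the function name is hypothetical.</p>
<preformat>
```python
def percent_identity(region_length, residue_changes):
    """Identity (%) over a region, assuming substitutions only (an assumption)."""
    return 100.0 * (region_length - residue_changes) / region_length


ACD_REGION = 55  # residues in the TliVMA ACD region used for the comparison

pfu_identity = percent_identity(ACD_REGION, 8)   # PfuVMA ACD: eight changes
pab_identity = percent_identity(ACD_REGION, 12)  # PabVMA ACD: 12 changes

print(round(pfu_identity, 1), round(pab_identity, 1))
```
</preformat>
<p>Under this assumption, the <italic>Pfu</italic>VMA ACD would be roughly 85% identical and the <italic>Pab</italic>VMA ACD roughly 78% identical to the <italic>Tli</italic>VMA ACD, which is consistent with the more wild-type-like digestion pattern of the <italic>Pfu</italic> variant.</p>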
</sec>
</sec>
<sec sec-type="discussion" id="s3">
<title>Discussion</title>
<p>Homing endonucleases, as rare-cutting DNA endonucleases, have sparked great interest in gene targeting and genome engineering (<xref ref-type="bibr" rid="B50">Stoddard, 2011</xref>). Currently, four classes of targetable DNA cleavage enzymes exist: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), CRISPR/Cas RNA-guided nucleases (RGNs), and LAGLIDADG homing endonucleases (LHEs), the latter also termed &#x201c;Meganucleases.&#x201d; These enzymes can assist in targeted gene modification (<xref ref-type="bibr" rid="B14">Carroll, 2014</xref>). Engineering of rare-cutting DNA endonucleases with novel desired recognition sites could open a myriad of <italic>in&#x20;vitro</italic> and <italic>in vivo</italic> applications targeting specific DNA sequences. Whereas the modular architectures of TALENs and ZFNs facilitate engineering them to recognize novel sequences (<xref ref-type="bibr" rid="B38">Maeder et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B15">Cermak et&#x20;al., 2011</xref>), LAGLIDADG-type homing endonucleases (LHEs) have been the most difficult enzymes to engineer for altered DNA recognition (<xref ref-type="bibr" rid="B52">Taylor et&#x20;al., 2012</xref>).</p>
<p>In this study, we determined the crystal structures of two archaeal inteins inserted at the same VMA-b site, revealing their molecular architecture consisting of HINT, HEN, and ACD. We found that the three-dimensional structures of ACDs were highly diverse among the five solved three-dimensional structures of inteins with nested HEN domains. Moreover, two ACDs from <italic>Tli</italic>VMA intein and PI-<italic>Tko</italic>II resemble typical DNA-binding proteins containing the helix-turn-helix motif (<xref ref-type="fig" rid="F2">Figures 2D,E</xref>). The modular structures of the HEN-containing inteins motivated us to engineer the nested HEN-associated inteins with altered DNA specificities for cleaving novel target sites by engineering the ACDs. We originally assumed that the presence of the ACD provided a higher specificity by additional DNA binding mediated by the&#x20;ACD.</p>
<p>Contrary to our expectation, the deletion of the ACD from <italic>Tli</italic>VMA intein and grafting of the ACD from <italic>Tli</italic>VMA intein to <italic>Pho</italic>VMA intein indicated that the ACD enables recognizing an additional cleavage site (<italic>AS</italic>), thereby rendering the homing endonuclease domain more promiscuous (<xref ref-type="fig" rid="F4">Figures 4C,E</xref>). However, grafting of ACDs from other archaeal VMA inteins and an unrelated phage DNA binding domain resulted in different digestion profiles of &#x3bb;-phage DNA. Protein engineering of ACDs suggests the potential of HEN-associated inteins as a scaffold for creating novel meganucleases capable of recognizing novel target sites. Further detailed characterization of DNA recognition mechanisms by HEN-associated inteins could open the possibility to develop novel reagents with modulated DNA recognition specificities (<xref ref-type="bibr" rid="B43">P&#xe2;ques and Duchateau, 2007</xref>; <xref ref-type="bibr" rid="B14">Carroll, 2014</xref>).</p>
<p>Inteins do not impact the host protein function because protein splicing produces intact functional host proteins by self-excision of the inteins. Inteins, therefore, are found inserted into essential enzymes such as vacuolar-type ATPase to ensure their selection. Inteins whose splicing is abrogated by accumulated mutations could result in inactive host proteins, which is detrimental to the host organism. Therefore, protein splicing is required for the integrity of host proteins and establishes the selection. The homing endonuclease activity of inteins, however, is only required for invasion. Once the intein element occupies all target sites and is fixed in the population, the homing endonuclease activity degenerates and is eventually lost, establishing the homing cycle (<xref ref-type="fig" rid="F6">Figure&#x20;6A</xref>) (<xref ref-type="bibr" rid="B13">Burt and Koufopanou, 2004</xref>). In some inteins, HENs have developed a mutualism with HINT by making HINT dependent on the presence of the HEN scaffold for protein splicing (<xref ref-type="bibr" rid="B33">Iwa&#xef; et&#x20;al., 2017</xref>). However, the mutualism between HINT and HEN could only slow down the eventual loss of&#x20;HENs.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>
<bold>(A)</bold> The intein homing cycle model. The intein homing cycle model starts with the (1) Invasion of a HEN-containing intein via horizontal gene transfer followed by (2) Fixation into vacant alleles within the population. Depletion of HEN homing sites causes (3) Degeneration of the HEN due to accumulation of mutations tolerated by the lack of selection. Degenerated HENs are prone to (4) Deletion, rendering the intein incapable of competing with intein-free alleles, which might cause (5) Intein-loss upon interbreeding with strains providing an intein-free allele. <bold>(B)</bold> Intein spread model. Inteins might obtain an ACD modulating the HEN specificity, e.g., via changes in the ACD which can lower the HEN specificity to find novel insertion sites. Thus, the acquisition of a diverse ACD provides a spreading mechanism to prevent degeneration and extinction.</p>
</caption>
<graphic xlink:href="fmolb-09-855511-g006.tif"/>
</fig>
<p>Our studies on ACDs in archaeal VMA inteins suggest that inteins are directed to new alternative homing sites by acquiring diverse ACDs, presumably to avoid the extinction of HENs and HEN-associated inteins (<xref ref-type="fig" rid="F6">Figure&#x20;6B</xref>). We hypothesize that HEN-associated inteins obtain an ACD from other genes, such as transcription factors containing DNA-binding domains, by yet unknown mechanisms to avoid fixation (<xref ref-type="fig" rid="F6">Figure&#x20;6A</xref>). The observed diversity in the structures of ACDs implies divergent evolution and might support our hypothesis. Moreover, in nature, many genes host multiple inteins. For example, DNA polymerase from <italic>Thermococcus kodakaraensis</italic> hosts the two inteins PI-<italic>Tko</italic>I and PI-<italic>Tko</italic>II, separated by 85&#xa0;amino acid residues in the host protein (<xref ref-type="bibr" rid="B32">Ichiyanagi et&#x20;al., 2000</xref>). Cell division control protein 21 (CDC-21) in <italic>Pyrococcus abyssi</italic> also contains two mini-inteins separated by 48&#xa0;amino acid residues (<xref ref-type="bibr" rid="B9">Beyer et&#x20;al., 2019</xref>). Thus, the prevalence of genes harboring multiple inteins in nature could support our hypothesis that inteins exploit ACDs to expand their range of homing sites and spread. However, there might still be other unknown advantages for HEN-associated inteins of having alternative cleavage sites (<xref ref-type="fig" rid="F6">Figure&#x20;6B</xref>).</p>
<p>Understanding the structural basis of DNA recognition by HEN-associated inteins still awaits the experimental elucidation of high-resolution structures of DNA/intein complexes. Such structural information on various HEN-associated inteins could shed light on the evolutionary histories of individual inteins and open a new avenue for developing novel genetic engineering tools that are smaller than RNA-guided nucleases for biotechnological applications.</p>
</sec>
<sec sec-type="materials|methods" id="s4">
<title>Materials and Methods</title>
<sec id="s4-1">
<title>Molecular Cloning, Protein Production, and Purification</title>
<p>All plasmids, oligonucleotides, and synthetic DNA substrate molecules used in this study are described in <xref ref-type="sec" rid="s11">Supplementary Table S2</xref>. All recombinant proteins were produced in the <italic>E.&#x20;coli</italic> strain T7 Express (New England Biolabs, USA). Expression details are given in <xref ref-type="sec" rid="s11">Supplementary Table S3</xref>. All inteins, except for those used in protein splicing tests, carry a substitution of the catalytic cysteine 1 to alanine (C1A) to enable purification as fusion proteins. Residue numbering starts with 1 for this catalytic intein amino acid position and proceeds toward the C-terminus. Residues preceding the intein are given negative indices.</p>
<p>Expression cultures were harvested by centrifugation at 4,700<italic>g</italic> for 10&#xa0;min, 4&#xb0;C. Pelleted cells from 1 or 2&#xa0;L cultures were lysed in buffer A (50&#xa0;mM sodium phosphate, pH 8.0, 300&#xa0;mM NaCl) using continuous passaging through an EmulsiFlex-C3 homogenizer (Avestin, Canada) at 15,000&#xa0;psi, 4&#xb0;C for 10&#xa0;min. Lysates were cleared by centrifugation at 38,000<italic>g</italic> for 60&#xa0;min, 4&#xb0;C. Proteins were purified in two steps using 5&#xa0;ml HisTrap HP columns (GE Healthcare Life Sciences, USA) as previously described, including the removal of the hexahistidine tag and MBP and SUMO fusion domains (<xref ref-type="bibr" rid="B28">Guerrero et&#x20;al., 2015</xref>).</p>
<p>After proteolytic removal of the fusion domains using Ubiquitin-like-specific protease 1 (UlpI), thermostable <italic>Pho</italic>VMA<sub>Act</sub> intein (pHBRSF067) and <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein (pHBRSF075) were heat fractionated at 80&#xb0;C for 20&#xa0;min before application to the Ni<sup>2&#x2b;</sup>-NTA HisTrap column. After purification, all proteins were dialyzed overnight at 8&#xb0;C against the following buffers: <italic>Tli</italic>VMA intein (pHBRSF063) was first dialyzed against 10&#xa0;mM Tris-HCl pH 8.0, resulting in precipitation of the target protein. Precipitated protein was resolubilized by the addition of 500&#xa0;mM KCl followed by dialysis against 10&#xa0;mM Tris-HCl pH 8.0, 400&#xa0;mM KCl. <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein was treated like <italic>Tli</italic>VMA intein, but 800&#xa0;mM KCl was used for resolubilization, and the last step of dialysis was omitted. <italic>Pho</italic>VMA intein (pCARSF54) was dialyzed against deionized water. <italic>Pho</italic>VMA<sub>Act</sub> and <italic>Pho</italic>VMA<sub>Act-ACD(<italic>Tli</italic>)</sub> intein (pHBRSF079) were dialyzed against 10&#xa0;mM Tris-HCl pH 8.0, 300&#xa0;mM KCl. The isolated ACD of <italic>Tli</italic> (pHBRSF082) was dialyzed against 20&#xa0;mM Tris-HCl pH 8.0, 300&#xa0;mM KCl. The <italic>Tli</italic>VMA intein variants containing exchanged ACDs (pHBRSF083, <italic>Pfu</italic>VMA ACD; pHBRSF084, <italic>Pab</italic>VMA ACD; pHBRSF161, 434-repressor domain) were purified like the <italic>Tli</italic>VMA intein (pHBRSF063). After proteolytic removal of the H<sub>6</sub>-MBP-SUMO purification tag, sample and purification buffers were supplemented with 350 and 200&#xa0;mM KCl, respectively. After dialysis against salt-free buffers, the three proteins were resolubilized by addition of 475, 600, and 700&#xa0;mM KCl, respectively.</p>
<p>Proteins were subsequently concentrated using Macrosep<sup>&#xae;</sup> Advance Centrifugal Devices 10K MWCO (PALL). For enzymatic assays, proteins were diluted to 50&#xa0;&#xb5;M with 50&#xa0;mM Tris-HCl pH 8.0, 300&#xa0;mM KCl, 10% (v/v) glycerol, 1&#xa0;mM dithiothreitol (DTT) and stored at &#x2212;80&#xb0;C for further&#x20;use.</p>
</sec>
<sec id="s4-2">
<title>Crystallization, Data Collection, and Structure Solution</title>
<p>All diffracting crystals were obtained at room temperature using the sitting drop vapour diffusion method by mixing 100&#xa0;nL concentrated <italic>Pho</italic>VMA, <italic>Tli</italic>VMA, and <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> inteins with 100&#xa0;nL of the following mother liquors: <italic>Pho</italic>VMA (100&#xa0;mM HEPES pH 8.0, 5&#xa0;mM cadmium chloride, 5&#xa0;mM magnesium chloride, 5&#xa0;mM nickel(II) chloride, 10% (w/v) polyethylene glycol (PEG) 3350); <italic>Tli</italic>VMA (100&#xa0;mM magnesium formate, 15% (w/v) PEG 3350); <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> (100&#xa0;mM HEPES pH 7.5, 70% (v/v) 2-methyl-2,4-pentanediol (MPD)). Data were collected at beamline i04 (<italic>Pho</italic>VMA and <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> inteins) at Diamond Light Source (Didcot, United&#x20;Kingdom) and beamline ID30A-1 (MASSIF-1, <italic>Tli</italic>VMA intein) at ESRF (Grenoble, France) (<xref ref-type="bibr" rid="B11">Bowler et&#x20;al., 2015</xref>). Data were processed using XDS (<xref ref-type="bibr" rid="B35">Kabsch, 2010</xref>). Structures were solved by molecular replacement (MR) starting from a <italic>Pho</italic>VMA intein search model generated using SWISS-MODEL (<xref ref-type="bibr" rid="B6">Arnold et&#x20;al., 2006</xref>) with the intein homing endonuclease II of <italic>Thermococcus kodakarensis</italic> DNA polymerase (PDB: 2cw7) as a template. Initial models were obtained using the MR pipeline of Auto-Rickshaw (<xref ref-type="bibr" rid="B42">Panjikar et&#x20;al., 2005</xref>) and ARP/wARP (<xref ref-type="bibr" rid="B36">Langer et&#x20;al., 2008</xref>). Model-building and refinement were performed with COOT (<xref ref-type="bibr" rid="B24">Emsley et&#x20;al., 2010</xref>) and PHENIX (<xref ref-type="bibr" rid="B1">Adams et&#x20;al., 2002</xref>). The obtained structures were validated with MolProbity (<xref ref-type="bibr" rid="B16">Chen et&#x20;al., 2010</xref>). 
Please refer to the Supporting Materials and Methods for a detailed description. Figures presenting three-dimensional coordinates were generated using PyMOL (The PyMOL Molecular Graphics System, Version 2.2.0, Schr&#xf6;dinger, LLC.). Electrostatic surface distributions were calculated and visualized using UCSF Chimera 1.13.1 (<xref ref-type="bibr" rid="B46">Pettersen et&#x20;al., 2004</xref>) with APBS 1.3 (<xref ref-type="bibr" rid="B34">Jurrus et&#x20;al., 2018</xref>) after model preparation with PDB2PQR 2.2.1 (<xref ref-type="bibr" rid="B21">Dolinsky et&#x20;al., 2004</xref>) using a PARSE force field. Sequence alignments were performed with Clustal Omega 1.2.4 (<xref ref-type="bibr" rid="B48">Sievers et&#x20;al., 2011</xref>).</p>
</sec>
<sec id="s4-3">
<title>DNA-Cleavage Analysis by the Inteins</title>
<p>Unless indicated otherwise, electrophoretic mobility shift and enzymatic cleavage assays were performed in a total volume of 10&#xa0;&#xb5;L in 10&#xa0;mM Tris-HCl pH 8.0, 100&#xa0;mM KCl, 10&#xa0;mM MgCl<sub>2</sub>, 1&#xa0;mM DTT using 0.05&#xa0;&#xb5;M (750&#xa0;bp) or 0.01&#xa0;&#xb5;M (2037&#xa0;bp) linear dsDNA substrates containing the desired intein homing site (<xref ref-type="sec" rid="s11">Supplementary Table S2</xref>). Typically, reactions contained 0.5&#xa0;&#xb5;M of <italic>Tli</italic>VMA- or 8&#xa0;&#xb5;M of <italic>Pho</italic>VMA-derived inteins and were incubated at 80&#xb0;C for 2&#xa0;h. Where indicated, reactions were stopped by the addition of 1&#xa0;&#xb5;L endonuclease stop solution (5% (w/v) SDS, 250&#xa0;mM EDTA, 100&#xa0;mM Tris-HCl pH 7.5). Mobility shifts and cleavage products were visualized on 1.2% agarose&#x20;gels.</p>
<p>For the determination of HEN cleavage sites, 1.5&#xa0;&#xb5;g of substrate DNA was digested overnight using the respective endonuclease under the buffer, temperature, and concentration conditions described above. For the <italic>Tli</italic>VMA intein, a stop solution was used to dissociate the HEN from the restriction products. Products were gel-purified and sequenced by Eurofins Genomics GmbH using the same exterior oligonucleotides that were used to generate the DNA substrates (<xref ref-type="sec" rid="s11">Supplementary Table S2</xref>). For the digestion of &#x3bb;-phage DNA, 1&#xa0;&#xb5;g of substrate was incubated overnight with the indicated intein variants as described&#x20;above.</p>
</sec>
<sec id="s4-4">
<title>
<italic>In Vivo</italic> Protein <italic>Cis</italic>-Splicing Assays</title>
<p>Protein <italic>cis</italic>-splicing was tested by expressing the indicated intein variants, flanked by two B1 domains of the IgG-binding protein G, in 5&#xa0;ml cultures of <italic>E.&#x20;coli</italic> strain T7 Express (New England Biolabs, USA), followed by purification using immobilized metal affinity chromatography as described elsewhere (<xref ref-type="bibr" rid="B10">Beyer et&#x20;al., 2020</xref>). The plasmids used are listed in <xref ref-type="sec" rid="s11">Supplementary Table S2</xref>. Expression was performed at 30&#x2013;37&#xb0;C for 3&#x2013;4&#xa0;h. Protein splicing was analyzed by SDS-PAGE using 16.5% gels and Coomassie Blue staining.</p>
</sec>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/<xref ref-type="sec" rid="s11">Supplementary Material</xref>. Atomic coordinates and structure factors for the reported crystal structures have been deposited with the Protein Data Bank under accession numbers 7QSS, 7QST, and 7QSU for <italic>Tli</italic>VMA intein (C1A), <italic>Pho</italic>VMA intein (C1A), and <italic>Tli</italic>VMA<sup>&#x394;ACD</sup> intein, respectively.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>HI and HB conceptualized the project and contributed to the design of the studies. HB established and produced the recombinant proteins and performed the assays. HB and HI solved the structures. HB wrote the first draft of the manuscript. All authors contributed to the manuscript revision and approved the submitted version.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>This work is supported by grants from the Novo Nordisk Foundation (NNF17OC0025402 to HB, NNF17OC0027550 to HI), the Academy of Finland (137995, 277335 to HI), and the &#x201c;Freigeist&#x201d; fellowship of the Volkswagen Foundation to HB. Helsinki University Library supported funding for the open access charge.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>The authors thank S. M. Backlund and C. Albert for technical assistance and K. M. Mikula and the crystallization facility for crystallization and data collection. We acknowledge the Finnish Block Allocation Group for access to the synchrotron light sources.</p>
</ack>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmolb.2022.855511/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmolb.2022.855511/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet1.PDF" id="SM1" mimetype="application/PDF" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Adams</surname>
<given-names>P. D.</given-names>
</name>
<name>
<surname>Grosse-Kunstleve</surname>
<given-names>R. W.</given-names>
</name>
<name>
<surname>Hung</surname>
<given-names>L.-W.</given-names>
</name>
<name>
<surname>Ioerger</surname>
<given-names>T. R.</given-names>
</name>
<name>
<surname>McCoy</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Moriarty</surname>
<given-names>N. W.</given-names>
</name>
<etal/>
</person-group> (<year>2002</year>). <article-title>PHENIX: Building New Software for Automated Crystallographic Structure Determination</article-title>. <source>Acta Crystallogr. D Biol. Cryst.</source> <volume>58</volume> (<issue>11</issue>), <fpage>1948</fpage>&#x2013;<lpage>1954</lpage>. <pub-id pub-id-type="doi">10.1107/s0907444902016657</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aggarwal</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Rodgers</surname>
<given-names>D. W.</given-names>
</name>
<name>
<surname>Drottar</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ptashne</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Harrison</surname>
<given-names>S. C.</given-names>
</name>
</person-group> (<year>1988</year>). <article-title>Recognition of a DNA Operator by the Repressor of Phage 434: a View at High Resolution</article-title>. <source>Science</source> <volume>242</volume> (<issue>4880</issue>), <fpage>899</fpage>&#x2013;<lpage>907</lpage>. <pub-id pub-id-type="doi">10.1126/science.3187531</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Anderson</surname>
<given-names>W. F.</given-names>
</name>
<name>
<surname>Ohlendorf</surname>
<given-names>D. H.</given-names>
</name>
<name>
<surname>Takeda</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Matthews</surname>
<given-names>B. W.</given-names>
</name>
</person-group> (<year>1981</year>). <article-title>Structure of the Cro Repressor from Bacteriophage &#x3bb; and its Interaction with DNA</article-title>. <source>Nature</source> <volume>290</volume> (<issue>5809</issue>), <fpage>754</fpage>&#x2013;<lpage>758</lpage>. <pub-id pub-id-type="doi">10.1038/290754a0</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aranko</surname>
<given-names>A. S.</given-names>
</name>
<name>
<surname>Oeemig</surname>
<given-names>J. S.</given-names>
</name>
<name>
<surname>Kajander</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Iwa&#x00EF;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Intermolecular Domain Swapping Induces Intein-Mediated Protein Alternative Splicing</article-title>. <source>Nat. Chem. Biol.</source> <volume>9</volume> (<issue>10</issue>), <fpage>616</fpage>&#x2013;<lpage>622</lpage>. <pub-id pub-id-type="doi">10.1038/nchembio.1320</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aranko</surname>
<given-names>A. S.</given-names>
</name>
<name>
<surname>Wlodawer</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Iwa&#x00EF;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Nature&#x27;s Recipe for Splitting Inteins</article-title>. <source>Protein Eng. Des. Selection</source> <volume>27</volume> (<issue>8</issue>), <fpage>263</fpage>&#x2013;<lpage>271</lpage>. <pub-id pub-id-type="doi">10.1093/protein/gzu028</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Argast</surname>
<given-names>G. M.</given-names>
</name>
<name>
<surname>Stephens</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Emond</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Monnat</surname>
<given-names>R. J.</given-names>
<suffix>Jr</suffix>
</name>
</person-group> (<year>1998</year>). <article-title>I-PpoI and I-CreI Homing Site Sequence Degeneracy Determined by Random Mutagenesis and Sequential <italic>In Vitro</italic> Enrichment</article-title>. <source>J.&#x20;Mol. Biol.</source> <volume>280</volume> (<issue>3</issue>), <fpage>345</fpage>&#x2013;<lpage>353</lpage>. <pub-id pub-id-type="doi">10.1006/jmbi.1998.1886</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Arnold</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Bordoli</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Kopp</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Schwede</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>The SWISS-MODEL Workspace: a Web-Based Environment for Protein Structure Homology Modelling</article-title>. <source>Bioinformatics</source> <volume>22</volume> (<issue>2</issue>), <fpage>195</fpage>&#x2013;<lpage>201</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bti770</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baek</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>DiMaio</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Anishchenko</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Dauparas</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ovchinnikov</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>G. R.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Accurate Prediction of Protein Structures and Interactions Using a Three-Track Neural Network</article-title>. <source>Science</source> <volume>373</volume> (<issue>6557</issue>), <fpage>871</fpage>&#x2013;<lpage>876</lpage>. <pub-id pub-id-type="doi">10.1126/science.abj8754</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barzel</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Naor</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Privman</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Kupiec</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Gophna</surname>
<given-names>U.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Homing Endonucleases Residing within Inteins: Evolutionary Puzzles Awaiting Genetic Solutions</article-title>. <source>Biochem. Soc. Trans.</source> <volume>39</volume> (<issue>1</issue>), <fpage>169</fpage>&#x2013;<lpage>173</lpage>. <pub-id pub-id-type="doi">10.1042/bst0390169</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Beyer</surname>
<given-names>H. M.</given-names>
</name>
<name>
<surname>Mikula</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Kudling</surname>
<given-names>T. V.</given-names>
</name>
<name>
<surname>Iwa&#xef;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Crystal Structures of CDC21-1 Inteins from Hyperthermophilic Archaea Reveal the Selection Mechanism for the Highly Conserved Homing Endonuclease Insertion Site</article-title>. <source>Extremophiles</source> <volume>23</volume> (<issue>6</issue>), <fpage>669</fpage>&#x2013;<lpage>679</lpage>. <pub-id pub-id-type="doi">10.1007/s00792-019-01117-4</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Beyer</surname>
<given-names>H. M.</given-names>
</name>
<name>
<surname>Mikula</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wlodawer</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Iwa&#xef;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>The Crystal Structure of the Naturally Split Gp41&#x2010;1 Intein Guides the Engineering of Orthogonal Split Inteins from <italic>cis</italic>&#x2010;splicing Inteins</article-title>. <source>FEBS J.</source> <volume>287</volume> (<issue>9</issue>), <fpage>1886</fpage>&#x2013;<lpage>1898</lpage>. <pub-id pub-id-type="doi">10.1111/febs.15113</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bowler</surname>
<given-names>M. W.</given-names>
</name>
<name>
<surname>Nurizzo</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Barrett</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Beteva</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Bodin</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Caserotto</surname>
<given-names>H.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>MASSIF-1: a Beamline Dedicated to the Fully Automatic Characterization and Data Collection from Crystals of Biological Macromolecules</article-title>. <source>J.&#x20;Synchrotron Radiat.</source> <volume>22</volume> (<issue>6</issue>), <fpage>1540</fpage>&#x2013;<lpage>1547</lpage>. <pub-id pub-id-type="doi">10.1107/s1600577515016604</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brennan</surname>
<given-names>R. G.</given-names>
</name>
<name>
<surname>Matthews</surname>
<given-names>B. W.</given-names>
</name>
</person-group> (<year>1989</year>). <article-title>The helix-turn-helix DNA Binding Motif</article-title>. <source>J.&#x20;Biol. Chem.</source> <volume>264</volume> (<issue>4</issue>), <fpage>1903</fpage>&#x2013;<lpage>1906</lpage>. <pub-id pub-id-type="doi">10.1016/s0021-9258(18)94115-3</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burt</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Koufopanou</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Homing Endonuclease Genes: the Rise and Fall and Rise Again of a Selfish Element</article-title>. <source>Curr. Opin. Genet. Dev.</source> <volume>14</volume> (<issue>6</issue>), <fpage>609</fpage>&#x2013;<lpage>615</lpage>. <pub-id pub-id-type="doi">10.1016/j.gde.2004.09.010</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carroll</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Genome Engineering with Targetable Nucleases</article-title>. <source>Annu. Rev. Biochem.</source> <volume>83</volume>, <fpage>409</fpage>&#x2013;<lpage>439</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-biochem-060713-035418</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cermak</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Doyle</surname>
<given-names>E. L.</given-names>
</name>
<name>
<surname>Christian</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Efficient Design and Assembly of Custom TALEN and Other TAL Effector-Based Constructs for DNA Targeting</article-title>. <source>Nucleic Acids Res.</source> <volume>39</volume> (<issue>12</issue>), <fpage>e82</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkr218</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>V. B.</given-names>
</name>
<name>
<surname>Arendall</surname>
<given-names>W. B.</given-names>
</name>
<name>
<surname>Headd</surname>
<given-names>J.&#x20;J.</given-names>
</name>
<name>
<surname>Keedy</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Immormino</surname>
<given-names>R. M.</given-names>
</name>
<name>
<surname>Kapral</surname>
<given-names>G. J.</given-names>
</name>
<etal/>
</person-group> (<year>2010</year>). <article-title>MolProbity: All-Atom Structure Validation for Macromolecular Crystallography</article-title>. <source>Acta Crystallogr. D Biol. Cryst.</source> <volume>66</volume> (<issue>1</issue>), <fpage>12</fpage>&#x2013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1107/s0907444909042073</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chevalier</surname>
<given-names>B. S.</given-names>
</name>
<name>
<surname>Stoddard</surname>
<given-names>B. L.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Homing Endonucleases: Structural and Functional Insight into the Catalysts of Intron/intein Mobility</article-title>. <source>Nucleic Acids Res.</source> <volume>29</volume> (<issue>18</issue>), <fpage>3757</fpage>&#x2013;<lpage>3774</lpage>. <pub-id pub-id-type="doi">10.1093/nar/29.18.3757</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dalgaard</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Klar</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Moser</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Holley</surname>
<given-names>W. R.</given-names>
</name>
<name>
<surname>Chatterjee</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mian</surname>
<given-names>I. S.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Statistical Modeling and Analysis of the LAGLIDADG Family of Site-Specific Endonucleases and Identification of an Intein that Encodes a Site-Specific Endonuclease of the HNH Family</article-title>. <source>Nucleic Acids Res.</source> <volume>25</volume> (<issue>22</issue>), <fpage>4626</fpage>&#x2013;<lpage>4638</lpage>. <pub-id pub-id-type="doi">10.1093/nar/25.22.4626</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Derbyshire</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Belfort</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Lightning Strikes Twice: Intron-Intein Coincidence</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>95</volume> (<issue>4</issue>), <fpage>1356</fpage>&#x2013;<lpage>1357</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.95.4.1356</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Derbyshire</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Wood</surname>
<given-names>D. W.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Dansereau</surname>
<given-names>J.&#x20;T.</given-names>
</name>
<name>
<surname>Dalgaard</surname>
<given-names>J.&#x20;Z.</given-names>
</name>
<name>
<surname>Belfort</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Genetic Definition of a Protein-Splicing Domain: Functional Mini-Inteins Support Structure Predictions and a Model for Intein Evolution</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>94</volume> (<issue>21</issue>), <fpage>11466</fpage>&#x2013;<lpage>11471</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.94.21.11466</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dolinsky</surname>
<given-names>T. J.</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>J.&#x20;E.</given-names>
</name>
<name>
<surname>McCammon</surname>
<given-names>J.&#x20;A.</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>N. A.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>PDB2PQR: an Automated Pipeline for the Setup of Poisson-Boltzmann Electrostatics Calculations</article-title>. <source>Nucleic Acids Res.</source> <volume>32</volume> (<issue>Web Server issue</issue>), <fpage>W665</fpage>&#x2013;<lpage>W667</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkh381</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duan</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Gimble</surname>
<given-names>F. S.</given-names>
</name>
<name>
<surname>Quiocho</surname>
<given-names>F. A.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Crystal Structure of PI-SceI, a Homing Endonuclease with Protein Splicing Activity</article-title>. <source>Cell</source> <volume>89</volume> (<issue>4</issue>), <fpage>555</fpage>&#x2013;<lpage>564</lpage>. <pub-id pub-id-type="doi">10.1016/s0092-8674(00)80237-8</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dujon</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Belfort</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Butow</surname>
<given-names>R. A.</given-names>
</name>
<name>
<surname>Jacq</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Lemieux</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Perlman</surname>
<given-names>P. S.</given-names>
</name>
<etal/>
</person-group> (<year>1989</year>). <article-title>Mobile Introns: Definition of Terms and Recommended Nomenclature</article-title>. <source>Gene</source> <volume>82</volume> (<issue>1</issue>), <fpage>115</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1016/0378-1119(89)90035-8</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Emsley</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Lohkamp</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>W. G.</given-names>
</name>
<name>
<surname>Cowtan</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Features and Development of Coot</article-title>. <source>Acta Crystallogr. D Biol. Cryst.</source> <volume>66</volume> (<issue>4</issue>), <fpage>486</fpage>&#x2013;<lpage>501</lpage>. <pub-id pub-id-type="doi">10.1107/s0907444910007493</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eryilmaz</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>N. H.</given-names>
</name>
<name>
<surname>Muir</surname>
<given-names>T. W.</given-names>
</name>
<name>
<surname>Cowburn</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Structural and Dynamical Features of Inteins and Implications on Protein Splicing</article-title>. <source>J.&#x20;Biol. Chem.</source> <volume>289</volume> (<issue>21</issue>), <fpage>14506</fpage>&#x2013;<lpage>14511</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.r113.540302</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Green</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Novikova</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Belfort</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>The Dynamic Intein Landscape of Eukaryotes</article-title>. <source>Mobile DNA</source> <volume>9</volume>, <fpage>4</fpage>. <pub-id pub-id-type="doi">10.1186/s13100-018-0111-x</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grindl</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Wende</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Pingoud</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Pingoud</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>The Protein Splicing Domain of the Homing Endonuclease PI-SceI Is Responsible for Specific DNA Binding</article-title>. <source>Nucleic Acids Res.</source> <volume>26</volume> (<issue>8</issue>), <fpage>1857</fpage>&#x2013;<lpage>1862</lpage>. <pub-id pub-id-type="doi">10.1093/nar/26.8.1857</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guerrero</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Ciragan</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Iwa&#xef;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Tandem SUMO Fusion Vectors for Improving Soluble Protein Expression and Purification</article-title>. <source>Protein Expr. Purif.</source> <volume>116</volume>, <fpage>42</fpage>&#x2013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1016/j.pep.2015.08.019</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hiltunen</surname>
<given-names>M. K.</given-names>
</name>
<name>
<surname>Beyer</surname>
<given-names>H. M.</given-names>
</name>
<name>
<surname>Iwa&#xef;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Mini-Intein Structures from Extremophiles Suggest a Strategy for Finding Novel Robust Inteins</article-title>. <source>Microorganisms</source> <volume>9</volume> (<issue>6</issue>), <fpage>1226</fpage>. <pub-id pub-id-type="doi">10.3390/microorganisms9061226</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hirata</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Ohsumi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Nakano</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kawasaki</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Anraku</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>1990</year>). <article-title>Molecular Structure of a Gene, VMA1, Encoding the Catalytic Subunit of H(&#x2b;)-translocating Adenosine Triphosphatase from Vacuolar Membranes of <italic>Saccharomyces cerevisiae</italic></article-title>. <source>J.&#x20;Biol. Chem.</source> <volume>265</volume> (<issue>12</issue>), <fpage>6726</fpage>&#x2013;<lpage>6733</lpage>. <pub-id pub-id-type="doi">10.1016/s0021-9258(19)39210-5</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ichiyanagi</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Ishino</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ariyoshi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Komori</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Morikawa</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Crystal Structure of an Archaeal Intein-Encoded Homing Endonuclease PI-PfuI</article-title>. <source>J.&#x20;Mol. Biol.</source> <volume>300</volume> (<issue>4</issue>), <fpage>889</fpage>&#x2013;<lpage>901</lpage>. <pub-id pub-id-type="doi">10.1006/jmbi.2000.3873</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Iwa&#xef;</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Mikula</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Oeemig</surname>
<given-names>J.&#x20;S.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wlodawer</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Structural Basis for the Persistence of Homing Endonucleases in Transcription Factor IIB Inteins</article-title>. <source>J.&#x20;Mol. Biol.</source> <volume>429</volume> (<issue>24</issue>), <fpage>3942</fpage>&#x2013;<lpage>3956</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2017.10.016</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jurrus</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Engel</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Star</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Monson</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Brandi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Felberg</surname>
<given-names>L. E.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Improvements to the APBS Biomolecular Solvation Software Suite</article-title>. <source>Protein Sci.</source> <volume>27</volume> (<issue>1</issue>), <fpage>112</fpage>&#x2013;<lpage>128</lpage>. <pub-id pub-id-type="doi">10.1002/pro.3280</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kabsch</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>XDS</article-title>. <source>Acta Crystallogr. D Biol. Cryst.</source> <volume>66</volume> (<issue>2</issue>), <fpage>125</fpage>&#x2013;<lpage>132</lpage>. <pub-id pub-id-type="doi">10.1107/s0907444909047337</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Langer</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Cohen</surname>
<given-names>S. X.</given-names>
</name>
<name>
<surname>Lamzin</surname>
<given-names>V. S.</given-names>
</name>
<name>
<surname>Perrakis</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Automated Macromolecular Model Building for X-ray Crystallography Using ARP/wARP Version 7</article-title>. <source>Nat. Protoc.</source> <volume>3</volume> (<issue>7</issue>), <fpage>1171</fpage>&#x2013;<lpage>1179</lpage>. <pub-id pub-id-type="doi">10.1038/nprot.2008.91</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.-Q.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Protein-splicing Intein: Genetic Mobility, Origin, and Evolution</article-title>. <source>Annu. Rev. Genet.</source> <volume>34</volume>, <fpage>61</fpage>&#x2013;<lpage>76</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.genet.34.1.61</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maeder</surname>
<given-names>M. L.</given-names>
</name>
<name>
<surname>Thibodeau-Beganny</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Osiak</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Wright</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Anthony</surname>
<given-names>R. M.</given-names>
</name>
<name>
<surname>Eichtinger</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>Rapid "Open-Source" Engineering of Customized Zinc-finger Nucleases for Highly Efficient Gene Modification</article-title>. <source>Mol. Cell</source> <volume>31</volume> (<issue>2</issue>), <fpage>294</fpage>&#x2013;<lpage>301</lpage>. <pub-id pub-id-type="doi">10.1016/j.molcel.2008.06.016</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Matsumura</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Takahashi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Inoue</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yamamoto</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Hashimoto</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Nishioka</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2006</year>). <article-title>Crystal Structure of Intein Homing Endonuclease II Encoded in DNA Polymerase Gene from Hyperthermophilic Archaeon <italic>Thermococcus kodakaraensis</italic> Strain KOD1</article-title>. <source>Proteins</source> <volume>63</volume> (<issue>3</issue>), <fpage>711</fpage>&#x2013;<lpage>715</lpage>. <pub-id pub-id-type="doi">10.1002/prot.20858</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moure</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Gimble</surname>
<given-names>F. S.</given-names>
</name>
<name>
<surname>Quiocho</surname>
<given-names>F. A.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>Crystal Structure of the Intein Homing Endonuclease PI-SceI Bound to its Recognition Sequence</article-title>. <source>Nat. Struct. Biol.</source> <volume>9</volume> (<issue>10</issue>), <fpage>764</fpage>&#x2013;<lpage>770</lpage>. <pub-id pub-id-type="doi">10.1038/nsb840</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Novikova</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Jayachandran</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kelley</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Morton</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Merwin</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Topilina</surname>
<given-names>N. I.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>Intein Clustering Suggests Functional Importance in Different Domains of Life</article-title>. <source>Mol. Biol. Evol.</source> <volume>33</volume> (<issue>3</issue>), <fpage>783</fpage>&#x2013;<lpage>799</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msv271</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Panjikar</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Parthasarathy</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Lamzin</surname>
<given-names>V. S.</given-names>
</name>
<name>
<surname>Weiss</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Tucker</surname>
<given-names>P. A.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Auto-Rickshaw: an Automated Crystal Structure Determination Platform as an Efficient Tool for the Validation of an X-ray Diffraction Experiment</article-title>. <source>Acta Crystallogr. D Biol. Cryst.</source> <volume>61</volume> (<issue>4</issue>), <fpage>449</fpage>&#x2013;<lpage>457</lpage>. <pub-id pub-id-type="doi">10.1107/s0907444905001307</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>P&#xe2;ques</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Duchateau</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Meganucleases and DNA Double-Strand Break-Induced Recombination: Perspectives for Gene Therapy</article-title>. <source>Curr. Gene Ther.</source> <volume>7</volume> (<issue>1</issue>), <fpage>49</fpage>&#x2013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.2174/156652307779940216</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Perler</surname>
<given-names>F. B.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Protein Splicing of Inteins and Hedgehog Autoproteolysis: Structure, Function, and Evolution</article-title>. <source>Cell</source> <volume>92</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1016/s0092-8674(00)80892-2</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Perler</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Olsen</surname>
<given-names>G. J.</given-names>
</name>
<name>
<surname>Adam</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Compilation and Analysis of Intein Sequences</article-title>. <source>Nucleic Acids Res.</source> <volume>25</volume> (<issue>6</issue>), <fpage>1087</fpage>&#x2013;<lpage>1093</lpage>. <pub-id pub-id-type="doi">10.1093/nar/25.6.1087</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pettersen</surname>
<given-names>E. F.</given-names>
</name>
<name>
<surname>Goddard</surname>
<given-names>T. D.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>C. C.</given-names>
</name>
<name>
<surname>Couch</surname>
<given-names>G. S.</given-names>
</name>
<name>
<surname>Greenblatt</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Meng</surname>
<given-names>E. C.</given-names>
</name>
<etal/>
</person-group> (<year>2004</year>). <article-title>UCSF Chimera&#x2014;A Visualization System for Exploratory Research and Analysis</article-title>. <source>J.&#x20;Comput. Chem.</source> <volume>25</volume> (<issue>13</issue>), <fpage>1605</fpage>&#x2013;<lpage>1612</lpage>. <pub-id pub-id-type="doi">10.1002/jcc.20084</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pietrokovski</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>1994</year>). <article-title>Conserved Sequence Features of Inteins (Protein Introns) and Their Use in Identifying New Inteins and Related Proteins</article-title>. <source>Protein Sci.</source> <volume>3</volume> (<issue>12</issue>), <fpage>2340</fpage>&#x2013;<lpage>2350</lpage>. <pub-id pub-id-type="doi">10.1002/pro.5560031218</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sievers</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Wilm</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Dineen</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>T. J.</given-names>
</name>
<name>
<surname>Karplus</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Fast, Scalable Generation of High&#x2010;quality Protein Multiple Sequence Alignments Using Clustal Omega</article-title>. <source>Mol. Syst. Biol.</source> <volume>7</volume>, <fpage>539</fpage>. <pub-id pub-id-type="doi">10.1038/msb.2011.75</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Southworth</surname>
<given-names>M. W.</given-names>
</name>
<name>
<surname>Adam</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Panne</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Byer</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Kautz</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Perler</surname>
<given-names>F. B.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Control of Protein Splicing by Intein Fragment Reassembly</article-title>. <source>EMBO J.</source> <volume>17</volume> (<issue>4</issue>), <fpage>918</fpage>&#x2013;<lpage>926</lpage>. <pub-id pub-id-type="doi">10.1093/emboj/17.4.918</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stoddard</surname>
<given-names>B. L.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Homing Endonucleases: from Microbial Genetic Invaders to Reagents for Targeted DNA Modification</article-title>. <source>Structure</source> <volume>19</volume> (<issue>1</issue>), <fpage>7</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1016/j.str.2010.12.003</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Swithers</surname>
<given-names>K. S.</given-names>
</name>
<name>
<surname>Senejani</surname>
<given-names>A. G.</given-names>
</name>
<name>
<surname>Fournier</surname>
<given-names>G. P.</given-names>
</name>
<name>
<surname>Gogarten</surname>
<given-names>J.&#x20;P.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Conservation of Intron and Intein Insertion Sites: Implications for Life Histories of Parasitic Genetic Elements</article-title>. <source>BMC Evol. Biol.</source> <volume>9</volume>, <fpage>303</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2148-9-303</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taylor</surname>
<given-names>G. K.</given-names>
</name>
<name>
<surname>Petrucci</surname>
<given-names>L. H.</given-names>
</name>
<name>
<surname>Lambert</surname>
<given-names>A. R.</given-names>
</name>
<name>
<surname>Baxter</surname>
<given-names>S. K.</given-names>
</name>
<name>
<surname>Jarjour</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Stoddard</surname>
<given-names>B. L.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>LAHEDES: the LAGLIDADG Homing Endonuclease Database and Engineering Server</article-title>. <source>Nucleic Acids Res.</source> <volume>40</volume> (<issue>Web Server issue</issue>), <fpage>W110</fpage>&#x2013;<lpage>W116</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gks365</pub-id> </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Volkmann</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Iwa&#xef;</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Protein Trans-splicing and its Use in Structural Biology: Opportunities and Limitations</article-title>. <source>Mol. Biosyst.</source> <volume>6</volume> (<issue>11</issue>), <fpage>2110</fpage>&#x2013;<lpage>2121</lpage>. <pub-id pub-id-type="doi">10.1039/c0mb00034e</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wood</surname>
<given-names>D. W.</given-names>
</name>
<name>
<surname>Camarero</surname>
<given-names>J.&#x20;A.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Intein Applications: from Protein Purification and Labeling to Metabolic Control Methods</article-title>. <source>J.&#x20;Biol. Chem.</source> <volume>289</volume> (<issue>21</issue>), <fpage>14512</fpage>&#x2013;<lpage>14519</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.r114.552653</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>