<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2024.1463715</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Structural and functional insights from the sequences and complex domain architecture of adhesin-like proteins from <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Gupta</surname> <given-names>Anjali Bansal</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/2804702/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Seedorf</surname> <given-names>Henning</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/50499/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Temasek Life Sciences Laboratory Limited, 1 Research Link National University of Singapore</institution>, <addr-line>Singapore</addr-line>, <country>Singapore</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Biological Sciences, National University of Singapore</institution>, <addr-line>Singapore</addr-line>, <country>Singapore</country></aff>
<author-notes>
<fn fn-type="edited-by" id="fn0005">
<p>Edited by: Michel Geovanni Santiago-Mart&#x00ED;nez, University of Connecticut, United States</p>
</fn>
<fn fn-type="edited-by" id="fn0006">
<p>Reviewed by: Evgenii Protasov, Max Planck Institute for Terrestrial Microbiology, Germany</p>
<p>Mario Alberto Mart&#x00ED;nez N&#x00FA;&#x00F1;ez, National Autonomous University of Mexico, Mexico</p>
</fn>
<corresp id="c001">&#x002A;Correspondence: Henning Seedorf, <email>hseedorf@gmail.com</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>21</day>
<month>10</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>15</volume>
<elocation-id>1463715</elocation-id>
<history>
<date date-type="received">
<day>12</day>
<month>07</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>03</day>
<month>09</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2024 Gupta and Seedorf.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Gupta and Seedorf</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Methanogenic archaea, or methanogens, are crucial in guts and rumens, consuming hydrogen, carbon dioxide, and other fermentation products. While their molecular interactions with other microorganisms are not fully understood, genomic sequences provide information. The first genome sequences of human gut methanogens, <italic>Methanosphaera stadtmanae</italic> and <italic>Methanobrevibacter smithii</italic>, revealed genes encoding adhesin-like proteins (ALPs). These proteins were also found in other gut and rumen methanogens, but their characteristics and functions remain largely unknown. This study analyzes the ALP repertoire of <italic>M. stadtmanae</italic> and <italic>M. smithii</italic> using AI-guided protein structure predictions of unique ALP domains. Both genomes encode more than 40 ALPs each, comprising over 10% of their genomes. ALPs contain repetitive sequences, many of which are unmatched in protein domain databases. We present unique sequence signatures of conserved ABD repeats in ALPs and propose a classification based on domain architecture. Our study offers insights into ALP features and how methanogens may interact with other microorganisms.</p>
</abstract>
<kwd-group>
<kwd>adhesins</kwd>
<kwd>methanogens</kwd>
<kwd>adhesin-like proteins (ALPs)</kwd>
<kwd>archaeal big domain</kwd>
<kwd>gut microbiome</kwd>
</kwd-group>
<contract-sponsor id="cn1">Temasek Life Sciences Laboratory<named-content content-type="fundref-id">10.13039/501100010730</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="51"/>
<page-count count="13"/>
<word-count count="9354"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Biology of Archaea</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="sec1">
<title>Introduction</title>
<p>Methanogens play an important role in intestinal tracts and rumens as consumers of hydrogen, carbon dioxide, and other end products of bacterial and eukaryotic fermentations. The biochemistry and bioenergetics of methanogenic archaea have been the subjects of research for several decades (<xref ref-type="bibr" rid="ref44">Thauer et al., 2008</xref>), but the molecular interactions of methanogens with their environment, specifically with other organisms, remain largely unknown. The genome sequences of some methanogens do provide potential cues in this regard as they indicate the presence of adhesin-like proteins (ALPs) in these microorganisms.</p>
<p>Adhesin-like proteins were discovered and annotated as &#x201C;asn/thr-rich large proteins&#x201D; in the genome of <italic>Methanosphaera stadtmanae</italic>, the first genome-sequenced human gut methanogen (<xref ref-type="bibr" rid="ref15">Fricke et al., 2006</xref>). At the time of discovery, no close homologs of ALPs were found in other archaeal genomes or public databases. However, these proteins had several characteristic features: (a) an overrepresentation of asparagine and threonine in the protein sequences; (b) lengths that often vastly exceeded the mean protein length in <italic>M. stadtmanae</italic>; (c) repetitive primary sequence motifs of variable length; and (d) for most of the ALPs of <italic>M. stadtmanae</italic>, a predicted N-terminal membrane anchor with the remainder of the protein pointing into the extracellular space. Given the gut habitat of the organism, it was assumed that the function of these proteins may be relevant to the commensal lifestyle of <italic>M. stadtmanae</italic> (<xref ref-type="bibr" rid="ref15">Fricke et al., 2006</xref>). This assumption was further corroborated when the subsequently sequenced genome of the human gut methanogen <italic>Methanobrevibacter smithii</italic> revealed the presence of 48 protein homologs of asn/thr-rich large proteins (<xref ref-type="bibr" rid="ref41">Samuel et al., 2007</xref>). Many of the <italic>M. smithii</italic> ALPs showed similarities to the ALPs of <italic>M. stadtmanae</italic>, but some ALPs appeared to be specific to <italic>M. smithii</italic> or <italic>M. stadtmanae</italic>. In addition, <xref ref-type="bibr" rid="ref41">Samuel et al. (2007)</xref> identified features in some ALPs that are found in bacterial adhesins and therefore named these proteins &#x201C;adhesin-like proteins.&#x201D; The name &#x201C;ALP&#x201D; was subsequently used by most others when annotating these types of genes. However, there is no clear definition of what constitutes a methanogen ALP. 
Nonetheless, ALPs have since been annotated in the genomes of many other methanogens, primarily belonging to the <italic>Methanobacteriales</italic> and <italic>Methanomassiliicoccales</italic>, many of which are found in the intestinal tract of animals and humans (<xref ref-type="bibr" rid="ref6">Borrel et al., 2017</xref>; <xref ref-type="bibr" rid="ref11">de la Cuesta-Zuluaga et al., 2021</xref>; <xref ref-type="bibr" rid="ref29">Leahy et al., 2013</xref>; <xref ref-type="bibr" rid="ref31">Li et al., 2016</xref>; <xref ref-type="bibr" rid="ref38">Poehlein et al., 2017</xref>; <xref ref-type="bibr" rid="ref39">Poehlein and Seedorf, 2016</xref>). The exact functions of most ALPs remain speculative, and only a few studies have experimentally investigated ALPs in more detail. Hansen et al. showed that <italic>M. smithii</italic> ALPs may be differentially expressed, depending on the presence of growth substrates, e.g., formate (<xref ref-type="bibr" rid="ref20">Hansen et al., 2011</xref>). The only experimental evidence that ALPs could act as actual adhesins was obtained in a phage display experiment, which showed that the display of a <italic>Methanobrevibacter ruminantium</italic> ALP domain in rumen extract enabled the enrichment of some protozoa and bacteria (<xref ref-type="bibr" rid="ref35">Ng et al., 2016</xref>). However, the exact mechanism for this enrichment remains to be elucidated, as do many other features of ALPs. The limited knowledge about the structure and function of ALPs is partially due to the great length of these proteins and the current lack of tools to genetically manipulate the methanogens of interest, specifically <italic>Methanobacteriales</italic> and <italic>Methanomassiliicoccales</italic> species.</p>
<p>In recent years, protein databases have grown substantially, especially with the advent of artificial intelligence-supported programs for protein structure prediction such as AlphaFold (<xref ref-type="bibr" rid="ref25">Jumper et al., 2021</xref>), which have greatly improved the ability to predict tertiary protein structures. In this study, we leverage these recent developments and analyze in detail the structural architectures of 91 ALPs from two major human gut methanogen symbionts, <italic>M. stadtmanae</italic> and <italic>M. smithii</italic>. Furthermore, we identify features shared between well-characterized bacterial adhesins and archaeal ALPs, propose a classification of <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> ALPs, and finally discuss potential mechanisms by which ALPs may interact with their targets.</p>
</sec>
<sec sec-type="results" id="sec2">
<title>Results</title>
<sec id="sec3">
<title>Detection of adhesin-like proteins in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic></title>
<p>Adhesin-like proteins have complex and diverse sequences of variable length. There are currently no comprehensive databases for the annotation of ALPs in methanogen genomes, and ALPs have, in general, not been formally defined. In this study, we applied an expanded search approach using annotated ALPs from 13 other methanogens (see Materials and Methods) to query the genomes of <italic>M. stadtmanae</italic> and <italic>M. smithii</italic>. We report 49 ALPs in the genome of <italic>M. smithii</italic> and 42 in the <italic>M. stadtmanae</italic> genome, more than reported in earlier studies (<xref ref-type="bibr" rid="ref15">Fricke et al., 2006</xref>; <xref ref-type="bibr" rid="ref41">Samuel et al., 2007</xref>; <xref ref-type="bibr" rid="ref20">Hansen et al., 2011</xref>). Some ALPs were previously annotated as hypothetical proteins and were therefore not reported at the time the genomes were published. The 91 ALPs reported here account for ~10% of the genome in each of <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> (<xref ref-type="table" rid="tab1">Table 1</xref>). This fraction is considerably larger than in bacteria, where &#x003C;1.5% of the proteome is dedicated to adhesins (<xref ref-type="bibr" rid="ref34">Monzon et al., 2021</xref>). This difference is mainly due to the higher number of ALP genes in methanogens than in bacteria (<xref ref-type="bibr" rid="ref34">Monzon et al., 2021</xref>). The lengths of ALPs ranged from 128 to 4,691 amino acids in <italic>M. smithii</italic> (average: 1,299 amino acids) and from 251 to 3,356 amino acids in <italic>M. stadtmanae</italic> (average: 1,462 amino acids). Both the number and the length of ALPs contribute to the large fraction of the genome coding for ALPs in the two species.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption>
<p>Summary of ALPs in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic>.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">S.No.</th>
<th align="left" valign="top">Name</th>
<th align="center" valign="top">Genome size (Mbp)</th>
<th align="center" valign="top">Proteome accession</th>
<th align="center" valign="top">Number of ALPs</th>
<th align="center" valign="top">ALP coding proteome (aa)</th>
<th align="center" valign="top">Fraction of ALP coding genome (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">1.</td>
<td align="left" valign="top"><italic>M. smithii</italic></td>
<td align="center" valign="top">1.853</td>
<td align="center" valign="top">NC_009515</td>
<td align="center" valign="top">49</td>
<td align="center" valign="top">67,292</td>
<td align="center" valign="top">10.9</td>
</tr>
<tr>
<td align="left" valign="top">2.</td>
<td align="left" valign="top"><italic>M. stadtmanae</italic></td>
<td align="center" valign="top">1.767</td>
<td align="center" valign="top">NC_007681</td>
<td align="center" valign="top">42</td>
<td align="center" valign="top">63,039</td>
<td align="center" valign="top">10.7</td>
</tr>
</tbody>
</table>
</table-wrap>
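<p>The genome fractions listed in <xref ref-type="table" rid="tab1">Table 1</xref> can be checked with simple arithmetic. The following is a minimal sketch, assuming that the fraction is obtained by converting the summed ALP length to nucleotides (three per amino acid, stop codons ignored) and dividing by the genome size. The function name <monospace>alp_genome_fraction</monospace> is hypothetical, and the procedure actually used for the table may differ in detail.</p>
<preformat>
```python
def alp_genome_fraction(alp_residues, genome_mbp):
    """Percentage of the genome encoding ALPs (assumption: 3 nt per residue)."""
    coding_nt = alp_residues * 3          # amino acids to nucleotides, stop codons ignored
    genome_nt = genome_mbp * 1_000_000    # Mbp to bp
    return 100 * coding_nt / genome_nt


# Summed ALP length (aa) and genome size (Mbp), taken from Table 1
table_1 = {
    "M. smithii": (67292, 1.853),
    "M. stadtmanae": (63039, 1.767),
}

for species, (residues, size_mbp) in table_1.items():
    # Prints 10.9 for M. smithii and 10.7 for M. stadtmanae
    print(species, round(alp_genome_fraction(residues, size_mbp), 1))
```
</preformat>
<p>Under this assumption, the computed values agree with the tabulated fractions of 10.9 and 10.7%.</p>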
</sec>
<sec id="sec4">
<title>Identification and characterization of protein domains in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> ALPs</title>
<p>Functional annotations using reference-based tools available in InterPro and Pfam (<xref ref-type="bibr" rid="ref32">Mistry et al., 2020</xref>; <xref ref-type="bibr" rid="ref5">Blum et al., 2021</xref>; <xref ref-type="bibr" rid="ref36">Paysan-Lafosse et al., 2022</xref>) indicate an absence of domains in a large part of ALP sequences or only limited similarity of domains with Pfam references. Only 30 <italic>M. smithii</italic> and 24 <italic>M. stadtmanae</italic> ALPs could be matched to at least one domain in Pfam, which is now part of the InterPro database. We assigned domains to ALPs based on sequence similarity to the nearest proteins in the AlphaFold database (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 1</xref>). Archaeal ALPs from the two methanogens contain mainly three domain types: the membrane-anchoring domain(s) (MAD), the archaeal big domain (ABD), and the right-handed beta-helical (RBH) domain. Other domains, such as the transglutaminase-like (TG-like) domain and carbohydrate-binding domains, were also detectable in some ALPs (<xref ref-type="fig" rid="fig1">Figure 1</xref>). The frequency of occurrence of the different domains in ALPs is shown in <xref ref-type="table" rid="tab2">Table 2</xref>. The detailed features of these domains are presented in the sections below.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption>
<p>Domains in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> ALPs. A typical ALP consists of a membrane-anchoring domain (MAD), a right-handed beta-helical (RBH) domain, and an archaeal big domain (ABD), shown as a schematic representation in <bold>(A)</bold>. These three domains are shown in <bold>(B)</bold> surface and <bold>(C)</bold> ribbon representation. In general, ABDs are found as repeats in most ALPs and, together with the RBH domain, may present binding sites for unique molecules on the surface of other microbes.</p>
</caption>
<graphic xlink:href="fmicb-15-1463715-g001.tif"/>
</fig>
<table-wrap position="float" id="tab2">
<label>Table 2</label>
<caption>
<p>Domains in ALPs of <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic>.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top" rowspan="2">S.No.</th>
<th align="left" valign="top" rowspan="2"><italic>Domain annotations (Pfam)</italic></th>
<th align="center" valign="top" colspan="2">Number of ALPs with these domains&#x002A;</th>
</tr>
<tr>
<th align="center" valign="top"><italic>M. smithii</italic></th>
<th align="center" valign="top"><italic>M. stadtmanae</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">1</td>
<td align="left" valign="top">Membrane-anchoring domain</td>
<td align="center" valign="top">47</td>
<td align="center" valign="top">39</td>
</tr>
<tr>
<td align="left" valign="top">2</td>
<td align="left" valign="top">Archaeal big domain (ABD)</td>
<td align="center" valign="top">45</td>
<td align="center" valign="top">40</td>
</tr>
<tr>
<td align="left" valign="top">3</td>
<td align="left" valign="top">Right-handed beta-helical domain (RBH)</td>
<td align="center" valign="top">28</td>
<td align="center" valign="top">34</td>
</tr>
<tr>
<td align="left" valign="top">4</td>
<td align="left" valign="top">Transglutaminase-like superfamily</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">3</td>
</tr>
<tr>
<td align="left" valign="top">5</td>
<td align="left" valign="top">Pseudomurein-binding repeat</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">3</td>
</tr>
<tr>
<td align="left" valign="top">6</td>
<td align="left" valign="top">Carboxypeptidase regulatory-like domain</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">3</td>
</tr>
<tr>
<td align="left" valign="top">7</td>
<td align="left" valign="top">Chlamydia polymorphic membrane protein (Chlamydia_PMP) repeat</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">2</td>
</tr>
<tr>
<td align="left" valign="top">8</td>
<td align="left" valign="top">Papain family cysteine protease</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">1</td>
</tr>
<tr>
<td align="left" valign="top">9</td>
<td align="left" valign="top">PQQ-like domain repeats</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">1</td>
</tr>
<tr>
<td align="left" valign="top">10</td>
<td align="left" valign="top">Peptidase propeptide and YPEB domain repeats</td>
<td align="center" valign="top">-</td>
<td align="center" valign="top">1</td>
</tr>
<tr>
<td align="left" valign="top">11</td>
<td align="left" valign="top">Putative glycosyl hydrolase domain</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">-</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><sup>&#x002A;</sup>Functional domain annotations are taken from the Pfam database, except for the RBH and ABD domains. AlphaFold structures were consulted only to confirm the presence of ABD and RBH domains.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="sec5">
<title>Membrane-anchoring domain in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> ALPs</title>
<p>Two types of MADs were recognized in the ALPs of the two organisms: transmembrane (TM) <italic>&#x03B1;</italic>-helices and amphipathic helices. TM helices were present in most ALPs of both organisms. In general, TM helices are present at the N-terminus of ALPs; in some cases, e.g., YP_001272624 of <italic>M. smithii,</italic> a duplication of the TM helix at the N-terminus can be observed. In 11 <italic>M. smithii</italic> ALPs, MADs were present at both termini, which may add further complexity to potential interactions with other microbes. In comparison, <italic>M. stadtmanae</italic> had a single N-terminal MAD in 39 out of 42 ALPs, and three ALPs had no MAD.</p>
<p>For some ALP sequences (10 in <italic>M. smithii</italic>, four in <italic>M. stadtmanae</italic>), we noticed the presence of N- or C-terminal helices with hydrophobic residues in the AlphaFold structure; however, TMHMM did not assign them any transmembrane (TM) domain. These were&#x2009;&#x003C;&#x2009;20-amino-acid-long sequences of hydrophobic residues, which might not form a complete TM helix (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 2</xref>). The typical length of a TM helix has been reported as 24.0 (&#x00B1;5.6) amino acids (<xref ref-type="bibr" rid="ref2">Baeza-Delgado et al., 2013</xref>), though it depends on the amino acid sequence and hydrophobicity (<xref ref-type="bibr" rid="ref26">Krishnakumar and London, 2007</xref>). We noticed that the hydrophobicity index values of short hydrophobic helices in ALPs typically were&#x2009;&#x003E;&#x2009;1, which was comparable to the values determined for full-length TM helices using the HeliQuest sequence analysis module (<xref ref-type="bibr" rid="ref16">Gautier et al., 2008</xref>). These short helices were rich in long-chain hydrophobic residues such as leucine, isoleucine, and phenylalanine (<xref ref-type="bibr" rid="ref27">Kyte and Doolittle, 1982</xref>). Together with TM helical domains, we marked these short helices as membrane-anchoring domains as they might also help anchor ALPs to the lipid membrane. Earlier studies have shown that the lipid bilayer can adapt to TM helices as short as 10&#x2013;12 leucines (<xref ref-type="bibr" rid="ref3">Baeza-Delgado et al., 2016</xref>) and can adjust to negative mismatches (<xref ref-type="bibr" rid="ref26">Krishnakumar and London, 2007</xref>). It has been proposed that the response of short helices to the surrounding lipid bilayer depends on the nature of the lipids the helix is in contact with, its amino acid composition, and the distribution of amino acids along the helix. 
The lipid bilayer may respond to hydrophobic mismatch caused by short helices by compression and chain disordering (<xref ref-type="bibr" rid="ref13">de Planque et al., 1998</xref>). Furthermore, the packing response of lipids around a single helix could be different from that for larger proteins (<xref ref-type="bibr" rid="ref50">Weiss et al., 2003</xref>). Marginally hydrophobic <italic>&#x03B1;</italic>-helices have also been shown to be useful in membrane protein folding (<xref ref-type="bibr" rid="ref12">De Marothy and Elofsson, 2015</xref>).</p>
<p>In addition to the short membrane-anchoring &#x03B1;-helices, we also observed amphipathic helices at the termini in the structures of some ALPs, in most cases where a TM helix was missing. Using helical wheel diagrams (<xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 1</xref>), we showed that such sequences in ALPs could fold into an amphipathic &#x03B1;-helix with a hydrophobic face and a hydrophobicity index &#x003E;50%, as predicted by HeliQuest (<xref ref-type="bibr" rid="ref16">Gautier et al., 2008</xref>). Such helices could plausibly anchor ALPs to a lipid membrane by lying parallel to the lipid bilayer, with the hydrophilic surface interacting with charged lipid head groups while the hydrophobic side is exposed to the fatty acid chains of the membrane lipids. All but five ALPs had at least one MAD (the exceptions being YP_001272625 and YP_001274282 from <italic>M. smithii</italic> and YP_447499, YP_447699, and YP_447953 from <italic>M. stadtmanae</italic>). This could be due to partial annotation of these proteins from the genomic sequence, or the MAD may not have been detectable with the algorithms used.</p>
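<p>To illustrate how the hydrophobicity and amphipathicity of a helix can be quantified, the sketch below computes the mean hydropathy and an Eisenberg-style hydrophobic moment for an ideal &#x03B1;-helix (100 degrees of rotation per residue). It rests on several assumptions: it uses the Kyte-Doolittle scale as a stand-in, whereas HeliQuest applies its own hydrophobicity scale, so the numbers are not comparable with the index values reported above; the function names are hypothetical; and both peptides are invented sequences, not ALP segments.</p>
<preformat>
```python
import math

# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(peptide):
    """Average hydropathy over all residues of the peptide."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def hydrophobic_moment(peptide, angle_deg=100.0):
    """Length-normalized vector sum of hydropathies around the helix axis."""
    sin_sum = 0.0
    cos_sum = 0.0
    for position, aa in enumerate(peptide):
        theta = math.radians(position * angle_deg)  # angular position on the helical wheel
        sin_sum += KD[aa] * math.sin(theta)
        cos_sum += KD[aa] * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(peptide)


# Hypothetical 18-residue peptides (18 residues cover five full turns at 100 degrees)
uniform = "L" * 18                   # hydrophobic on every face of the helix
amphipathic = "LKKLLKKLLKKLLKKLLK"   # leucines and lysines segregate to opposite faces

for name, peptide in (("uniform", uniform), ("amphipathic", amphipathic)):
    print(name,
          round(mean_hydrophobicity(peptide), 2),
          round(hydrophobic_moment(peptide), 2))
```
</preformat>
<p>In this toy example, the uniformly hydrophobic peptide has a high mean hydropathy but a moment close to zero, as expected for a membrane-spanning segment, whereas the alternating peptide has a clearly non-zero moment, the signature of a helix with distinct hydrophobic and hydrophilic faces.</p>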
<p>Furthermore, ALPs are rich in asparagine/lysine/arginine residues at the N-terminus of the transmembrane domain. Cytoplasmic di-lysine motifs, which earlier studies on eukaryotic cells showed to be involved in the trafficking of proteins to the ER and plasma membrane (<xref ref-type="bibr" rid="ref10">Chau et al., 2021</xref>; <xref ref-type="bibr" rid="ref23">Jackson et al., 2012</xref>), were observed at the N-terminus of transmembrane helices in 22 ALPs, and 36 other ALPs had one of the lysines substituted by asparagine, while others also had arginine and aspartic acid (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 3</xref>). In ALPs, such motifs might be important for the protein to insert a MAD into the membrane in the required orientation. Interestingly, such motifs were also observed near the C-terminal TM helix, indicating their possible orientation to be on the cytoplasmic side (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 4</xref>). The presence of such signals adjacent to TM helices gives information about the possible orientation of helices in cell membranes. One example is <italic>M. smithii</italic>&#x2019;s YP_001272624, which has two TM helices at the N-terminus: the di-lysine motif was found only before the second TM helix, indicating its possible orientation from the cytoplasmic to the extracellular side. Notably, such signal sequences were missing in most short helices, suggesting that small helices could help in insertion without spanning the whole membrane bilayer.</p>
</sec>
<sec id="sec6">
<title>Right-handed beta-helical domain in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> ALPs</title>
<p>The right-handed beta-helical (RBH) domain is the third most represented domain in ALPs of the two methanogens (<xref ref-type="table" rid="tab2">Table 2</xref>). <italic>Methanobrevibacter smithii</italic> had 28 ALPs with 43 RBH domains, and <italic>M. stadtmanae</italic> had 34 ALPs with 52 RBH domains. Many ALPs had repeats of the RBH domain, such as YP_001273761 of <italic>M. smithii</italic>, which had seven repeats.</p>
<p>The RBH domain was initially identified in <italic>Erwinia chrysanthemi</italic>, a plant pathogenic bacterium, as a pectin-binding domain of pectate lyase C (<xref ref-type="bibr" rid="ref51">Yoder et al., 1993</xref>), and has subsequently been discovered in many other enzymatic proteins such as those involved in hydrolyzing lectins and other carbohydrates (<xref ref-type="bibr" rid="ref24">Jenkins and Pickersgill, 2001</xref>). The beta-helical rod of this parallel <italic>&#x03B2;</italic>-helical domain provides a large groove on its surface for recognizing long carbohydrate molecules (<xref ref-type="bibr" rid="ref43">Suits and Boraston, 2013</xref>; <xref ref-type="bibr" rid="ref47">Villarreal et al., 2022</xref>). Recognition is mediated through a chain of conserved asparagines in the loops; asparagine has been suggested to be the most common amino acid in RBH domains (<xref ref-type="bibr" rid="ref22">Iengar et al., 2006</xref>). The loops mainly contain charged and polar amino acids, which explains their ability to bind to long polysaccharides. The InterPro entry IPR039448 indicates that the RBH domain is highly represented in bacteria compared to other groups of organisms, while in Archaea, it is mostly found in the <italic>Methanobacteriaceae</italic> and <italic>Methanosarcinaceae</italic> families. Particularly for the former, this could be due to the presence of the domain in ALPs, which form a large fraction of the archaeal proteome. Similar structures have also been observed in viruses, where they serve host attachment and infection (<xref ref-type="bibr" rid="ref48">Weigele et al., 2003</xref>).</p>
<p>The fold was suggested to have diverged from a common ancestor based on the presence of a conserved alpha-helix capping the N-terminus of the beta-helix. The cap motif also inhibits oligomeric interactions similar to those found in amyloid formations (<xref ref-type="bibr" rid="ref8">Bryan et al., 2011</xref>). Furthermore, RBH domains in archaeal ALPs are relatively large. The longest RBH domain (YP_447868 of <italic>M. stadtmanae</italic>), with a length of 1,236 amino acids, folds into a ~100&#x2009;&#x00C5; long rod carrying 14 turns. Although rare in eukaryotes, the RBH domain is highly prevalent in surface proteins of bacteria and fungi, with many of them involved in pathogenesis (<xref ref-type="bibr" rid="ref7">Bradley et al., 2001</xref>).</p>
</sec>
<sec id="sec7">
<title>Archaeal big domain</title>
<p>Archaeal big domains (ABDs) are the most abundant domain repeats in <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> ALPs and are found in almost all the ALP sequences. <xref ref-type="fig" rid="fig2">Figure 2</xref> depicts the phylogenetic tree constructed based on the alignment of 279 <italic>M. smithii</italic> ABDs and 222 <italic>M. stadtmanae</italic> ABDs with 80 bacterial stalk domains (<xref ref-type="bibr" rid="ref34">Monzon et al., 2021</xref>). These ABDs seem to have diverged from bacterial stalk domains, as most of the archaeal sequences cluster together in the phylogenetic tree and form groups distinct from the bacterial stalk domain sequences. ABD definitions are not included in the Pfam domain database (<xref ref-type="bibr" rid="ref32">Mistry et al., 2020</xref>). Pfam either failed to assign a domain family to a large part of archaeal ALPs or assigned Big3 (Pfam ID: PF16640) and DUF11 (Pfam ID: PF01345) domains in most cases, generally with high E-values. We also searched for the ABDs in &#x201C;refseq_genomes (1,711 databases) at NCBI, excluding archaea (NCBI taxid: 2157)&#x201D; and found no BLAST hits, indicating that these domains may be unique to archaeal species and are not ubiquitously found in other domains of life, suggesting a potentially specific role in archaeal symbiosis.</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption>
<p>Phylogenetic tree of stalk domains from bacteria together with archaeal big domains (ABDs) of <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic>. This bootstrapped neighbor-joining tree was constructed using 279 ABDs belonging to 49 <italic>M. smithii</italic> ALPs, 222 ABDs of 42 <italic>M. stadtmanae</italic> ALPs, and 80 stalk domains from bacteria, as described in a previous study (<xref ref-type="bibr" rid="ref34">Monzon et al., 2021</xref>). The tree is displayed in circular mode with branch lengths ignored to maintain the clarity of display. Branches corresponding to <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> ABDs are marked in yellow and brown, respectively, whereas bacterial stalk domain branches are in blue. The phylogenetic tree shows the distinct clustering of archaeal ABDs from bacterial stalk domains. ABDs group in a number of clusters, marked with circles against the branches. The clade marked with a purple circle has nine &#x03B2;-stranded ABDs, while the clade marked with a pink circle was used to create the WebLogo (<xref ref-type="fig" rid="fig3">Figure 3</xref>). Each clade contains ALPs belonging to both <italic>M. smithii</italic> and <italic>M. stadtmanae,</italic> and we expect that more ABD clades will be identified as ALPs from other archaeal species are studied.</p>
</caption>
<graphic xlink:href="fmicb-15-1463715-g002.tif"/>
</fig>
<p>We noticed broad clades of ABDs in <italic>M. smithii</italic> and <italic>M. stadtmanae,</italic> as marked in <xref ref-type="fig" rid="fig2">Figure 2</xref>. Most clades contained no sequences of apparent bacterial origin; however, some ABD clades clustered with at least four bacterial stalk domain sequences (MucBP (A0A806LF85), LVIVD (A0A0S1YA82), and Trp_ring (F9N556) of non-ESET clan and Big6 (A0A150KJ36), Big3_5 (A0A2V7S5F5), Big3 (R5U8D9), and Big2 (A0A0E1X8Y2) of ESET clan) in the phylogenetic tree (<xref ref-type="fig" rid="fig2">Figure 2</xref>). These could be the precursors from which archaeal ABDs evolved and subsequently diverged to acquire unique features. Some ABDs were longer than the most common ABDs, and only a few ALPs had such domains (YP_001272746, YP_001272984, YP_001274106, and YP_001274107 of <italic>M. smithii</italic>) (<xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 2</xref>).</p>
<p>We analyzed the frequency and patterns of amino acids present in ABDs and bacterial stalk domains (<xref ref-type="fig" rid="fig3">Figure 3</xref>). The comparison shows that, although archaeal ABDs share conserved glycine residues with bacterial stalk domains, they have also acquired unique, highly conserved features. The uniquely conserved residues of archaeal ABDs are marked in <xref ref-type="fig" rid="fig3">Figure 3</xref>.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption>
<p>Positional entropy of archaeal big domain (ABD). The WebLogo was created using a multiple sequence alignment of ABDs belonging to both <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> (clade marked with a pink circle in <xref ref-type="fig" rid="fig2">Figure 2</xref>). Hydrophobic residues (phenylalanine, tyrosine, leucine, isoleucine, glycine, valine, and alanine) are marked in black; the positively charged amino acids lysine and arginine, together with the polar amides asparagine and glutamine, are in blue; negatively charged amino acids (aspartic acid and glutamic acid) are marked in red; and threonine and cysteine are in yellow. The relative position of beta strands is marked below the WebLogo. The lengths of strands and loops are not to scale. Long gaps present in the alignment were deleted before creating the logo. The highly conserved glycine residues in loops are boxed.</p>
</caption>
<graphic xlink:href="fmicb-15-1463715-g003.tif"/>
</fig>
<p>An ABD folds into a typical <italic>&#x03B2;</italic>-sandwich in Greek-key topology with seven strands (<xref ref-type="fig" rid="fig4">Figure 4A</xref>). In longer ABDs, the &#x03B2;-sandwich is formed by nine &#x03B2;-strands (<xref ref-type="fig" rid="fig4">Figure 4B</xref>). The conserved glycine residues present in loops are marked in the representative three-dimensional structure obtained from AlphaFold, as shown in <xref ref-type="fig" rid="fig4">Figure 4A</xref>. Notably, these conserved residues occur in loops, which may be important for interaction with other protein domains, while the core consists of conserved hydrophobic residues. Furthermore, we noticed that some strands in ABD folds have conserved long-chain hydrophobic residues such as valine, isoleucine, phenylalanine, and leucine, while others show conservation of smaller residues such as glycine. The representative structure was searched against the Dali database to identify the closest structural homolog of the ABD. Interestingly, the root mean squared deviation (r.m.s.d.) between an ABD of <italic>M. smithii</italic> (NCBI accession: YP_001272624) and the nearest structure (Big1 domain of bacterial invasin, PDB ID: 1CWV) was only 1.8&#x2009;&#x00C5;, although the two shared only 10% sequence similarity. Similarly, the nearest structural homolog of another ABD of the same ALP, belonging to a different clade in the phylogenetic tree (HLA Class-I Histocompatibility antigen, PDB ID: 1EWO, r.m.s.d.: 1.8&#x2009;&#x00C5;), was only 18% similar. This indicates sequence divergence from bacterial ancestral homologs with conservation of the overall three-dimensional fold of seven &#x03B2;-strands.</p>
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption>
<p>Topology and three-dimensional structures of ABD domains. ABD domains present distinct Greek-key topologies and fold into seven or nine beta-stranded conformations. The highly conserved glycine residues present in loops, as seen in WebLogo, are marked with light yellow circles. The structures were displayed using UCSF ChimeraX version 1.5 (<xref ref-type="bibr" rid="ref37">Pettersen et al., 2021</xref>).</p>
</caption>
<graphic xlink:href="fmicb-15-1463715-g004.tif"/>
</fig>
<p>The archaeal big domain is found in repeats on ALPs and may be important for extending their reach toward symbiotic microbes. ABD repeats of some ALPs are highly similar; for example, YP_447631 of <italic>M. stadtmanae</italic> has 27 ABDs belonging to two major clades in the phylogenetic tree, and 20 of them share 60.6% average similarity with each other. Other ALPs have divergent ABD sequences belonging to different phylogenetic clades; for example, YP_447476 of <italic>M. stadtmanae</italic> has three ABDs, each clustering with a different clade, indicating early divergence of ABDs from bacteria.</p>
</sec>
<sec id="sec8">
<title>Other domains</title>
<p>Less common domain types in the ALPs of <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> are currently grouped under &#x2018;others&#x2019;, as they differ from the commonly observed MAD, RBH, and ABD domain types. While the overall occurrence of these domain types is low in the ALPs of these two methanogens, the domains vary, and in some ALPs multiple &#x2018;others&#x2019; can be detected. They include domains that show limited structural similarity to domains such as transglutaminase, pseudomurein-binding protein, PQQ-like domain, and lectin-like domain. These are listed in <xref ref-type="table" rid="tab2">Table 2</xref>, along with the number of occurrences in the ALPs of the two organisms.</p>
</sec>
<sec id="sec9">
<title>Groups within <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> ALPs</title>
<p>Functional domain annotations using Pfam indicated that <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> ALPs are less diverse than those of bacteria. Only 11 domain families were identified from a sequence-based search in the Pfam database in both species together. In bacteria, by contrast, altogether 109 types of stalk and adhesive domains are present (<xref ref-type="bibr" rid="ref34">Monzon et al., 2021</xref>; <xref ref-type="bibr" rid="ref33">Monzon and Bateman, 2022</xref>). The most common architecture in the analyzed archaeal ALPs was RBH domain repeats at the N-terminus, followed by ABD repeats. A single transmembrane helix at the N-terminus in the majority of ALPs could act as a membrane-anchoring domain; however, in others, it was present at the C-terminus or at both ends. In addition to the above ALPs, there were 14 sequences (nine in <italic>M. smithii</italic> and five in <italic>M. stadtmanae</italic>) that lacked both RBH and ABD domains, although they were picked up in our sequence-based search. These could be partial sequences, as they are short and likely contain incomplete domains. These sequences were discarded and not classified as ALPs in this study (<xref ref-type="supplementary-material" rid="SM1">Supplementary Tables 5</xref><xref ref-type="supplementary-material" rid="SM1">A</xref>,<xref ref-type="supplementary-material" rid="SM1">B</xref>).</p>
<p>The alignment of ALPs with different repetitive structures of varying length may lead to misalignments, potentially introducing errors in phylogenetic inference. However, the detailed characterization of individual ALP domains allows different ALPs to be grouped based on their specific domain architecture, independent of their sequence similarity (<xref ref-type="fig" rid="fig5">Figure 5</xref>). As an alternative approach, we used density-based clustering of text strings that represent the domain architecture, which allows ALPs to be binned into distinct classes. Based on the clustering, we proposed five groups of ALPs in <italic>M. smithii</italic> and <italic>M. stadtmanae</italic>. A growing number of ALP groups might be expected in the future as more archaeal species are analyzed for divergent ALPs. The groups proposed here are based on the presence/absence and positions of ABD, RBH, transglutaminase, and other domains, as described further. In general, we observed that ALPs contain at least one ABD or RBH domain. If a protein sequence did not have any of these domains, we did not classify it as an ALP, although these proteins were picked up in our BLAST searches together with other ALPs. There were nine such sequences in <italic>M. smithii</italic> and five in <italic>M. stadtmanae</italic> (<xref ref-type="supplementary-material" rid="SM1">Supplementary Tables 5</xref><xref ref-type="supplementary-material" rid="SM1">A</xref>,<xref ref-type="supplementary-material" rid="SM1">B</xref>). All ALPs of <italic>M. stadtmanae</italic> had a MAD only at the N-terminus (except three ALPs with no MAD), while 11 out of 49 <italic>M. smithii</italic> ALPs had a MAD on both termini. Furthermore, five ALPs of <italic>M. smithii</italic> (NCBI accession: YP_001272625, YP_001272839, YP_001273878, YP_001274107, and YP_001274163) were not fully annotated for domains due to low sequence similarity with AlphaFold structures or partially predicted structures. 
Thus, they were tentatively assigned ALP groups based on current knowledge of their domains. The following three groups (excluding subgroups) and one currently uncategorized set of &#x201C;others&#x201D; ALPs are proposed in <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> regardless of the position of the MAD, and they are listed in <xref ref-type="supplementary-material" rid="SM1">Supplementary Tables 6</xref><xref ref-type="supplementary-material" rid="SM1">A</xref>,<xref ref-type="supplementary-material" rid="SM1">B</xref>.</p>
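<p>The idea of clustering domain-architecture strings can be illustrated with a minimal sketch. It is not the implementation used in this study: the one-letter encoding (M for MAD, R for RBH, A for ABD, T for transglutaminase), the use of edit distance, the DBSCAN-style procedure, and the parameters <italic>eps</italic> and <italic>min_pts</italic> are all assumptions made for illustration, and the example strings are hypothetical.</p>
<preformat><![CDATA[
```python
def edit_distance(a, b):
    """Levenshtein distance between two architecture strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def dbscan(items, eps, min_pts):
    """DBSCAN-style clustering: labels 0, 1, ... for clusters, -1 for noise."""
    n = len(items)
    # Neighbourhood of each item (includes the item itself, distance 0).
    neighbours = [
        [j for j in range(n) if edit_distance(items[i], items[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # expand through core points only
    return labels


# Hypothetical architectures: M = MAD, R = RBH, A = ABD, T = transglutaminase.
architectures = ["MRAAAA", "MRAAA", "MRAAAAA", "MAAT", "MAT", "MAAAT", "MRRRR"]
print(dbscan(architectures, eps=1, min_pts=2))
```
]]></preformat>
<p>In this toy example, the RBH-plus-ABD architectures and the transglutaminase-type architectures fall into two separate clusters, and the RBH-only string is left unassigned.</p>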
<fig position="float" id="fig5">
<label>Figure 5</label>
<caption>
<p>Phylogenetic tree of <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic> ALPs showing ALP groups based on domain architecture. These 1,000 bootstrapped neighbor-joining trees are constructed using 49 <italic>M. smithii</italic> ALPs and 42 <italic>M. stadtmanae</italic> full-length ALPs. The tree is displayed in circular mode with branch lengths ignored to keep the clarity of the display. Against each branch, the domain architecture of individual proteins is shown as described in the Methods section. Branches are labeled with the NCBI accession numbers of protein. The branches are colored for two methanogens, <italic>M. smithii</italic> (yellow) and <italic>M. stadtmanae</italic> (brown) similar to that of <xref ref-type="fig" rid="fig2">Figure 2</xref>. Five different domains have been marked as shown in &#x201C;Domains&#x201D; legend. Protein lengths are scaled in the tree display. Four ALPs are marked with an asterisk. These ALPs have ABD domains with nine &#x03B2;-strand Greek-key topologies. ALP groups and subgroups have been shown using six pastel colors as marked in &#x201C;ALP groups&#x201D; legend. The ALP groups do not form clusters in the phylogenetic tree because the tree is created from sequence-based clustering.</p>
</caption>
<graphic xlink:href="fmicb-15-1463715-g005.tif"/>
</fig>
<sec id="sec10">
<title>Group-I</title>
<p>This is the largest ALP group in both <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> (<italic>n</italic>&#x2009;=&#x2009;52), and most proteins in this group consist of only three domain types, i.e., MAD, RBH domain, and ABD. No &#x201C;Other&#x201D; domain is present in this ALP group. The MAD is N-terminal in 48 ALPs, the only modifications being a duplication in one ALP (YP_001272624 of <italic>M. smithii</italic>) and an additional C-terminal MAD in two ALPs (YP_001272984 and YP_001273972 of <italic>M. smithii</italic>). Visual inspection indicates that this group can be further divided into two subgroups:</p>
<list list-type="simple">
<list-item>
<p><italic>Subgroup-IA:</italic> In most cases (<italic>n</italic>&#x2009;=&#x2009;40), the RBH domain is adjacent to the MAD, followed by a varying number of ABD repeats. The RBH domain is present as a single domain in the majority of ALPs (10 out of 16 <italic>M. smithii</italic> ALPs and 18 out of 24 <italic>M. stadtmanae</italic> ALPs). Several variations of this pattern are observed in this subgroup, e.g., multiple RBH domains can be present. A notable example is YP_001273761, which has seven RBH domain repeats at the N-terminus, followed by 18 ABD repeats. This is also the largest ALP found in <italic>M. smithii</italic>, with a length of 4,691 amino acids. Compared to other subgroups, the ALPs of this subgroup are larger.</p>
</list-item>
<list-item>
<p><italic>Subgroup-IB:</italic> Similar to subgroup-IA, ALPs of this subgroup contain only ABD and RBH domains in addition to the MAD; however, the relative positions of the ABD and RBH domains are not fixed. In some ALPs, ABD repeats are found N-terminal to the RBH domain, e.g., YP_448130. Except for YP_001273972, all other ALPs of this subgroup have a single MAD at the N-terminus. YP_001274127 has an amphipathic helix at the N-terminus, which could act as a membrane anchor as there is no TM domain. This ALP is also unique in having a small RBH domain at the C-terminus, with the ABDs located N-terminal to it. <italic>M. smithii</italic> has seven and <italic>M. stadtmanae</italic> has five ALPs belonging to this subgroup. No membrane anchor domain was located in YP_447953 of <italic>M. stadtmanae</italic>.</p>
</list-item>
</list>
</sec>
<sec id="sec11">
<title>Group-II</title>
<p>Adhesin-like proteins of this group have only one domain type, either ABD or RBH, in addition to a MAD at either or both termini. Although such ALPs are present in our dataset (15 in <italic>M. smithii</italic> and five in <italic>M. stadtmanae</italic>), it is possible that these proteins have unidentified N- or C-terminal domains because of low sequence homology to the domains in AlphaFold structures. This is supported by the fact that four <italic>M. smithii</italic> ALPs of this group are only partially annotated for domains. It is also possible that the protein sequences of these ALPs are partial; for example, YP_001274262 has only one ABD in addition to two amphipathic helices possibly acting as MADs at the N-terminus, and it is only 203 amino acids long. Similarly, YP_001274311 of <italic>M. smithii</italic> is 156 amino acids long and has one amphipathic helix at the N-terminus and only one ABD.</p>
<list list-type="simple">
<list-item>
<p><italic>Subgroup-IIA</italic>: ALPs of this subgroup (<italic>n</italic>&#x2009;=&#x2009;13) are characterized by having only ABDs, while RBH and &#x2018;Other&#x2019; domains are completely absent. A MAD is N-terminal in eight ALPs, C-terminal in one ALP, and present on both termini in one ALP, while no MAD could be identified in the remaining three ALPs. The ABD is present in a varying number of repeats, e.g., 1&#x2013;17 repeats.</p>
</list-item>
<list-item>
<p><italic>Subgroup-IIB</italic>: This group of ALPs has only the RBH domain and no other domain in addition to MAD. Five ALPs belong to this subgroup (four from <italic>M. smithii</italic> and one from <italic>M. stadtmanae</italic>). All ALPs from <italic>M. smithii</italic> of this group have a MAD on both termini, while <italic>M. stadtmanae</italic> ALP has one only N-terminally. Repeats of the RBH domain are present, which could mediate interactions by extending the length of ALPs.</p>
</list-item>
</list>
</sec>
<sec id="sec12">
<title>Group-III [Transglutaminase (TG)-type]</title>
<p>Adhesin-like proteins in this group have a transglutaminase domain at the C-terminus and ABDs at the N-terminus. Pfam also identified a pseudomurein-binding repeat domain beside the transglutaminase domain. Compared to other groups, this group has shorter ALPs. These ALPs have a single N-terminal transmembrane helix serving as a membrane anchor. Furthermore, there are no RBH domains in this group of ALPs, which may suggest that, in ALPs of other groups, the RBH domain acts as an adhesive domain in addition to extending ALPs to reach the surface of other microorganisms. Currently, there are only four ALPs in this group from the two organisms.</p>
</sec>
<sec id="sec13">
<title>Others</title>
<p>There are seven <italic>M. smithii</italic> and six <italic>M. stadtmanae</italic> ALPs in our dataset that contain domains in addition to RBH, ABD, or transglutaminase domains (<xref ref-type="table" rid="tab2">Table 2</xref>). These unique domains could be involved in specific substrate binding. YP_001273741 has a MAD on both termini. The additional domains of the seven <italic>M. smithii</italic> ALPs have alpha&#x2013;beta folds in the closest AlphaFold structures, whereas those of the six <italic>M. stadtmanae</italic> ALPs are as annotated by Pfam. Furthermore, the four <italic>M. smithii</italic> ALPs (YP_001272746, YP_001272984, YP_001274106, and YP_001274107) with a longer ABD (with nine <italic>&#x03B2;</italic>-strands) are marked with an asterisk in the phylogenetic tree (<xref ref-type="fig" rid="fig5">Figure 5</xref>). YP_001272746 and YP_001272984 have an ABD with a beta-strand extending from a loop of the RBH domain. These two ALPs have a MAD on both termini. It is interesting to note that most ALPs in other groups with a MAD on both termini have single domains and no repeats of ABD or RBH domains. Since these ABDs are structurally distinct from other ABDs, they might indicate functional divergence from other ALPs.</p>
</sec>
</sec>
</sec>
<sec sec-type="discussion" id="sec14">
<title>Discussion</title>
<p>Nearly 20&#x2009;years after their discovery, this study revisits the repertoire of adhesin-like proteins of <italic>M. stadtmanae</italic> and <italic>M. smithii,</italic> with a special emphasis on individual domain structures and overall domain architecture. The analysis is particularly aided by recent advances in structure prediction, which have enabled us to identify and compare different domain types as well as variants within a domain type, e.g., different ABD types. These findings allow us to define minimal ALPs more specifically, e.g., by the presence of an ABD (and/or potentially also an RBH domain), which are common features of the analyzed ALPs in these two organisms. Further analyses of other methanogens are required to determine whether such a minimal criterion can identify ALPs in other organisms, especially those that are more distantly related, such as <italic>Methanomassiliicoccales</italic>.</p>
<p>Using structure predictions has allowed us to characterize ALPs in much more detail, but the approach also has caveats. While AlphaFold offers insights into the potential folding of proteins, it is also known to have limitations. The inability to perform structure predictions of long proteins can be overcome, but it is currently more difficult to assess whether the predicted protein may also bind co-factors or ions. For example, for the <italic>Salmonella</italic> giant adhesin SiiE, a conserved sequence motif of five aspartates has been reported and experimentally verified to be involved in the binding of Ca<sup>2+</sup>. Calcium ions have been implicated in stabilizing and rigidifying the protein (<xref ref-type="bibr" rid="ref4">Barlag and Hensel, 2015</xref>; <xref ref-type="bibr" rid="ref17">Griessl et al., 2013</xref>; <xref ref-type="bibr" rid="ref18">Guo et al., 2017</xref>). The same motif does not appear to be conserved in ABDs, but the presence of other conserved residues, such as aspartic acid in loops, suggests that the ABDs reported here may also bind ions such as Ca<sup>2+</sup>, which may point toward similar mechanisms. Likewise, the potential presence of post-translational modifications in ABDs and other ALP domains may only be elucidated through experimental analyses of purified proteins.</p>
<p>Adhesin-like proteins are likely to play an important role under <italic>in vivo</italic> conditions, where growth substrates may be limited, and cell growth is more likely to occur in syntrophic and biofilm-like microbe&#x2013;microbe interactions. The actual binding of ALPs to a substrate also cannot currently be deduced from the structural predictions. Some of the bacterial big domains and the bacterial repetitive beta-helical domains have been experimentally verified to bind carbohydrates (<xref ref-type="bibr" rid="ref9">Burnim et al., 2024</xref>). However, whether ALPs function similarly remains to be experimentally investigated. The observed repetitive beta-helical folds point toward the binding of polymeric structures, such as polysaccharides, as has been proposed for bacterial proteins with these domains (<xref ref-type="bibr" rid="ref24">Jenkins and Pickersgill, 2001</xref>) and identified for some proteins with similar folds, such as pectin lyase (<xref ref-type="bibr" rid="ref51">Yoder et al., 1993</xref>). None of the currently known methanogens can produce methane from polysaccharides, indicating that a catalytic function of ALPs in direct polysaccharide degradation is unlikely in methanogens. However, RBH domains from methanogen ALPs may still fulfill important functions as purely binding/adhesive domains that enable methanogens to adhere to other microorganisms that may provide substrates for methanogenesis. The proximity of methanogens to polysaccharide degraders would be beneficial because the fermentation of polysaccharides, such as pectin, produces gaseous growth substrates (CO<sub>2</sub> and hydrogen) that diffuse slowly through liquid matter, particularly in the presence of potentially competing microorganisms, such as sulfate reducers or acetogens. 
It is therefore quite possible that ALPs, and specifically RBH domains, mediate the adhesion of methanogens to such microorganisms, for example, through binding bacterial (or fungal/protozoan) cell-surface glycans. Glycan compositions may vary considerably between different species, and this may also partially explain the large repertoire of ALPs in the methanogen genome. ALPs could therefore be an interesting target for disrupting specific microbial interactions and reducing methanogen populations, which could lower methane emissions or modulate the intestinal hydrogen metabolism in monogastrics.</p>
<p>Another explanation may be that ALPs bind other polymeric/polysaccharide structures that do not directly contribute to increasing the substrate transfer between fermenters and methanogens. This could be the binding of ALPs to host-surface structures (host glycans) to mediate adhesion and retention of methanogens in the anoxic gut environment, or the binding to other symbiotic partners, such as Nanoarchaeota, which have recently been described for some methanogens, such as <italic>Methanobrevibacter oralis</italic> (<xref ref-type="bibr" rid="ref21">Hassani et al., 2023</xref>). A recent study on extracellular vesicles (EVs) may also be of interest in this regard (<xref ref-type="bibr" rid="ref49">Weinberger et al., 2024</xref>). It showed that EVs produced by gut methanogens are enriched in ALPs. ALPs may therefore allow EVs to be directed to specific interaction partners or link them to the cell the EV originated from, e.g., through ALPs with pseudomurein-binding domains.</p>
<p>Finally, it cannot be completely ruled out that some ALPs bind to polysaccharides (or other polymers) found in organic matter. This may help to reduce the distance between the methanogen and the polysaccharide degrader to enable the growth of the methanogens, and/or allow the methanogen to remain bound to the polysaccharide for longer periods when growth conditions are not favorable (e.g., oxic) and the metabolism of anaerobes stops or becomes dormant.</p>
<p>In conclusion, this bioinformatic analysis provides novel insights into the structure and domain architecture of ALPs. However, many questions surrounding ALPs remain unanswered and may require more than computational analysis. This concerns not only the exact function and mechanism of these proteins but also the regulation of their expression as well as their evolution. The framework presented here for the proposed grouping of ALPs is a starting point for the classification of ALPs. The number of ALP groups is likely to increase as more methanogen genomes are analyzed in depth for ALP architectures, as structural analyses of ALPs advance, and as experimental verifications of the archaeal &#x201C;adhesiome&#x201D; functions are performed.</p>
</sec>
<sec sec-type="materials|methods" id="sec15">
<title>Materials and methods</title>
<sec id="sec16">
<title>Identification of complete sets of ALPs in <italic>Methanobrevibacter smithii</italic> and <italic>Methanosphaera stadtmanae</italic></title>
<p>First, a raw dataset of putative ALPs was constructed by extracting protein FASTA sequences from an annotation-based search using keywords &#x201C;adhesin-like protein&#x201D; and &#x201C;Asn/Thr-rich large protein&#x201D; in 13 complete <italic>Methanobacteriales</italic> proteomes (<xref ref-type="supplementary-material" rid="SM1">Supplementary Table 7</xref>). This search picked 516 putative ALPs. These 516 ALPs were taken to query proteomes specifically of <italic>M. smithii</italic> and <italic>M. stadtmanae</italic> with the aim of identifying distantly related ALP sequences using the BLASTP algorithm with an e-value cutoff of 0.0001 (<xref ref-type="bibr" rid="ref1">Altschul et al., 1990</xref>). All the sequence hits thus obtained were combined in a single FASTA file, separately for both species, and the resulting FASTA file contained 58 and 47 unique sequences belonging to <italic>M. smithii</italic> and <italic>M. stadtmanae,</italic> respectively.</p>
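<p>The filtering and de-duplication of BLASTP hits described above can be sketched as follows. This is an illustration rather than the script used in this study: it assumes that hits are available in the standard 12-column BLAST tabular format (subject identifier in column 2, e-value in column 11) and that the cutoff is inclusive; the function name and the identifiers in the example are hypothetical.</p>
<preformat><![CDATA[
```python
def unique_hits(tabular_lines, max_evalue=1e-4):
    """Return the set of subject IDs with at least one hit at or below max_evalue.

    Assumes BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue:  # inclusive cutoff is an assumption
            hits.add(subject_id)
    return hits


# Hypothetical hits: two queries match subject_A; subject_B fails the cutoff.
example = [
    "query1\tsubject_A\t35.2\t210\t120\t6\t1\t205\t10\t215\t2e-20\t95.1",
    "query2\tsubject_A\t31.0\t180\t110\t5\t3\t178\t12\t190\t5e-08\t60.2",
    "query2\tsubject_B\t28.4\t90\t60\t3\t1\t88\t40\t128\t0.3\t30.1",
]
print(sorted(unique_hits(example)))  # prints ['subject_A']
```
]]></preformat>
<p>Collecting subject identifiers in a set ensures that a sequence hit by several queries is counted once, which corresponds to the unique sequences retained per species.</p>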
</sec>
<sec id="sec17">
<title>Domain annotations of ALPs</title>
<p>Each of the 105 ALPs of both organisms was queried against the AlphaFold database (<xref ref-type="bibr" rid="ref25">Jumper et al., 2021</xref>; <xref ref-type="bibr" rid="ref46">Varadi et al., 2021</xref>) via EBI&#x2019;s &#x201C;Sequence Similarity Search&#x201D; tool<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref> to identify the closest structure based on sequence homology. In the first round of searches, we used full-length ALP sequences. Overall, 43 of 58 <italic>M. smithii</italic> ALPs and 32 of 47 <italic>M. stadtmanae</italic> ALPs could be matched to &#x003E;90% identity for most of their sequence length, while 11 <italic>M. smithii</italic> and 13 <italic>M. stadtmanae</italic> ALPs matched to 40&#x2013;90% identity. Structures with the highest matching score were downloaded from the AlphaFold database. The amino acid sequences of the downloaded structures were individually aligned with the corresponding ALP sequence by ClustalX version 2.0 (<xref ref-type="bibr" rid="ref28">Larkin et al., 2007</xref>) to locate the putative structural domains on the ALP protein sequences. The remaining ALPs (five of 58 ALPs in <italic>M. smithii</italic> and eight of 47 ALPs in <italic>M. stadtmanae</italic>) had sequence identity between 20 and 30%, and the domains identified for these ALPs from sequence homology to the nearest structures were therefore less reliable (<xref ref-type="supplementary-material" rid="SM1">Supplementary Tables 5</xref><xref ref-type="supplementary-material" rid="SM1">A</xref>,<xref ref-type="supplementary-material" rid="SM1">B</xref>).</p>
<p>Since AlphaFold structures currently have a limit of 2,700 amino acids per protein, many ALP sequences could be annotated only partially for domains. For the remaining part of these ALP sequences (the regions that could not be annotated in the first round), we performed another iteration of identifying the closest structures by querying only the unannotated part of the ALP sequence and keeping the identity cutoff at 40%. This identified further domains in ALPs and filled annotation gaps. In a few cases, if at least one ABD was identified by AlphaFold, we were able to manually identify other ABD repeats. For example, the two closest AlphaFold structures (AlphaFold IDs: B9ACY6 and R7PVK4) were downloaded for an ALP (Accession: YP_001272746, length: 2879) from <italic>M. smithii</italic> that matched to &#x003E;90% sequence identity. Only one ABD was present in B9ACY6, while six ABDs were located based on R7PVK4. However, we could locate another 17 ABD repeats similar to the ones located with the help of the nearest AlphaFold structures and extend the domains further at the terminus. Similarly, an AlphaFold structure (AlphaFold ID: B9AFH2) was matched to an <italic>M. smithii</italic> ALP (Accession: YP_001273761) at 78% identity. The structure had only two RBH domains; however, we could locate five more RBH domain repeats on this ALP sequence. We have also marked these manually identified domains in <xref ref-type="fig" rid="fig5">Figure 5</xref> together with other domains.</p>
<p>In rare cases where domains were not identified by AlphaFold, we took annotations from Pfam to improve the domain identifications. InterProScan searches (<xref ref-type="bibr" rid="ref5">Blum et al., 2021</xref>; <xref ref-type="bibr" rid="ref36">Paysan-Lafosse et al., 2022</xref>) were carried out for all ALPs. The sequences of domains identified by Pfam were aligned with annotated domains in other ALPs, followed by manually extending the domain boundaries on both sides. Most Pfam predictions agreed with AlphaFold predictions except ABD, which was assigned bacterial Ig-like domains (Pfam IDs: PF16640, PF02369, and PF02368) by Pfam. This could be because the ABD domain definition is missing in the UniProt database. Pfam was also useful for assigning functional annotation for AlphaFold domains in ALPs, as listed in <xref ref-type="table" rid="tab2">Table 2</xref>.</p>
<p>The transmembrane helices were identified by TMHMM<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref> (<xref ref-type="bibr" rid="ref19">Hallgren et al., 2022</xref>) and confirmed by the presence of helical structures in AlphaFold structures. HeliQuest was used to predict amphipathic helices with an analysis window size of 18 amino acids (<xref ref-type="bibr" rid="ref16">Gautier et al., 2008</xref>). In general, hydrophobicity values of approximately 0.5 indicate the possibility of forming an amphipathic helix. The PDB files were displayed using ChimeraX (<xref ref-type="bibr" rid="ref37">Pettersen et al., 2021</xref>).</p>
</sec>
<sec id="sec18">
<title>Sequence alignments, phylogenetic analysis, and sequence logos</title>
<p>All multiple sequence alignments were created by Clustal version 2.0 (<xref ref-type="bibr" rid="ref28">Larkin et al., 2007</xref>; <xref ref-type="bibr" rid="ref45">Thompson et al., 1997</xref>) using default parameters. The phylogenetic trees were calculated using the neighbor-joining method (<xref ref-type="bibr" rid="ref40">Saitou and Nei, 1987</xref>) as implemented in Clustal v2.0, based on the multiple sequence alignment and 1,000 bootstrap trials to confirm the robustness of branches (<xref ref-type="bibr" rid="ref14">Felsenstein, 1985</xref>). The trees were displayed by iTOL v6.8<xref ref-type="fn" rid="fn0003"><sup>3</sup></xref> (<xref ref-type="bibr" rid="ref30">Letunic and Bork, 2021</xref>). The sequence logo was created by WebLogo<xref ref-type="fn" rid="fn0004"><sup>4</sup></xref> using a user-defined coloring scheme.</p>
<p>We also downloaded 82 bacterial stalk domain sequences from the Pfam database using the accessions given in <xref ref-type="bibr" rid="ref34">Monzon et al. (2021)</xref>, representing 3,542 bacterial fibrillar adhesins. These domains belong to the ESET clan and other clans. Pairwise sequence comparisons between 80 bacterial stalk domains and the major clades of ABD domains were carried out using Clustal Omega v1.2.4 (<xref ref-type="bibr" rid="ref42">Sievers et al., 2011</xref>) in clustering mode. The pairwise distance matrix was calculated and converted into percent identities. The percent identity values were averaged within and between ABD clades and the other groups, and are reported in <xref ref-type="table" rid="tab3">Table 3</xref> separately for the two organisms.</p>
<table-wrap position="float" id="tab3">
<label>Table 3</label>
<caption>
<p>Average percent identities within and between ABD clades, and between ABD clades and bacterial stalk domains.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="left" valign="top">Clade-I ABD</th>
<th align="left" valign="top">Clade-II ABD</th>
<th align="left" valign="top">Clade-III ABD</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Clade-I</td>
<td align="left" valign="top">26.4 (<italic>M. smithii</italic>-32, <italic>M. stadtmanae</italic>-34)</td>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="top">Clade-II</td>
<td align="left" valign="top"><italic>M. smithii</italic>-16.8, <italic>M. stadtmanae</italic>-17</td>
<td align="left" valign="top">19.7 (<italic>M. smithii</italic>-27, <italic>M. stadtmanae</italic>-20)</td>
<td/>
</tr>
<tr>
<td align="left" valign="top">Clade-III</td>
<td align="left" valign="top"><italic>M. smithii</italic>-15.2</td>
<td align="left" valign="top"><italic>M. smithii</italic>-16.6</td>
<td align="left" valign="top"><italic>M. smithii</italic>-31.9</td>
</tr>
<tr>
<td align="left" valign="top">Bacterial stalk</td>
<td align="left" valign="top">14.2</td>
<td align="left" valign="top">14.5</td>
<td align="left" valign="top">14.8</td>
</tr>
</tbody>
</table>
</table-wrap>
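<p>The averaging step behind Table 3 can be sketched in Python as follows. This is a minimal illustration rather than the script used in this study: it assumes that distances are fractional (0 to 1) and that percent identity is 100&#x2009;&#x00D7;&#x2009;(1&#x2009;&#x2212;&#x2009;distance), and the function names and the dictionary layout keyed by sequence-identifier pairs are hypothetical.</p>
<preformat preformat-type="code">
```python
from itertools import combinations, product


def percent_identity(distance):
    """Convert a fractional distance (0-1) to percent identity (assumed mapping)."""
    return 100.0 * (1.0 - distance)


def mean_identity(dist, group_a, group_b=None):
    """Average percent identity within group_a, or between group_a and group_b.

    dist maps frozenset({id1, id2}) to a pairwise distance.
    Returns None when there are no pairs to average.
    """
    if group_b is None:
        pairs = list(combinations(group_a, 2))   # within-group pairs
    else:
        pairs = list(product(group_a, group_b))  # between-group pairs
    values = [percent_identity(dist[frozenset(pair)]) for pair in pairs]
    if not values:
        return None
    return sum(values) / len(values)
```
</preformat>
<p>Calling <monospace>mean_identity</monospace> with one group gives a within-group average (the diagonal of such a table), and calling it with two groups gives a between-group average (the off-diagonal cells).</p>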
</sec>
<sec id="sec19">
<title>Density-based clustering of ALPs based on domain architecture</title>
<p>Adhesin-like protein domains were abbreviated by single-letter codes (TMD&#x2009;=&#x2009;t, RBH&#x2009;=&#x2009;r, ABD&#x2009;=&#x2009;a, pseudomurein-binding domain&#x2009;=&#x2009;b, transglutaminase&#x2009;=&#x2009;g, OTHER&#x2009;=&#x2009;o), and the domain sequence of each ALP was transcribed into a string of letters (e.g., TMD-RBH-ABD-ABD-ABD-ABD-ABD&#x2009;=&#x2009;traaaaa). The DBSCAN library in R was used to perform density-based clustering of these strings. Clusters were defined by an epsilon (eps) of 0.5 and a minimum number of points (minPts) of 1; all other parameters were left at their defaults.</p>
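<p>To make the procedure concrete, the following Python sketch transcribes domain architectures into letter strings and applies a small, self-contained DBSCAN. It is an illustration under stated assumptions, not the R analysis itself: the edit (Levenshtein) distance between strings is assumed here because the distance measure is not specified above, and the function names are hypothetical.</p>
<preformat preformat-type="code">
```python
# Single-letter codes from the text; any other domain maps to "o".
CODES = {"TMD": "t", "RBH": "r", "ABD": "a",
         "pseudomurein-binding domain": "b", "transglutaminase": "g"}


def encode(domains):
    """Transcribe an ordered list of domain names into a letter string."""
    return "".join(CODES.get(name, "o") for name in domains)


def edit_distance(a, b):
    """Levenshtein distance (assumed metric; not stated in the text)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def dbscan(items, eps=0.5, min_pts=1, metric=edit_distance):
    """Return one cluster label per item (1, 2, ...); -1 marks noise."""
    labels = [None] * len(items)  # None = not yet visited

    def neighbours(i):
        # A point counts as its own neighbour (distance 0).
        return [j for j in range(len(items)) if eps >= metric(items[i], items[j])]

    cluster = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if min_pts > len(seeds):
            labels[i] = -1  # not a core point; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels
```
</preformat>
<p>Under these assumptions, minPts&#x2009;=&#x2009;1 makes every string a core point, so no string is labeled as noise, and an eps of 0.5 on an integer-valued distance groups only identical strings; in effect, each distinct domain architecture forms its own cluster.</p>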
</sec>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="sec20">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SM1">Supplementary material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec sec-type="author-contributions" id="sec21">
<title>Author contributions</title>
<p>AG: Conceptualization, Formal analysis, Writing &#x2013; original draft, Visualization. HS: Conceptualization, Formal analysis, Writing &#x2013; original draft, Supervision.</p>
</sec>
<sec sec-type="funding-information" id="sec22">
<title>Funding</title>
<p>The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by Temasek Life Sciences Laboratory, Singapore.</p>
</sec>
<sec sec-type="COI-statement" id="sec23">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="sec24">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="sec25">
<title>Supplementary material</title>
<p>The Supplementary material for this article can be found online at: <ext-link xlink:href="https://www.frontiersin.org/articles/10.3389/fmicb.2024.1463715/full#supplementary-material" ext-link-type="uri">https://www.frontiersin.org/articles/10.3389/fmicb.2024.1463715/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.PDF" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<fn-group>
<fn id="fn0001">
<p><sup>1</sup><ext-link xlink:href="https://www.ebi.ac.uk/Tools/sss/fasta/" ext-link-type="uri">https://www.ebi.ac.uk/Tools/sss/fasta/</ext-link>
</p>
</fn>
<fn id="fn0002">
<p><sup>2</sup><ext-link xlink:href="https://services.healthtech.dtu.dk/services/TMHMM-2.0/" ext-link-type="uri">https://services.healthtech.dtu.dk/services/TMHMM-2.0/</ext-link>
</p>
</fn>
<fn id="fn0003">
<p><sup>3</sup><ext-link xlink:href="https://itol.embl.de/" ext-link-type="uri">https://itol.embl.de/</ext-link>
</p>
</fn>
<fn id="fn0004">
<p><sup>4</sup><ext-link xlink:href="https://weblogo.berkeley.edu/logo.cgi" ext-link-type="uri">https://weblogo.berkeley.edu/logo.cgi</ext-link>
</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altschul</surname> <given-names>S. F.</given-names></name> <name><surname>Gish</surname> <given-names>W.</given-names></name> <name><surname>Miller</surname> <given-names>W.</given-names></name> <name><surname>Myers</surname> <given-names>E. W.</given-names></name> <name><surname>Lipman</surname> <given-names>D. J.</given-names></name></person-group> (<year>1990</year>). <article-title>Basic local alignment search tool</article-title>. <source>J. Mol. Biol.</source> <volume>215</volume>, <fpage>403</fpage>&#x2013;<lpage>410</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0022-2836(05)80360-2</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baeza-Delgado</surname> <given-names>C.</given-names></name> <name><surname>Marti-Renom</surname> <given-names>M. A.</given-names></name> <name><surname>Mingarro</surname> <given-names>I.</given-names></name></person-group> (<year>2013</year>). <article-title>Structure-based statistical analysis of transmembrane helices</article-title>. <source>Eur. Biophys. J.</source> <volume>42</volume>, <fpage>199</fpage>&#x2013;<lpage>207</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s00249-012-0813-9</pub-id>, PMID: <pub-id pub-id-type="pmid">22588483</pub-id></citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baeza-Delgado</surname> <given-names>C.</given-names></name> <name><surname>von Heijne</surname> <given-names>G.</given-names></name> <name><surname>Marti-Renom</surname> <given-names>M. A.</given-names></name> <name><surname>Mingarro</surname> <given-names>I.</given-names></name></person-group> (<year>2016</year>). <article-title>Biological insertion of computationally designed short transmembrane segments</article-title>. <source>Sci. Rep.</source> <volume>6</volume>:<fpage>23397</fpage>. doi: <pub-id pub-id-type="doi">10.1038/srep23397</pub-id>, PMID: <pub-id pub-id-type="pmid">26987712</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barlag</surname> <given-names>B.</given-names></name> <name><surname>Hensel</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>The giant adhesin SiiE of <italic>Salmonella enterica</italic></article-title>. <source>Molecules</source> <volume>20</volume>, <fpage>1134</fpage>&#x2013;<lpage>1150</lpage>. doi: <pub-id pub-id-type="doi">10.3390/molecules20011134</pub-id>, PMID: <pub-id pub-id-type="pmid">25587788</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blum</surname> <given-names>M.</given-names></name> <name><surname>Chang</surname> <given-names>H. Y.</given-names></name> <name><surname>Chuguransky</surname> <given-names>S.</given-names></name> <name><surname>Grego</surname> <given-names>T.</given-names></name> <name><surname>Kandasaamy</surname> <given-names>S.</given-names></name> <name><surname>Mitchell</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>The InterPro protein families and domains database: 20 years on</article-title>. <source>Nucleic Acids Res.</source> <volume>49</volume>, <fpage>D344</fpage>&#x2013;<lpage>D354</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkaa977</pub-id>, PMID: <pub-id pub-id-type="pmid">33156333</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Borrel</surname> <given-names>G.</given-names></name> <name><surname>McCann</surname> <given-names>A.</given-names></name> <name><surname>Deane</surname> <given-names>J.</given-names></name> <name><surname>Neto</surname> <given-names>M. C.</given-names></name> <name><surname>Lynch</surname> <given-names>D. B.</given-names></name> <name><surname>Brug&#x00E8;re</surname> <given-names>J. F.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Genomics and metagenomics of trimethylamine-utilizing Archaea in the human gut microbiome</article-title>. <source>ISME J.</source> <volume>11</volume>, <fpage>2059</fpage>&#x2013;<lpage>2074</lpage>. doi: <pub-id pub-id-type="doi">10.1038/ismej.2017.72</pub-id>, PMID: <pub-id pub-id-type="pmid">28585938</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bradley</surname> <given-names>P.</given-names></name> <name><surname>Cowen</surname> <given-names>L.</given-names></name> <name><surname>Menke</surname> <given-names>M.</given-names></name> <name><surname>King</surname> <given-names>J.</given-names></name> <name><surname>Berger</surname> <given-names>B.</given-names></name></person-group> (<year>2001</year>). <article-title>BETAWRAP: successful prediction of parallel beta-helices from primary sequence reveals an association with many microbial pathogens</article-title>. <source>Proc. Natl. Acad. Sci. USA</source> <volume>98</volume>, <fpage>14819</fpage>&#x2013;<lpage>14824</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.251267298</pub-id>, PMID: <pub-id pub-id-type="pmid">11752429</pub-id></citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bryan</surname> <given-names>A. W.</given-names> <suffix>Jr.</suffix></name> <name><surname>Starner-Kreinbrink</surname> <given-names>J. L.</given-names></name> <name><surname>Hosur</surname> <given-names>R.</given-names></name> <name><surname>Clark</surname> <given-names>P. L.</given-names></name> <name><surname>Berger</surname> <given-names>B.</given-names></name></person-group> (<year>2011</year>). <article-title>Structure-based prediction reveals capping motifs that inhibit &#x03B2;-helix aggregation</article-title>. <source>Proc. Natl. Acad. Sci. USA</source> <volume>108</volume>, <fpage>11099</fpage>&#x2013;<lpage>11104</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.1017504108</pub-id>, PMID: <pub-id pub-id-type="pmid">21685332</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Burnim</surname> <given-names>A. A.</given-names></name> <name><surname>Dufault-Thompson</surname> <given-names>K.</given-names></name> <name><surname>Jiang</surname> <given-names>X.</given-names></name></person-group> (<year>2024</year>). <article-title>The three-sided right-handed &#x03B2;-helix is a versatile fold for glycan interactions</article-title>. <source>Glycobiology</source> <volume>34</volume>. doi: <pub-id pub-id-type="doi">10.1093/glycob/cwae037</pub-id>, PMID: <pub-id pub-id-type="pmid">38767844</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chau</surname> <given-names>S.</given-names></name> <name><surname>Fujii</surname> <given-names>A.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Vandebroek</surname> <given-names>A.</given-names></name> <name><surname>Goda</surname> <given-names>W.</given-names></name> <name><surname>Yasui</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Di-lysine motif-like sequences formed by deleting the C-terminal domain of aquaporin-4 prevent its trafficking to the plasma membrane</article-title>. <source>Genes Cells</source> <volume>26</volume>, <fpage>152</fpage>&#x2013;<lpage>164</lpage>. doi: <pub-id pub-id-type="doi">10.1111/gtc.12829</pub-id></citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de la Cuesta-Zuluaga</surname> <given-names>J.</given-names></name> <name><surname>Spector</surname> <given-names>T. D.</given-names></name> <name><surname>Youngblut</surname> <given-names>N. D.</given-names></name> <name><surname>Ley</surname> <given-names>R. E.</given-names></name></person-group> (<year>2021</year>). <article-title>Genomic insights into adaptations of trimethylamine-utilizing methanogens to diverse habitats, including the human gut</article-title>. <source>mSystems</source> <volume>6</volume>:<fpage>e00939</fpage>. doi: <pub-id pub-id-type="doi">10.1128/mSystems.00939-20</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Marothy</surname> <given-names>M. T.</given-names></name> <name><surname>Elofsson</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Marginally hydrophobic transmembrane &#x03B1;-helices shaping membrane protein folding</article-title>. <source>Protein Sci.</source> <volume>24</volume>, <fpage>1057</fpage>&#x2013;<lpage>1074</lpage>. doi: <pub-id pub-id-type="doi">10.1002/pro.2698</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Planque</surname> <given-names>M. R. R.</given-names></name> <name><surname>Greathouse</surname> <given-names>D. V.</given-names></name> <name><surname>Koeppe</surname> <given-names>R. E.</given-names></name> <name><surname>Sch&#x00E4;fer</surname> <given-names>H.</given-names></name> <name><surname>Marsh</surname> <given-names>D.</given-names></name> <name><surname>Killian</surname> <given-names>J. A.</given-names></name></person-group> (<year>1998</year>). <article-title>Influence of lipid/peptide hydrophobic mismatch on the thickness of diacylphosphatidylcholine bilayers. A 2H NMR and ESR study using designed transmembrane &#x03B1;-helical peptides and gramicidin A</article-title>. <source>Biochemistry</source> <volume>37</volume>, <fpage>9333</fpage>&#x2013;<lpage>9345</lpage>. doi: <pub-id pub-id-type="doi">10.1021/bi980233r</pub-id>, PMID: <pub-id pub-id-type="pmid">9649314</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Felsenstein</surname> <given-names>J.</given-names></name></person-group> (<year>1985</year>). <article-title>Confidence limits on phylogenies: an approach using the bootstrap</article-title>. <source>Evolution</source> <volume>39</volume>, <fpage>783</fpage>&#x2013;<lpage>791</lpage>. doi: <pub-id pub-id-type="doi">10.2307/2408678</pub-id>, PMID: <pub-id pub-id-type="pmid">28561359</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fricke</surname> <given-names>W. F.</given-names></name> <name><surname>Seedorf</surname> <given-names>H.</given-names></name> <name><surname>Henne</surname> <given-names>A.</given-names></name> <name><surname>Kr&#x00FC;er</surname> <given-names>M.</given-names></name> <name><surname>Liesegang</surname> <given-names>H.</given-names></name> <name><surname>Hedderich</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2006</year>). <article-title>The genome sequence of <italic>Methanosphaera stadtmanae</italic> reveals why this human intestinal archaeon is restricted to methanol and H2 for methane formation and ATP synthesis</article-title>. <source>J. Bacteriol.</source> <volume>188</volume>, <fpage>642</fpage>&#x2013;<lpage>658</lpage>. doi: <pub-id pub-id-type="doi">10.1128/JB.188.2.642-658.2006</pub-id>, PMID: <pub-id pub-id-type="pmid">16385054</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gautier</surname> <given-names>R.</given-names></name> <name><surname>Douguet</surname> <given-names>D.</given-names></name> <name><surname>Antonny</surname> <given-names>B.</given-names></name> <name><surname>Drin</surname> <given-names>G.</given-names></name></person-group> (<year>2008</year>). <article-title>HELIQUEST: a web server to screen sequences with specific alpha-helical properties</article-title>. <source>Bioinformatics</source> <volume>24</volume>, <fpage>2101</fpage>&#x2013;<lpage>2102</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btn392</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Griessl</surname> <given-names>M. H.</given-names></name> <name><surname>Schmid</surname> <given-names>B.</given-names></name> <name><surname>Kassler</surname> <given-names>K.</given-names></name> <name><surname>Braunsmann</surname> <given-names>C.</given-names></name> <name><surname>Ritter</surname> <given-names>R.</given-names></name> <name><surname>Barlag</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Structural insight into the giant Ca2+-binding adhesin SiiE: implications for the adhesion of <italic>Salmonella enterica</italic> to polarized epithelial cells</article-title>. <source>Structure</source> <volume>21</volume>, <fpage>741</fpage>&#x2013;<lpage>752</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.str.2013.02.020</pub-id>, PMID: <pub-id pub-id-type="pmid">23562396</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>S.</given-names></name> <name><surname>Stevens</surname> <given-names>C. A.</given-names></name> <name><surname>Vance</surname> <given-names>T. D. R.</given-names></name> <name><surname>Olijve</surname> <given-names>L. L. C.</given-names></name> <name><surname>Graham</surname> <given-names>L. A.</given-names></name> <name><surname>Campbell</surname> <given-names>R. L.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Structure of a 1.5-MDa adhesin that binds its Antarctic bacterium to diatoms and ice</article-title>. <source>Sci. Adv.</source> <volume>3</volume>:<fpage>e1701440</fpage>. doi: <pub-id pub-id-type="doi">10.1126/sciadv.1701440</pub-id>, PMID: <pub-id pub-id-type="pmid">28808685</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Hallgren</surname> <given-names>J.</given-names></name> <name><surname>Tsirigos</surname> <given-names>K. D.</given-names></name> <name><surname>Pedersen</surname> <given-names>M. D.</given-names></name> <name><surname>Almagro Armenteros</surname> <given-names>J. J.</given-names></name> <name><surname>Marcatili</surname> <given-names>P.</given-names></name></person-group> (<year>2022</year>). DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks. bioRxiv [Preprint]. doi: <pub-id pub-id-type="doi">10.1101/2022.04.08.487609</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hansen</surname> <given-names>E. E.</given-names></name> <name><surname>Lozupone</surname> <given-names>C. A.</given-names></name> <name><surname>Rey</surname> <given-names>F. E.</given-names></name> <name><surname>Wu</surname> <given-names>M.</given-names></name> <name><surname>Guruge</surname> <given-names>J. L.</given-names></name> <name><surname>Narra</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Pan-genome of the dominant human gut-associated archaeon, <italic>Methanobrevibacter smithii</italic>, studied in twins</article-title>. <source>Proc. Natl. Acad. Sci. USA</source> <volume>108</volume>, <fpage>4599</fpage>&#x2013;<lpage>4606</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.1000071108</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hassani</surname> <given-names>Y.</given-names></name> <name><surname>Aboudharam</surname> <given-names>G.</given-names></name> <name><surname>Drancourt</surname> <given-names>M.</given-names></name> <name><surname>Grine</surname> <given-names>G.</given-names></name></person-group> (<year>2023</year>). <article-title>Current knowledge and clinical perspectives for a unique new phylum: Nanaorchaeota</article-title>. <source>Microbiol. Res.</source> <volume>276</volume>:<fpage>127459</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.micres.2023.127459</pub-id>, PMID: <pub-id pub-id-type="pmid">37557061</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Iengar</surname> <given-names>P.</given-names></name> <name><surname>Joshi</surname> <given-names>N. V.</given-names></name> <name><surname>Balaram</surname> <given-names>P.</given-names></name></person-group> (<year>2006</year>). <article-title>Conformational and sequence signatures in &#x03B2; Helix proteins</article-title>. <source>Structure</source> <volume>14</volume>, <fpage>529</fpage>&#x2013;<lpage>542</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.str.2005.11.021</pub-id>, PMID: <pub-id pub-id-type="pmid">16531237</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jackson</surname> <given-names>L. P.</given-names></name> <name><surname>Lewis</surname> <given-names>M.</given-names></name> <name><surname>Kent</surname> <given-names>H. M.</given-names></name> <name><surname>Edeling</surname> <given-names>M. A.</given-names></name> <name><surname>Evans</surname> <given-names>P. R.</given-names></name> <name><surname>Duden</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Molecular basis for recognition of dilysine trafficking motifs by COPI</article-title>. <source>Dev. Cell</source> <volume>23</volume>, <fpage>1255</fpage>&#x2013;<lpage>1262</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.devcel.2012.10.017</pub-id>, PMID: <pub-id pub-id-type="pmid">23177648</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jenkins</surname> <given-names>J.</given-names></name> <name><surname>Pickersgill</surname> <given-names>R.</given-names></name></person-group> (<year>2001</year>). <article-title>The architecture of parallel &#x03B2;-helices and related folds</article-title>. <source>Prog. Biophys. Mol. Biol.</source> <volume>77</volume>, <fpage>111</fpage>&#x2013;<lpage>175</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0079-6107(01)00013-X</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jumper</surname> <given-names>J.</given-names></name> <name><surname>Evans</surname> <given-names>R.</given-names></name> <name><surname>Pritzel</surname> <given-names>A.</given-names></name> <name><surname>Green</surname> <given-names>T.</given-names></name> <name><surname>Figurnov</surname> <given-names>M.</given-names></name> <name><surname>Ronneberger</surname> <given-names>O.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Highly accurate protein structure prediction with AlphaFold</article-title>. <source>Nature</source> <volume>596</volume>, <fpage>583</fpage>&#x2013;<lpage>589</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41586-021-03819-2</pub-id>, PMID: <pub-id pub-id-type="pmid">34265844</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krishnakumar</surname> <given-names>S. S.</given-names></name> <name><surname>London</surname> <given-names>E.</given-names></name></person-group> (<year>2007</year>). <article-title>Effect of sequence hydrophobicity and bilayer width upon the minimum length required for the formation of transmembrane helices in membranes</article-title>. <source>J. Mol. Biol.</source> <volume>374</volume>, <fpage>671</fpage>&#x2013;<lpage>687</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jmb.2007.09.037</pub-id>, PMID: <pub-id pub-id-type="pmid">17950311</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kyte</surname> <given-names>J.</given-names></name> <name><surname>Doolittle</surname> <given-names>R. F.</given-names></name></person-group> (<year>1982</year>). <article-title>A simple method for displaying the hydropathic character of a protein</article-title>. <source>J. Mol. Biol.</source> <volume>157</volume>, <fpage>105</fpage>&#x2013;<lpage>132</lpage>. doi: <pub-id pub-id-type="doi">10.1016/0022-2836(82)90515-0</pub-id>, PMID: <pub-id pub-id-type="pmid">7108955</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Larkin</surname> <given-names>M. A.</given-names></name> <name><surname>Blackshields</surname> <given-names>G.</given-names></name> <name><surname>Brown</surname> <given-names>N. P.</given-names></name> <name><surname>Chenna</surname> <given-names>R.</given-names></name> <name><surname>McGettigan</surname> <given-names>P. A.</given-names></name> <name><surname>McWilliam</surname> <given-names>H.</given-names></name> <etal/></person-group>. (<year>2007</year>). <article-title>Clustal W and Clustal X version 2.0</article-title>. <source>Bioinformatics</source> <volume>23</volume>, <fpage>2947</fpage>&#x2013;<lpage>2948</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btm404</pub-id>, PMID: <pub-id pub-id-type="pmid">17846036</pub-id></citation></ref>
<ref id="ref29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leahy</surname> <given-names>S. C.</given-names></name> <name><surname>Kelly</surname> <given-names>W. J.</given-names></name> <name><surname>Li</surname> <given-names>D.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Altermann</surname> <given-names>E.</given-names></name> <name><surname>Lambie</surname> <given-names>S. C.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>The complete genome sequence of <italic>Methanobrevibacter</italic> sp. AbM4</article-title>. <source>Stand. Genomic Sci.</source> <volume>8</volume>, <fpage>215</fpage>&#x2013;<lpage>227</lpage>. doi: <pub-id pub-id-type="doi">10.4056/sigs.3977691</pub-id>, PMID: <pub-id pub-id-type="pmid">23991254</pub-id></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Letunic</surname> <given-names>I.</given-names></name> <name><surname>Bork</surname> <given-names>P.</given-names></name></person-group> (<year>2021</year>). <article-title>Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation</article-title>. <source>Nucleic Acids Res.</source> <volume>49</volume>, <fpage>W293</fpage>&#x2013;<lpage>W296</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkab301</pub-id>, PMID: <pub-id pub-id-type="pmid">33885785</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Leahy</surname> <given-names>S. C.</given-names></name> <name><surname>Jeyanathan</surname> <given-names>J.</given-names></name> <name><surname>Henderson</surname> <given-names>G.</given-names></name> <name><surname>Cox</surname> <given-names>F.</given-names></name> <name><surname>Altermann</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>The complete genome sequence of the methanogenic archaeon ISO4-H5 provides insights into the methylotrophic lifestyle of a ruminal representative of the Methanomassiliicoccales</article-title>. <source>Stand. Genomic Sci.</source> <volume>11</volume>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>. doi: <pub-id pub-id-type="doi">10.1186/s40793-016-0183-5</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mistry</surname> <given-names>J.</given-names></name> <name><surname>Chuguransky</surname> <given-names>S.</given-names></name> <name><surname>Williams</surname> <given-names>L.</given-names></name> <name><surname>Qureshi</surname> <given-names>M.</given-names></name> <name><surname>Salazar</surname> <given-names>G. A.</given-names></name> <name><surname>Sonnhammer</surname> <given-names>E. L. L.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Pfam: the protein families database in 2021</article-title>. <source>Nucleic Acids Res.</source> <volume>49</volume>, <fpage>D412</fpage>&#x2013;<lpage>D419</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkaa913</pub-id></citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Monzon</surname> <given-names>V.</given-names></name> <name><surname>Bateman</surname> <given-names>A.</given-names></name></person-group> (<year>2022</year>). <article-title>Large-scale discovery of microbial fibrillar adhesins and identification of novel members of adhesive domain families</article-title>. <source>J. Bacteriol.</source> <volume>204</volume>, <fpage>e00107</fpage>&#x2013;<lpage>e00122</lpage>. doi: <pub-id pub-id-type="doi">10.1128/jb.00107-22</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Monzon</surname> <given-names>V.</given-names></name> <name><surname>Lafita</surname> <given-names>A.</given-names></name> <name><surname>Bateman</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Discovery of fibrillar adhesins across bacterial species</article-title>. <source>BMC Genomics</source> <volume>22</volume>:<fpage>550</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12864-021-07586-2</pub-id>, PMID: <pub-id pub-id-type="pmid">34275445</pub-id></citation></ref>
<ref id="ref35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ng</surname> <given-names>F.</given-names></name> <name><surname>Kittelmann</surname> <given-names>S.</given-names></name> <name><surname>Patchett</surname> <given-names>M. L.</given-names></name> <name><surname>Attwood</surname> <given-names>G. T.</given-names></name> <name><surname>Janssen</surname> <given-names>P. H.</given-names></name> <name><surname>Rakonjac</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>An adhesin from hydrogen-utilizing rumen methanogen <italic>Methanobrevibacter ruminantium</italic> M1 binds a broad range of hydrogen-producing microorganisms</article-title>. <source>Environ. Microbiol.</source> <volume>18</volume>, <fpage>3010</fpage>&#x2013;<lpage>3021</lpage>. doi: <pub-id pub-id-type="doi">10.1111/1462-2920.13155</pub-id>, PMID: <pub-id pub-id-type="pmid">26643468</pub-id></citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paysan-Lafosse</surname> <given-names>T.</given-names></name> <name><surname>Blum</surname> <given-names>M.</given-names></name> <name><surname>Chuguransky</surname> <given-names>S.</given-names></name> <name><surname>Grego</surname> <given-names>T.</given-names></name> <name><surname>L&#x00E1;zaro Pinto</surname> <given-names>B.</given-names></name> <name><surname>Salazar</surname> <given-names>G. A.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>InterPro in 2022</article-title>. <source>Nucleic Acids Res.</source> <volume>51</volume>, <fpage>D418</fpage>&#x2013;<lpage>D427</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkac993</pub-id></citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pettersen</surname> <given-names>E. F.</given-names></name> <name><surname>Goddard</surname> <given-names>T. D.</given-names></name> <name><surname>Huang</surname> <given-names>C. C.</given-names></name> <name><surname>Meng</surname> <given-names>E. C.</given-names></name> <name><surname>Couch</surname> <given-names>G. S.</given-names></name> <name><surname>Croll</surname> <given-names>T. I.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>UCSF ChimeraX: structure visualization for researchers, educators, and developers</article-title>. <source>Protein Sci.</source> <volume>30</volume>, <fpage>70</fpage>&#x2013;<lpage>82</lpage>. doi: <pub-id pub-id-type="doi">10.1002/pro.3943</pub-id>, PMID: <pub-id pub-id-type="pmid">32881101</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poehlein</surname> <given-names>A.</given-names></name> <name><surname>Daniel</surname> <given-names>R.</given-names></name> <name><surname>Seedorf</surname> <given-names>H.</given-names></name></person-group> (<year>2017</year>). <article-title>The draft genome of the non-host-associated <italic>Methanobrevibacter arboriphilus</italic> strain DH1 encodes a large repertoire of adhesin-like proteins</article-title>. <source>Archaea</source> <volume>2017</volume>, <fpage>1</fpage>&#x2013;<lpage>9</lpage>. doi: <pub-id pub-id-type="doi">10.1155/2017/4097425</pub-id>, PMID: <pub-id pub-id-type="pmid">28634433</pub-id></citation></ref>
<ref id="ref39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poehlein</surname> <given-names>A.</given-names></name> <name><surname>Seedorf</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <article-title>Draft genome sequences of <italic>Methanobrevibacter curvatus</italic> DSM11111, <italic>Methanobrevibacter cuticularis</italic> DSM11139, <italic>Methanobrevibacter filiformis</italic> DSM11501, and <italic>Methanobrevibacter oralis</italic> DSM7256</article-title>. <source>Genome Announc.</source> <volume>4</volume>:<fpage>e00617-16</fpage>. doi: <pub-id pub-id-type="doi">10.1128/genomea.00617-16</pub-id></citation></ref>
<ref id="ref40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saitou</surname> <given-names>N.</given-names></name> <name><surname>Nei</surname> <given-names>M.</given-names></name></person-group> (<year>1987</year>). <article-title>The neighbor-joining method: a new method for reconstructing phylogenetic trees</article-title>. <source>Mol. Biol. Evol.</source> <volume>4</volume>, <fpage>406</fpage>&#x2013;<lpage>425</lpage>, PMID: <pub-id pub-id-type="pmid">3447015</pub-id></citation></ref>
<ref id="ref41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Samuel</surname> <given-names>B. S.</given-names></name> <name><surname>Hansen</surname> <given-names>E. E.</given-names></name> <name><surname>Manchester</surname> <given-names>J. K.</given-names></name> <name><surname>Coutinho</surname> <given-names>P. M.</given-names></name> <name><surname>Henrissat</surname> <given-names>B.</given-names></name> <name><surname>Fulton</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2007</year>). <article-title>Genomic and metabolic adaptations of <italic>Methanobrevibacter smithii</italic> to the human gut</article-title>. <source>Proc. Natl. Acad. Sci. USA</source> <volume>104</volume>, <fpage>10643</fpage>&#x2013;<lpage>10648</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.0704189104</pub-id>, PMID: <pub-id pub-id-type="pmid">17563350</pub-id></citation></ref>
<ref id="ref42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sievers</surname> <given-names>F.</given-names></name> <name><surname>Wilm</surname> <given-names>A.</given-names></name> <name><surname>Dineen</surname> <given-names>D.</given-names></name> <name><surname>Gibson</surname> <given-names>T. J.</given-names></name> <name><surname>Karplus</surname> <given-names>K.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega</article-title>. <source>Mol. Syst. Biol.</source> <volume>7</volume>:<fpage>539</fpage>. doi: <pub-id pub-id-type="doi">10.1038/msb.2011.75</pub-id></citation></ref>
<ref id="ref43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Suits</surname> <given-names>M. D.</given-names></name> <name><surname>Boraston</surname> <given-names>A. B.</given-names></name></person-group> (<year>2013</year>). <article-title>Structure of the <italic>Streptococcus pneumoniae</italic> surface protein and adhesin PfbA</article-title>. <source>PLoS One</source> <volume>8</volume>:<fpage>e67190</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0067190</pub-id>, PMID: <pub-id pub-id-type="pmid">23894284</pub-id></citation></ref>
<ref id="ref44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thauer</surname> <given-names>R. K.</given-names></name> <name><surname>Kaster</surname> <given-names>A. K.</given-names></name> <name><surname>Seedorf</surname> <given-names>H.</given-names></name> <name><surname>Buckel</surname> <given-names>W.</given-names></name> <name><surname>Hedderich</surname> <given-names>R.</given-names></name></person-group> (<year>2008</year>). <article-title>Methanogenic archaea: ecologically relevant differences in energy conservation</article-title>. <source>Nat. Rev. Microbiol.</source> <volume>6</volume>, <fpage>579</fpage>&#x2013;<lpage>591</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nrmicro1931</pub-id>, PMID: <pub-id pub-id-type="pmid">18587410</pub-id></citation></ref>
<ref id="ref45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thompson</surname> <given-names>J. D.</given-names></name> <name><surname>Gibson</surname> <given-names>T. J.</given-names></name> <name><surname>Plewniak</surname> <given-names>F.</given-names></name> <name><surname>Jeanmougin</surname> <given-names>F.</given-names></name> <name><surname>Higgins</surname> <given-names>D. G.</given-names></name></person-group> (<year>1997</year>). <article-title>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools</article-title>. <source>Nucleic Acids Res.</source> <volume>25</volume>, <fpage>4876</fpage>&#x2013;<lpage>4882</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/25.24.4876</pub-id>, PMID: <pub-id pub-id-type="pmid">9396791</pub-id></citation></ref>
<ref id="ref46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Varadi</surname> <given-names>M.</given-names></name> <name><surname>Anyango</surname> <given-names>S.</given-names></name> <name><surname>Deshpande</surname> <given-names>M.</given-names></name> <name><surname>Nair</surname> <given-names>S.</given-names></name> <name><surname>Natassia</surname> <given-names>C.</given-names></name> <name><surname>Yordanova</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>AlphaFold protein structure database: massively expanding the structural coverage of protein-sequence space with high-accuracy models</article-title>. <source>Nucleic Acids Res.</source> <volume>50</volume>, <fpage>D439</fpage>&#x2013;<lpage>D444</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gkab1061</pub-id></citation></ref>
<ref id="ref47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Villarreal</surname> <given-names>F.</given-names></name> <name><surname>Stocchi</surname> <given-names>N.</given-names></name> <name><surname>Ten Have</surname> <given-names>A.</given-names></name></person-group> (<year>2022</year>). <article-title>Functional classification and characterization of the fungal glycoside hydrolase 28 protein family</article-title>. <source>J. Fungi</source> <volume>8</volume>:<fpage>217</fpage>. doi: <pub-id pub-id-type="doi">10.3390/jof8030217</pub-id></citation></ref>
<ref id="ref48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weigele</surname> <given-names>P. R.</given-names></name> <name><surname>Scanlon</surname> <given-names>E.</given-names></name> <name><surname>King</surname> <given-names>J.</given-names></name></person-group> (<year>2003</year>). <article-title>Homotrimeric, beta-stranded viral adhesins and tail proteins</article-title>. <source>J. Bacteriol.</source> <volume>185</volume>, <fpage>4022</fpage>&#x2013;<lpage>4030</lpage>. doi: <pub-id pub-id-type="doi">10.1128/JB.185.14.4022-4030.2003</pub-id>, PMID: <pub-id pub-id-type="pmid">12837775</pub-id></citation></ref>
<ref id="ref49"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Weinberger</surname> <given-names>V.</given-names></name> <name><surname>Darnhofer</surname> <given-names>B.</given-names></name> <name><surname>Mertelj</surname> <given-names>P.</given-names></name> <name><surname>Stentz</surname> <given-names>R.</given-names></name> <name><surname>Thapa</surname> <given-names>H. B.</given-names></name> <name><surname>Jones</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2024</year>). Proteomic and metabolomic profiling of archaeal extracellular vesicles from the human gut. bioRxiv [Preprint]. doi: <pub-id pub-id-type="doi">10.1101/2024.06.22.600174</pub-id></citation></ref>
<ref id="ref50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weiss</surname> <given-names>T. M.</given-names></name> <name><surname>van der Wel</surname> <given-names>P. C. A.</given-names></name> <name><surname>Killian</surname> <given-names>J. A.</given-names></name> <name><surname>Koeppe</surname> <given-names>R. E.</given-names> <suffix>II</suffix></name> <name><surname>Huang</surname> <given-names>H. W.</given-names></name></person-group> (<year>2003</year>). <article-title>Hydrophobic mismatch between helices and lipid bilayers</article-title>. <source>Biophys. J.</source> <volume>84</volume>, <fpage>379</fpage>&#x2013;<lpage>385</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0006-3495(03)74858-9</pub-id>, PMID: <pub-id pub-id-type="pmid">12524291</pub-id></citation></ref>
<ref id="ref51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yoder</surname> <given-names>M. D.</given-names></name> <name><surname>Keen</surname> <given-names>N. T.</given-names></name> <name><surname>Jurnak</surname> <given-names>F.</given-names></name></person-group> (<year>1993</year>). <article-title>New domain motif: the structure of pectate lyase C, a secreted plant virulence factor</article-title>. <source>Science</source> <volume>260</volume>, <fpage>1503</fpage>&#x2013;<lpage>1507</lpage>. doi: <pub-id pub-id-type="doi">10.1126/science.8502994</pub-id>, PMID: <pub-id pub-id-type="pmid">8502994</pub-id></citation></ref>
</ref-list>
</back>
</article>