<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2017.01050</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Huang</surname> <given-names>Yuan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/447644/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Jun</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/361118/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Yang</surname> <given-names>Yongping</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/371290/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Fan</surname> <given-names>Chuanzhu</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x002A;</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Chen</surname> <given-names>Jiahui</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/368046/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Key Laboratory for Plant Diversity and Biogeography of East Asia, Chinese Academy of Sciences</institution> <country>Kunming, China</country></aff>
<aff id="aff2"><sup>2</sup><institution>School of Life Sciences, Yunnan Normal University</institution> <country>Kunming, China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Biological Sciences, Wayne State University, Detroit</institution> <country>MI, United States</country></aff>
<aff id="aff4"><sup>4</sup><institution>Institute of Tibetan Plateau Research at Kunming, Kunming Institute of Botany, Chinese Academy of Sciences</institution> <country>Kunming, China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: <italic>Fulvio Cruciani, Sapienza Universit&#x00E0; di Roma, Italy</italic></p></fn>
<fn fn-type="edited-by"><p>Reviewed by: <italic>Tae-Jin Yang, Seoul National University, South Korea; Yuanhu Xuan, Shenyang Agricultural University, China; Angelica Cibrian, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Mexico</italic></p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x002A;Correspondence: <italic>Jiahui Chen, <email>chenjh@mail.kib.ac.cn</email> Yongping Yang, <email>yangyp@mail.kib.ac.cn</email> Chuanzhu Fan, <email>cfan@wayne.edu</email></italic></p></fn>
<fn fn-type="other" id="fn002"><p><italic><sup>&#x2020;</sup>These authors have contributed equally to this work.</italic></p></fn>
<fn fn-type="other" id="fn003"><p>This article was submitted to Plant Genetics and Genomics, a section of the journal Frontiers in Plant Science</p></fn></author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>06</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="collection">
<year>2017</year>
</pub-date>
<volume>08</volume>
<elocation-id>1050</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>08</month>
<year>2016</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>05</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2017 Huang, Wang, Yang, Fan and Chen.</copyright-statement>
<copyright-year>2017</copyright-year>
<copyright-holder>Huang, Wang, Yang, Fan and Chen</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of whole chloroplast genomes provides many more informative DNA sites and thus higher resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three <italic>Salix</italic> species in the family Salicaceae. The phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but is resolved with higher statistical support. Phylogenetic incongruences, however, are observed in the genus <italic>Populus</italic>, and most likely result from homoplasy. By comparing the three <italic>Salix</italic> chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes have been lost or have become pseudogenes, implying that these genes were horizontally transferred to the nuclear genome. Based on the complete nuclear genome sequences of two Salicaceae species, we find that the entire chloroplast genome has been transferred to and integrated into the nuclear genome at least once in the individual used for the reference genome of <italic>P. trichocarpa</italic>. This observation, along with the presence of large nuclear plastid DNAs (NUPTs) and of NUPTs containing multiple chloroplast genes in their original chloroplast order, favors the DNA-mediated hypothesis of organelle-to-nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae.
The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in Salicaceae provides resources to better understand the successful adaptation of Salicaceae species.</p>
</abstract>
<kwd-group>
<kwd>chloroplast genome</kwd>
<kwd>phylogenomics</kwd>
<kwd>phylogenetic incongruence</kwd>
<kwd>NUPT</kwd>
<kwd>evolution</kwd>
<kwd>organellar horizontal gene transfer</kwd>
<kwd>Salicaceae</kwd>
</kwd-group>
<contract-num rid="cn001">31590820</contract-num>
<contract-num rid="cn001">31590823</contract-num>
<contract-num rid="cn001">31270271</contract-num>
<contract-num rid="cn001">31670198</contract-num>
<contract-num rid="cn001">31560062</contract-num>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="66"/>
<page-count count="13"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec><title>Introduction</title>
<p>The chloroplast (cp) is the photosynthetic organelle that provides energy for plants and algae. It is believed that chloroplasts arose from endosymbiosis between a photosynthetic bacterium and a non-photosynthetic host (<xref ref-type="bibr" rid="B14">Dyall et al., 2004</xref>). The chloroplast has its own genome, which is generally non-recombinant and uniparentally inherited (<xref ref-type="bibr" rid="B4">Birky, 1995</xref>). Chloroplast genes are involved in major functions, including sugar synthesis, starch storage, and the production of several amino acids, lipids, vitamins, and pigments. They are also involved in key sulfur and nitrogen metabolic pathways. In angiosperms, most cp genomes are composed of circular DNA molecules ranging from 120 to 160 kb in length and have a quadripartite organization that includes two copies of inverted repeats (IRs) of about 20&#x2013;28 kb in size. These IRs divide the rest of the cp genome into an 80&#x2013;90 kb Large Single Copy (LSC) region and a 16&#x2013;27 kb Small Single Copy (SSC) region (<xref ref-type="bibr" rid="B26">Jansen et al., 2005</xref>). The gene content and gene order of angiosperm cp genomes are generally conserved; these genomes encode four rRNAs, 30 tRNAs, and about 80 unique proteins (<xref ref-type="bibr" rid="B9">Chumley et al., 2006</xref>).</p>
<p>Chloroplast-derived DNA sequences have been widely used for phylogenetic studies, and complete cp genome sequences can provide valuable data sets for resolving complex evolutionary relationships (<xref ref-type="bibr" rid="B25">Jansen et al., 2007</xref>; <xref ref-type="bibr" rid="B38">Moore et al., 2010</xref>). However, acquiring large coverage of cp genomes has typically been limited by conventional DNA sequencing technology. Because Next-Generation Sequencing technologies have revolutionized DNA sequencing (<xref ref-type="bibr" rid="B47">Shendure and Ji, 2008</xref>), it is now more convenient to obtain complete cp genome sequences at low cost, to extend gene-based phylogenetics to genome-based phylogenomics, and to examine the phylogeny and evolutionary events of plant species using complete cp genome sequences.</p>
<p>Salicaceae s.str. consists of two genera (<xref ref-type="bibr" rid="B39">Ohashi, 2001</xref>): <italic>Salix</italic> with about 450&#x2013;580 species (<xref ref-type="bibr" rid="B17">Fang et al., 1999</xref>; <xref ref-type="bibr" rid="B2">Argus et al., 2010</xref>), and <italic>Populus</italic> with about 30 species (<xref ref-type="bibr" rid="B2">Argus et al., 2010</xref>). Species of Salicaceae are distributed worldwide, except in Oceania and Antarctica. They are mostly found in the Northern Temperate Zone and are one of the main groups of trees and shrubs in those areas (<xref ref-type="bibr" rid="B48">Skvortsov, 1999</xref>; <xref ref-type="bibr" rid="B2">Argus et al., 2010</xref>). Plants of Salicaceae are often grown for ornament, shelterbelts, timber, pulp, and specialty wood products. Some shrub species of <italic>Salix</italic> are considered highly suitable as bioenergy crops (<xref ref-type="bibr" rid="B27">Karp and Shield, 2008</xref>).</p>
<p>Because of dioecious reproduction, simple flowers, common natural hybridization, and large intraspecific phenotypic variation, both the taxonomy and the systematics of Salicaceae based on morphology, especially in <italic>Salix</italic>, have been extremely difficult to resolve (<xref ref-type="bibr" rid="B48">Skvortsov, 1999</xref>; <xref ref-type="bibr" rid="B2">Argus et al., 2010</xref>). Molecular methods (e.g., molecular marker techniques, molecular phylogenetics and DNA barcoding) provide effective information for taxonomy, species identification, and phylogenetics of Salicaceae. However, previous molecular systematic analyses revealed that phylogenies of Salicaceae based on single or a few genetic markers resolve relationships at the generic or subgeneric level but have limited or almost no resolution at the infra-subgeneric level, specifically in the subgenera <italic>Chamaetia</italic>-<italic>Vetrix</italic> clade (<xref ref-type="bibr" rid="B31">Leskinen and Alstrom-Rapaport, 1999</xref>; <xref ref-type="bibr" rid="B20">Hamzeh and Dayanandan, 2004</xref>; <xref ref-type="bibr" rid="B7">Chen et al., 2008</xref>; <xref ref-type="bibr" rid="B21">Hardig et al., 2010</xref>). Therefore, DNA markers with higher resolution for phylogenetic analysis of unresolved lineages remain to be examined in Salicaceae.</p>
<p>Here, we report the complete cp genome sequences of three <italic>Salix</italic> species and further integrate the 11 available cp genomes of Salicaceae. Hence, all of the main lineages of Salicaceae are represented in this study. The questions that we address in this study are: (1) What are potential DNA markers in cp genomes that can be used for phylogenetic analysis of Salicaceae? (2) What are the phylogenetic relationships within Salicaceae based on phylogenomic analysis of complete Salicaceae cp genomes? (3) What are the structures and contents of cp genomes in Salicaceae? and (4) What evolutionary and dynamic patterns of cp genomes are revealed by examining the evolution of cp genes and horizontal DNA transfer events from the chloroplast to the nucleus?</p>
</sec>
<sec id="s1" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec><title>Plant Materials</title>
<p>Three species, <italic>S. tetrasperma</italic>, <italic>S. babylonica</italic>, and <italic>S. oreinoma</italic>, representing two subgenera of the genus <italic>Salix</italic>, were sampled. We collected healthy, tender, and fresh leaves from adult plants of the target species. The voucher herbarium specimens for the three sampled <italic>Salix</italic> species are all deposited in the Herbarium of Kunming Institute of Botany, Chinese Academy of Sciences (KUN) (Supplementary Table <xref ref-type="supplementary-material" rid="SM1">S1</xref>).</p>
</sec>
<sec><title>Chloroplast DNA Extraction, Sequencing, Genome Assembly, and PCR-Based Validation</title>
<p>Total DNA enriched with cp DNA was extracted from 200 g of fresh leaves according to the methods of <xref ref-type="bibr" rid="B26">Jansen et al. (2005)</xref> and <xref ref-type="bibr" rid="B66">Zhang et al. (2011)</xref>. Purified DNA (5 mg) was fragmented and used to construct short-insert libraries according to the manufacturer&#x2019;s manual (Illumina Inc., San Diego, CA, United States). DNA from the different individuals was indexed by tags and pooled together in one lane of Illumina&#x2019;s Genome Analyzer for sequencing.</p>
<p>We filtered out non-cp DNA reads from the raw sequences based on the known cp genome sequences. Next, the filtered reads were used to <italic>de novo</italic> assemble the cp genomes with SOAPdenovo software, which is specially designed to assemble Illumina short reads. The SOAPdenovo pipeline (e.g., <italic>k</italic> = 31 bp and scaffolding contigs with a minimum size of 100 bp) can carry out accurate analyses of unexplored genomes, resolve repeat regions in contig assembly, and improve gap closing in a cost-effective way (<xref ref-type="bibr" rid="B32">Luo et al., 2012</xref>). Then, all contigs were mapped to the reference cp genome of <italic>P. trichocarpa</italic> using BLAST<sup><xref ref-type="fn" rid="fn01">1</xref></sup> search from NCBI with default parameters. The order of the aligned contigs was determined according to the reference genome. Gaps between the <italic>de novo</italic> contigs were filled with consensus sequences of raw reads mapped to the reference genomes.</p>
<p>Based on the reference genomes, each of the four junctions between the LSC/IRs and SSC/IRs was confirmed by sequencing of PCR products. To avoid assembly errors and to obtain high-quality complete cp genome sequences, the assembly was also validated with intensive PCR-based sequencing. We designed 39 pairs of primers (see Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S2</xref> for details) based on the variable regions of the eight preliminary cp genome assemblies. PCR products were sequenced using the BigDye v3.1 Terminator Kit for ABI 3730xl (Life Technologies, Carlsbad, CA, United States). Sanger sequences and assembled genomes were aligned using Geneious 7 (<xref ref-type="bibr" rid="B28">Kearse et al., 2012</xref>) to determine if there were any differences. The complete cp genome sequences were deposited in GenBank (Supplementary Table <xref ref-type="supplementary-material" rid="SM1">S1</xref>).</p>
</sec>
<sec><title>Chloroplast Genome Annotation</title>
<p>Genome annotation was accomplished using the Dual Organellar Genome Annotator (DOGMA) (<xref ref-type="bibr" rid="B59">Wyman et al., 2004</xref>), and the results were compared with the available complete chloroplast genome of <italic>P. trichocarpa</italic> (GenBank accession number <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="NC009143">NC009143</ext-link>) to annotate the genes encoding proteins, transfer RNAs (tRNAs), and ribosomal RNAs (rRNAs). All of the identified tRNA genes were further verified using the corresponding structures predicted by tRNAscan-SE v1.21 (<xref ref-type="bibr" rid="B45">Schattner et al., 2005</xref>).</p>
</sec>
<sec><title>Phylogenetic and Network Analyses</title>
<p>We performed phylogenetic analyses based on different datasets, i.e., the whole cp genome minus the second inverted repeat region (IRa) to avoid considering the same information twice, concatenated non-coding sequences (including intergenic spacers and introns), and concatenated exons of protein-coding genes (<italic>ndhA</italic> was excluded due to its pseudogenization in <italic>S. babylonica</italic>). Sequences were aligned with MUSCLE [62]. After manual correction in Geneious (<xref ref-type="bibr" rid="B28">Kearse et al., 2012</xref>), phylogenetic analyses were performed based on maximum likelihood (ML) criteria and Bayesian inference (BI). The ML analysis, employing a GTR+G model of substitution for all datasets, was run in RAxML v7.2.8 (<xref ref-type="bibr" rid="B49">Stamatakis, 2006</xref>), and a bootstrap analysis of 2000 replicates was performed simultaneously (option &#x201C;-f a&#x201D;). BI was performed with MrBayes v3.2 (<xref ref-type="bibr" rid="B42">Ronquist et al., 2012</xref>). Two independent Markov chain Monte Carlo (MCMC) runs were performed, each with three heated chains and one cold chain, for three million generations. Each run started with a random tree and default priors, and trees were sampled every 100 generations. The GTR+G model was used for all datasets, as suggested by the RAxML manual.</p>
<p>Topological incongruence among conflicting datasets was tested using Dendroscope v. 3.2.8 under the galled network consensus algorithm (<xref ref-type="bibr" rid="B24">Huson and Scornavacca, 2012</xref>). ML trees of concatenated protein coding sequences and whole genome sequence were used to infer phylogenetic networks.</p>
</sec>
<sec><title>Molecular Evolution Analysis</title>
<p>We collected the coding DNA sequence (CDS) of each orthologous gene in eight <italic>Populus</italic> species and six <italic>Salix</italic> species, and aligned them with the translation aligner in Geneious (<xref ref-type="bibr" rid="B28">Kearse et al., 2012</xref>), which can take into account frame shifts and premature stop codons and generates codon-based alignments. We used the branch-site model of PAML to compute the <italic>K</italic>a/<italic>K</italic>s ratio for orthologous genes in each external and internal branch of the phylogenetic trees that were generated from the protein-coding sequence alignment with the ML method. We tested two branch-site models (with the parameters model = 2 and NSsites = 2): &#x201C;model 1&#x201D;, with both the branch-site-specific <italic>K</italic>a/<italic>K</italic>s and the background <italic>K</italic>a/<italic>K</italic>s varying freely, and &#x201C;model 2&#x201D;, with the branch-site-specific <italic>K</italic>a/<italic>K</italic>s fixed at 1 and the background <italic>K</italic>a/<italic>K</italic>s varying freely (<xref ref-type="bibr" rid="B63">Yang, 2007</xref>). We then performed the Likelihood Ratio Test (LRT), which tests whether the likelihood of &#x201C;model 1&#x201D; is significantly different from that of &#x201C;model 2&#x201D; using two times the log-likelihood difference as the test statistic. We computed <italic>p</italic>-values using a chi-square distribution with one degree of freedom (<xref ref-type="bibr" rid="B62">Yang, 1998</xref>).</p>
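<p>The likelihood ratio test described above can be illustrated with a minimal sketch. The function name and the log-likelihood values in the usage lines are hypothetical placeholders, not results from this study; the sketch only shows how two times the log-likelihood difference maps to a <italic>p</italic>-value under a chi-square distribution with one degree of freedom.</p>
<preformat><![CDATA[
```python
import math


def lrt_p_value(lnl_alt, lnl_null):
    """P-value of a likelihood ratio test with one degree of freedom.

    lnl_alt  -- log likelihood of the model with free foreground Ka/Ks
    lnl_null -- log likelihood of the model with foreground Ka/Ks fixed at 1
    """
    # Test statistic: two times the log-likelihood difference (floored at 0).
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # With 1 degree of freedom the chi-square upper tail has a closed form:
    # P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log likelihoods, for illustration only.
p = lrt_p_value(lnl_alt=-1234.56, lnl_null=-1237.90)
print(f"p = {p:.4f}")
```
]]></preformat>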
</sec>
<sec><title>Analysis of Chloroplast-Nuclear DNA Transfer</title>
<p>We used the pipelines developed by the UCSC genome browser to search for regions of the nuclear genome homologous to the chloroplast genome in <italic>P. trichocarpa</italic><sup><xref ref-type="fn" rid="fn02">2</xref></sup> and <italic>S. purpurea</italic><sup><xref ref-type="fn" rid="fn03">3</xref></sup>, respectively, since nuclear genome sequences are available only for these two species (<xref ref-type="bibr" rid="B53">Tuskan et al., 2006</xref>). We first aligned the chloroplast genome of each species to its nuclear reference genome with LASTZ (<xref ref-type="bibr" rid="B22">Harris, 2007</xref>). We then transformed the &#x201C;lav&#x201D; output format of LASTZ to &#x201C;axt&#x201D; format using lavToAxt. Finally, we chained the &#x201C;axt&#x201D; files using axtChain and generated chain format outputs (<xref ref-type="bibr" rid="B29">Kent et al., 2003</xref>; <xref ref-type="bibr" rid="B46">Schwartz et al., 2003</xref>).</p>
<p>Although we did not set the identity filter of LASTZ for blocks or HSPs (high-scoring segment pairs), we computed the identity of the final homologous sequence pairs generated by axtChain. The identity ranges from 60 to 99%. Based on the chain file, we generated the homologous regions between the chloroplast genome and nuclear genome, which could represent chloroplast-nuclear DNA transfer events.</p>
</sec>
<sec><title>Monte Carlo Sampling Testing</title>
<p>To test the hypothesis that the whole chloroplast genome was horizontally transferred to the nucleus in <italic>P. trichocarpa</italic>, we conducted a Monte Carlo sampling test with 100,000 simulations. Such a test compares the observed data with random samples generated in accordance with the hypothesis being tested. The significance of the Monte Carlo sampling test is determined by the rank of the test criterion of the observed data relative to the test criteria of the random samples composing the reference set (<xref ref-type="bibr" rid="B23">Hope, 1968</xref>). Specifically, we assumed that the whole chloroplast genome was split into eight fragments, which were inferred from the alignment between the chloroplast genome and the nuclear genome of <italic>P. trichocarpa</italic>. For each simulation, we randomly sampled the insertion locations of the eight fragments in the whole <italic>P. trichocarpa</italic> genome. We then counted the number of simulations in which the five fragments (i.e., the 1st, 3rd, 5th, 7th, and 8th fragments) were inserted into the same chromosome within the range of the length of chromosome 13 and were arranged according to their original order in the chloroplast genome. Next, we divided this number by the number of simulations (i.e., 100,000). This proportion is treated as the <italic>P</italic>-value for assessing the significance of the tested hypothesis.</p>
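<p>The sampling procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the script used in this study: the chromosome lengths are placeholders, insertion points are drawn uniformly over the genome, both forward and reverse collinear orders are counted, and the additional constraint tied to the length of chromosome 13 is omitted. The function name and its parameters are hypothetical.</p>
<preformat><![CDATA[
```python
import random
from bisect import bisect_right
from itertools import accumulate


def simulate_p_value(chrom_lengths, n_fragments=8, tracked=(0, 2, 4, 6, 7),
                     n_sim=100_000, seed=0):
    """Fraction of simulations in which the tracked fragments land on one
    chromosome and keep their chloroplast order (forward or reverse)."""
    rng = random.Random(seed)
    ends = list(accumulate(chrom_lengths))  # cumulative chromosome ends
    genome_size = ends[-1]
    hits = 0
    for _ in range(n_sim):
        # One uniformly random insertion point per fragment (genome-wide coordinate).
        sites = [rng.randrange(genome_size) for _ in range(n_fragments)]
        chosen = [sites[i] for i in tracked]
        # Map each coordinate to the index of the chromosome that contains it.
        chroms = {bisect_right(ends, s) for s in chosen}
        if len(chroms) != 1:
            continue
        ascending = all(a < b for a, b in zip(chosen, chosen[1:]))
        descending = all(a > b for a, b in zip(chosen, chosen[1:]))
        if ascending or descending:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # Placeholder lengths for 19 chromosomes; not the real assembly sizes.
    placeholder_lengths = [20_000_000] * 19
    print(simulate_p_value(placeholder_lengths, n_sim=100_000))
```
]]></preformat>
<p>With equal-length placeholder chromosomes the estimated proportion is very small, because five independent insertions rarely share a chromosome and an order; real chromosome lengths would be needed to reproduce the published test.</p>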
</sec>
</sec>
<sec><title>Results</title>
<sec><title>Genome Assembly and PCR-Based Validation</title>
<p>Using the Illumina HiSeq 2000 system, we obtained 1,392,310, 3,779,094, and 4,595,286 paired-end clean reads (average read length 91 bp) for <italic>S. babylonica</italic>, <italic>S. tetrasperma</italic>, and <italic>S. oreinoma</italic>, respectively. We mapped these sequence reads to the reference cp genome of <italic>S. purpurea</italic> and achieved at least 1600&#x00D7; coverage of the cp genome (1615&#x00D7; for <italic>S. babylonica</italic>, 4384&#x00D7; for <italic>S. tetrasperma</italic>, 5330&#x00D7; for <italic>S. oreinoma</italic>). Based on <italic>de novo</italic> and reference-guided assembly, we obtained the complete cp genome for <italic>S. tetrasperma</italic>. The assemblies of the other two cp genomes contained seven to eight gaps, which we filled using PCR-based sequencing.</p>
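<p>As a rough plausibility check on the coverage figures above, average depth can be approximated as the total number of sequenced bases divided by the genome size. The sketch below rests on two assumptions that are not stated in the text: that each reported count refers to read pairs, and that a rounded genome size of 155,500 bp applies to all three species. It gives an upper bound, since it treats every read as mapping to the cp genome. The function name is hypothetical.</p>
<preformat><![CDATA[
```python
def mean_depth(n_reads, read_length, genome_size):
    """Expected average sequencing depth if every read maps to the genome."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Counts and read length are taken from the text; treating each count as
# read pairs (two reads each) is an assumption.
pairs = {
    "S. babylonica": 1_392_310,
    "S. tetrasperma": 3_779_094,
    "S. oreinoma": 4_595_286,
}
approx_genome_size = 155_500  # bp, rounded assumption for all three genomes

for species, n_pairs in pairs.items():
    depth = mean_depth(2 * n_pairs, 91, approx_genome_size)
    print(f"{species}: about {depth:.0f}x")
```
]]></preformat>
<p>Under these assumptions the estimates fall slightly above the reported coverages, which is what would be expected if a small fraction of reads did not map to the cp genome.</p>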
<p>The four junction regions of each cp genome were validated using PCR-based sequencing. Furthermore, to correct errors arising from heterogeneous indels in homopolymeric repeats (<xref ref-type="bibr" rid="B37">Moore et al., 2006</xref>; <xref ref-type="bibr" rid="B61">Yang et al., 2010</xref>), we conducted PCR-based validation. We designed 18 pairs of primers based on the variable regions of alignments to validate these sequences in each cp genome (Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S2</xref>). In total, we amplified and sequenced &#x223C; 152 kb from all three <italic>Salix</italic> species. We then compared these sequences directly to the assembled genomes and observed no nucleotide mismatches or indels. This result confirmed the reliability of the assembled chloroplast genome sequences. Finally, we obtained the complete cp genome sequences of <italic>S. babylonica</italic>, <italic>S. tetrasperma</italic>, and <italic>S. oreinoma.</italic></p>
</sec>
<sec><title>Salicaceae Chloroplast Genome Structure and Content</title>
<p>The complete cp genomes of the three <italic>Salix</italic> species sequenced vary from 155,531 to 155,740 bp in size and exhibit a typical circular structure including a pair of IRs (ranging from 27,384 to 27,436 bp) and two single-copy regions (LSC, 84,466&#x2013;84,580 bp; SSC, 15,862&#x2013;16,323 bp). Each of the three cp genomes contains 111 unique genes (110 for <italic>S. babylonica</italic>) (<bold>Figure <xref ref-type="fig" rid="F1">1</xref></bold> and <bold>Table <xref ref-type="table" rid="T1">1</xref></bold>), including 77 unique CDSs (76 for <italic>S. babylonica</italic> because of the pseudogenization of <italic>ndhA</italic>), four unique rRNAs, and 30 unique tRNAs. Seventeen genes contain introns; 14 of them (<italic>atpF</italic>, <italic>ndhA</italic>, <italic>ndhB</italic>, <italic>petB</italic>, <italic>petD</italic>, <italic>rpl2</italic>, <italic>rpl16</italic>, <italic>rpoC1</italic>, <italic>trnA</italic><sup>UGC</sup>, <italic>trnG</italic><sup>GCC</sup>, <italic>trnI</italic><sup>GAU</sup>, <italic>trnK</italic><sup>UUU</sup>, <italic>trnL</italic><sup>UAA</sup>, and <italic>trnV</italic><sup>UAC</sup>) contain one intron and three of them (<italic>clpP</italic>, <italic>rps12</italic>, and <italic>ycf3</italic>) contain two introns. All the CDSs have canonical ATG start codons except <italic>ndhD</italic>, which has GTG as the start codon.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Gene map of the three sequenced <italic>Salix</italic> chloroplast genomes. Genes shown outside the outer circle are transcribed clockwise and those inside are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. Dashed area in the inner circle indicates the GC content of the chloroplast genome of <italic>Salix tetrasperma</italic>.</p></caption>
<graphic xlink:href="fpls-08-01050-g001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Features of chloroplast complete genomes in Salicaceae.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left"></td>
<td valign="top" align="center"></td>
<td valign="top" align="center"></td>
<td valign="top" align="center"></td>
<td valign="top" align="center"></td>
<td valign="top" align="center"></td>
<th valign="top" align="center">Length of</th>
<th valign="top" align="center">Length of</th>
<th valign="top" align="center">Length of</th>
<td valign="top" align="center"></td>
<th valign="top" align="center">No. of</th>
<th valign="top" align="center">No. of</th>
<th valign="top" align="center">No. of</th>
<th valign="top" align="center">No. of</th>
</tr>
<tr>
<td valign="top" align="left"></td>
<th valign="top" align="center">Genome</th>
<th valign="top" align="center">Length</th>
<th valign="top" align="center">Length</th>
<th valign="top" align="center">Length</th>
<th valign="top" align="center">Length</th>
<th valign="top" align="center">coding</th>
<th valign="top" align="center">protein coding</th>
<th valign="top" align="center">non-coding</th>
<th valign="top" align="center">GC</th>
<th valign="top" align="center">unique</th>
<th valign="top" align="center">unique</th>
<th valign="top" align="center">unique</th>
<th valign="top" align="center">unique</th>
</tr>
<tr>
<th valign="top" align="left">Species</th>
<th valign="top" align="center">size</th>
<th valign="top" align="center">of LSC</th>
<th valign="top" align="center">of SSC</th>
<th valign="top" align="center">of IRA</th>
<th valign="top" align="center">of IRB</th>
<th valign="top" align="center">sequence (%)</th>
<th valign="top" align="center">sequence (%)</th>
<th valign="top" align="center">sequence</th>
<th valign="top" align="center">(%)</th>
<th valign="top" align="center">gene</th>
<th valign="top" align="center">CDS</th>
<th valign="top" align="center">tRNA</th>
<th valign="top" align="center">rRNA</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>Salix tetrasperma</italic><sup>#</sup></td>
<td valign="top" align="center">155671</td>
<td valign="top" align="center">84584</td>
<td valign="top" align="center">16291</td>
<td valign="top" align="center">27398</td>
<td valign="top" align="center">27398</td>
<td valign="top" align="center">91593 (58.84)</td>
<td valign="top" align="center">80115 (51.47)</td>
<td valign="top" align="center">64078</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>S. babylonica</italic><sup>#</sup></td>
<td valign="top" align="center">155246</td>
<td valign="top" align="center">84516</td>
<td valign="top" align="center">15830</td>
<td valign="top" align="center">27450</td>
<td valign="top" align="center">27450</td>
<td valign="top" align="center">90483 (58.28)</td>
<td valign="top" align="center">79005 (50.89)</td>
<td valign="top" align="center">64763</td>
<td valign="top" align="center">36.6</td>
<td valign="top" align="center">112</td>
<td valign="top" align="center">76</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>S. oreinoma</italic><sup>#</sup></td>
<td valign="top" align="center">155531</td>
<td valign="top" align="center">84470</td>
<td valign="top" align="center">16213</td>
<td valign="top" align="center">27424</td>
<td valign="top" align="center">27424</td>
<td valign="top" align="center">92026 (59.17)</td>
<td valign="top" align="center">80187 (51.56)</td>
<td valign="top" align="center">63505</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>S. interior</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">156620</td>
<td valign="top" align="center">85980</td>
<td valign="top" align="center">16308</td>
<td valign="top" align="center">27166</td>
<td valign="top" align="center">27166</td>
<td valign="top" align="center">91738 (58.57)</td>
<td valign="top" align="center">80079 (51.13)</td>
<td valign="top" align="center">64882</td>
<td valign="top" align="center">37.0</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>S. purpurea</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">155590</td>
<td valign="top" align="center">84452</td>
<td valign="top" align="center">16220</td>
<td valign="top" align="center">27459</td>
<td valign="top" align="center">27459</td>
<td valign="top" align="center">92014 (59.14)</td>
<td valign="top" align="center">80175 (51.53)</td>
<td valign="top" align="center">63576</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>S. suchowensis</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">155214</td>
<td valign="top" align="center">84077</td>
<td valign="top" align="center">16221</td>
<td valign="top" align="center">27458</td>
<td valign="top" align="center">27458</td>
<td valign="top" align="center">92014 (59.28)</td>
<td valign="top" align="center">80175 (51.65)</td>
<td valign="top" align="center">63200</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>Populus alba</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">156505</td>
<td valign="top" align="center">84618</td>
<td valign="top" align="center">16567</td>
<td valign="top" align="center">27660</td>
<td valign="top" align="center">27660</td>
<td valign="top" align="center">92467 (59.08)</td>
<td valign="top" align="center">80628 (51.52)</td>
<td valign="top" align="center">64038</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. balsamifera</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">157094</td>
<td valign="top" align="center">84922</td>
<td valign="top" align="center">16499</td>
<td valign="top" align="center">27846</td>
<td valign="top" align="center">27827</td>
<td valign="top" align="center">92542 (58.91)</td>
<td valign="top" align="center">80703 (51.37)</td>
<td valign="top" align="center">64552</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. euphratica</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">156766</td>
<td valign="top" align="center">84887</td>
<td valign="top" align="center">16589</td>
<td valign="top" align="center">27644</td>
<td valign="top" align="center">27646</td>
<td valign="top" align="center">92406 (58.95)</td>
<td valign="top" align="center">80568 (51.39)</td>
<td valign="top" align="center">64360</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. fremontii</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">157446</td>
<td valign="top" align="center">85454</td>
<td valign="top" align="center">16318</td>
<td valign="top" align="center">27837</td>
<td valign="top" align="center">27837</td>
<td valign="top" align="center">92575 (58.80)</td>
<td valign="top" align="center">80736 (51.28)</td>
<td valign="top" align="center">64871</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. tremula</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">156067</td>
<td valign="top" align="center">84377</td>
<td valign="top" align="center">16490</td>
<td valign="top" align="center">27600</td>
<td valign="top" align="center">27600</td>
<td valign="top" align="center">92473 (59.25)</td>
<td valign="top" align="center">80634 (51.67)</td>
<td valign="top" align="center">63594</td>
<td valign="top" align="center">36.8</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. yunnanensis</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">155776</td>
<td valign="top" align="center">83955</td>
<td valign="top" align="center">16549</td>
<td valign="top" align="center">27636</td>
<td valign="top" align="center">27636</td>
<td valign="top" align="center">91404 (58.68)</td>
<td valign="top" align="center">79565 (51.08)</td>
<td valign="top" align="center">64372</td>
<td valign="top" align="center">36.8</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. cathayana</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">155449</td>
<td valign="top" align="center">83911</td>
<td valign="top" align="center">16488</td>
<td valign="top" align="center">27525</td>
<td valign="top" align="center">27525</td>
<td valign="top" align="center">92423 (59.46)</td>
<td valign="top" align="center">80584 (51.84)</td>
<td valign="top" align="center">63026</td>
<td valign="top" align="center">36.9</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left"><italic>P. trichocarpa</italic><sup>&#x2217;</sup></td>
<td valign="top" align="center">157033</td>
<td valign="top" align="center">85129</td>
<td valign="top" align="center">16600</td>
<td valign="top" align="center">27652</td>
<td valign="top" align="center">27652</td>
<td valign="top" align="center">92440 (58.87)</td>
<td valign="top" align="center">80601 (51.33)</td>
<td valign="top" align="center">64593</td>
<td valign="top" align="center">36.7</td>
<td valign="top" align="center">113</td>
<td valign="top" align="center">77</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">4</td></tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>Sequence lengths are given in bp. <sup>#</sup>cp genome sequenced in this study. <sup>&#x2217;</sup>cp genome of Salicaceae available in GenBank</italic>.</attrib>
</table-wrap-foot>
</table-wrap>
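<p>As an illustrative cross-check of <bold>Table <xref ref-type="table" rid="T1">1</xref></bold> (a minimal sketch, not part of the original analysis), the quadripartite structure of the cp genome implies that the total length equals LSC + SSC + IRa + IRb, and that the coding percentage equals coding length divided by total length. The Python snippet below tests both relations for three rows copied from the table. The column interpretation (total, LSC, SSC, two IRs, coding length, coding percentage) is an assumption inferred from the values, and the names <monospace>ROWS</monospace> and <monospace>check_row</monospace> are hypothetical.</p>
<preformat>
```python
# Rows copied from Table 1. The column meaning is an assumption inferred
# from the values: (total, LSC, SSC, IRa, IRb, coding, coding_percent).
ROWS = {
    "Salix tetrasperma": (155671, 84584, 16291, 27398, 27398, 91593, 58.84),
    "Populus balsamifera": (157094, 84922, 16499, 27846, 27827, 92542, 58.91),
    "Populus trichocarpa": (157033, 85129, 16600, 27652, 27652, 92440, 58.87),
}


def check_row(row):
    """Return (parts_ok, percent_ok) for one table row."""
    total, lsc, ssc, ira, irb, coding, pct = row
    # The four regions of the quadripartite genome should sum to the total.
    parts_ok = (lsc + ssc + ira + irb == total)
    # The reported percentage should match coding / total to two decimals.
    percent_ok = (0.005 >= abs(100.0 * coding / total - pct))
    return parts_ok, percent_ok


for species, row in ROWS.items():
    print(species, check_row(row))
```
</preformat>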
</sec>
<sec><title>Molecular Marker Identification</title>
<p>Based on the cp genome alignment of 14 Salicaceae species, we identified 30 highly divergent regions as candidate markers for phylogenetic analysis in <italic>Salix</italic>, <italic>Populus</italic> and Salicaceae. These molecular markers are mostly derived from intergenic or intronic non-coding regions, though four protein-coding genes (<italic>ccsA</italic>, <italic>rpl20</italic>, <italic>rps7</italic> and <italic>ndhA/ndhE</italic>) were also identified. Among the four genes, <italic>rps7</italic> is highly divergent and shows lineage-specific variation within the genus <italic>Salix</italic>. The length of <italic>rps7</italic> is 468 bp in all <italic>Populus</italic> species except <italic>P. cathayana</italic>, whereas the three <italic>Salix</italic> species (i.e., <italic>S. oreinoma</italic>, <italic>S. suchowensis</italic>, and <italic>S. purpurea</italic>) that are closely related to subgenera <italic>Chamaetia</italic> and <italic>Vetrix</italic> have identical <italic>rps7</italic> sequences with a length of 270 bp. The other three <italic>Salix</italic> species (i.e., <italic>S. interior</italic>, <italic>S. babylonica</italic>, and <italic>S. tetrasperma</italic>), in subgenus <italic>Salix</italic>, have identical <italic>rps7</italic> sequences with a length of only 180 bp (<bold>Figures <xref ref-type="fig" rid="F2">2A</xref></bold>, <bold><xref ref-type="fig" rid="F3">3</xref></bold>, and Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S3</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Maximum likelihood tree of Salicaceae. <bold>(A)</bold> The major variation events of protein-coding sequences as compared to <italic>Populus trichocarpa</italic>, and the phylogeny based on the concatenated protein-coding dataset; <bold>(B)</bold> Phylogeny based on the whole genome and concatenated non-coding datasets; <bold>(C)</bold> Rectangular cladogram of the phylogenetic network. Clade support is reported as maximum likelihood bootstrap support value/Bayesian posterior probability (only values below 100% or 1 are shown). Conflicting clades between different phylogenetic trees are highlighted in red.</p></caption>
<graphic xlink:href="fpls-08-01050-g002.tif"/>
</fig>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Alignment of translated <italic>rps7</italic> coding region in 14 Salicaceae species with <italic>Licania</italic> as outgroup.</p></caption>
<graphic xlink:href="fpls-08-01050-g003.tif"/>
</fig>
<p>Resolving the phylogeny of the genus <italic>Salix</italic> has proven extremely difficult (<xref ref-type="bibr" rid="B48">Skvortsov, 1999</xref>; <xref ref-type="bibr" rid="B2">Argus et al., 2010</xref>; <xref ref-type="bibr" rid="B57">Wu et al., 2015</xref>), and no consensus on <italic>Salix</italic> phylogeny has been reached thus far. Molecular phylogenetic analysis in <italic>Salix</italic> has succeeded in resolving relationships only at the subgeneric level; it fails at taxonomic levels below subgenus and shows almost no resolution in the clade that contains about 73% of <italic>Salix</italic> species (i.e., subgenera <italic>Chamaetia</italic> and <italic>Vetrix</italic>) (<xref ref-type="bibr" rid="B31">Leskinen and Alstrom-Rapaport, 1999</xref>; <xref ref-type="bibr" rid="B3">Azuma et al., 2000</xref>; <xref ref-type="bibr" rid="B6">Chen et al., 2010</xref>; <xref ref-type="bibr" rid="B21">Hardig et al., 2010</xref>; <xref ref-type="bibr" rid="B1">Abdollahzadeh et al., 2011</xref>; <xref ref-type="bibr" rid="B57">Wu et al., 2015</xref>). Therefore, there is a need to identify highly divergent regions in <italic>Salix</italic> cp genomes as molecular markers.</p>
</sec>
<sec><title>Phylogenomic Analyses of Salicaceae</title>
<p>The genus <italic>Populus</italic> comprises ca. 30 species, which can be divided into six major clades (<xref ref-type="bibr" rid="B16">Eckenwalder, 1996</xref>; <xref ref-type="bibr" rid="B2">Argus et al., 2010</xref>). We selected eight species from four clades for our phylogenomic analyses. Our analysis indicates that <italic>Populus</italic> is monophyletic. The phylogenies based on the whole genome and on the concatenated non-coding dataset share the same topology (<bold>Figure <xref ref-type="fig" rid="F2">2B</xref></bold>). However, the phylogeny based on the concatenated protein-coding dataset differs in the positions of <italic>P. euphratica</italic>, <italic>P. cathayana</italic> and <italic>P. balsamifera</italic> (<bold>Figure <xref ref-type="fig" rid="F2">2A</xref></bold>), as indicated by network analysis (<bold>Figure <xref ref-type="fig" rid="F2">2C</xref></bold>). The three New World <italic>Populus</italic> species (<italic>P. trichocarpa</italic>, <italic>P. balsamifera</italic>, <italic>P. fremontii</italic>, Clade A in <bold>Figure <xref ref-type="fig" rid="F2">2A</xref></bold>) form a robustly supported monophyletic group. This clade is sister to the remaining <italic>Populus</italic> species in the protein-coding gene tree (<bold>Figure <xref ref-type="fig" rid="F2">2A</xref></bold>). However, <italic>P. euphratica</italic> is sister to <italic>P. cathayana</italic> in the tree based on the whole cp genome and non-coding regions (<bold>Figure <xref ref-type="fig" rid="F2">2B</xref></bold>). In the tree based on protein-coding regions, <italic>P. euphratica</italic> and <italic>P. cathayana</italic> fall in the clade that contains all species from the Old World, with a low support value and short branch length (clade B in <bold>Figure <xref ref-type="fig" rid="F2">2A</xref></bold>). This topology is absent from previous studies that sampled more <italic>Populus</italic> species (<xref ref-type="bibr" rid="B20">Hamzeh and Dayanandan, 2004</xref>; <xref ref-type="bibr" rid="B56">Wang et al., 2014</xref>), indicating that insufficient sampling might have caused this topology.</p>
</sec>
<sec><title>Molecular Evolution of Salicaceae Chloroplast Genome</title>
<p>We identified at least four possible pseudogenization events in <italic>Salicaceae</italic>: (1) <italic>ndhA</italic> in <italic>S. babylonica</italic>, which contains a 1039 bp deletion that includes the start codon in exon 1; (2) <italic>accD</italic> in <italic>P. yunnanensis</italic>, which has lost 1056 bp (72%) of its gene content; (3) <italic>ycf1</italic> in <italic>P. yunnanensis</italic>, which has lost >5000 bp (94%) of its sequence; (4) <italic>psbC</italic> in <italic>P. cathayana</italic>, which lacks &#x223C;1000 bp (&#x223C;70%) of its sequence (<bold>Figure <xref ref-type="fig" rid="F2">2A</xref></bold>).</p>
<p>We identified seven genes under significant positive selection (<italic>atpE</italic>, <italic>rps7</italic>, <italic>ycf2</italic>, <italic>ccsA</italic>, <italic>petD</italic>, <italic>psbC</italic>, and <italic>psbJ</italic>), each of which contains positively selected sites (<bold>Table <xref ref-type="table" rid="T2">2</xref></bold>). Among them, three genes (i.e., <italic>rps7</italic>, <italic>petD</italic>, and <italic>psbC</italic>) are identified in <italic>P. cathayana</italic>, <italic>ycf2</italic> is identified in <italic>P. yunnanensis</italic>, and <italic>ccsA</italic> and <italic>psbJ</italic> are identified in <italic>P. tremula</italic>. The exception, <italic>atpE</italic>, shows significant positive selection at one site in the <italic>Salix</italic> subgenus <italic>Salix</italic> clade (i.e., <italic>S. babylonica</italic>, <italic>S. tetrasperma</italic>, and <italic>S. interior</italic>) (<bold>Table <xref ref-type="table" rid="T2">2</xref></bold>).</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Positively selected sites identified using Codeml under the branch-site model.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left">Gene</th>
<th valign="top" align="center">Branch</th>
<th valign="top" align="center">Null</th>
<th valign="top" align="center">Alternative</th>
<th valign="top" align="center"><italic>p</italic>-value</th>
<th valign="top" align="left">Putative sites under positive selection, amino acid, and corresponding posterior probability</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>atpE</italic></td>
<td valign="top" align="left"><italic>S. babylonica, S. tetrasperma</italic>, <italic>S. interior</italic></td>
<td valign="top" align="left">-578.123</td>
<td valign="top" align="left">-566.714</td>
<td valign="top" align="left">1.78<italic>E</italic>-06</td>
<td valign="top" align="left">130 S 0.970<sup>&#x2217;</sup></td>
</tr>
<tr>
<td valign="top" align="left"><italic>rps7</italic></td>
<td valign="top" align="left"><italic>P. cathayana</italic></td>
<td valign="top" align="left">-445.250</td>
<td valign="top" align="left">-437.826</td>
<td valign="top" align="left">1.17<italic>E</italic>-04</td>
<td valign="top" align="left">37 S 0.997<sup>&#x2217;&#x2217;</sup>; 38 L 0.992<sup>&#x2217;&#x2217;</sup>; 39 A 0.955<sup>&#x2217;</sup>; 40 Y 0.995<sup>&#x2217;&#x2217;</sup>; 41 Q 0.997<sup>&#x2217;&#x2217;</sup>; 42 I 0.909; 43 L 0.996<sup>&#x2217;&#x2217;</sup>; 44 Y 0.999<sup>&#x2217;&#x2217;</sup>; 45 R 0.986<sup>&#x2217;</sup>; 46 A 0.987<sup>&#x2217;</sup>; 47 M 0.961<sup>&#x2217;</sup>; 48 K 0.952<sup>&#x2217;</sup>; 49 K 0.989<sup>&#x2217;</sup>; 52 Q 0.978<sup>&#x2217;</sup>; 54 T 0.965<sup>&#x2217;</sup></td>
</tr>
<tr>
<td valign="top" align="left"><italic>ycf2</italic></td>
<td valign="top" align="left"><italic>P. yunnanensis</italic></td>
<td valign="top" align="left">-9664.014</td>
<td valign="top" align="left">-9628.861</td>
<td valign="top" align="left">5.07<italic>E</italic>-17</td>
<td valign="top" align="left">2271 M 0.982<sup>&#x2217;</sup>; 2272 A 0.982<sup>&#x2217;</sup>; 2275 G 0.969<sup>&#x2217;</sup></td>
</tr>
<tr>
<td valign="top" align="left"><italic>ccsA</italic></td>
<td valign="top" align="left"><italic>P. tremula</italic></td>
<td valign="top" align="left">-1585.549</td>
<td valign="top" align="left">-1581.119</td>
<td valign="top" align="left">2.92<italic>E</italic>-03</td>
<td valign="top" align="left">319 I 0.959<sup>&#x2217;</sup></td>
</tr>
<tr>
<td valign="top" align="left"><italic>petD</italic></td>
<td valign="top" align="left"><italic>P. cathayana</italic></td>
<td valign="top" align="left">-499.511</td>
<td valign="top" align="left">-491.763</td>
<td valign="top" align="left">8.27<italic>E</italic>-05</td>
<td valign="top" align="left">116 N 0.988<sup>&#x2217;</sup>; 117 V 0.989<sup>&#x2217;</sup></td>
</tr>
<tr>
<td valign="top" align="left"><italic>psbC</italic></td>
<td valign="top" align="left"><italic>P. cathayana</italic></td>
<td valign="top" align="left">-872.644</td>
<td valign="top" align="left">-829.745</td>
<td valign="top" align="left">1.99<italic>E</italic>-20</td>
<td valign="top" align="left">101 E 1.000<sup>&#x2217;&#x2217;</sup>; 102 V 1.000<sup>&#x2217;&#x2217;</sup>; 103 I 0.999<sup>&#x2217;&#x2217;</sup>; 104 D 0.997<sup>&#x2217;&#x2217;</sup>; 105 T 0.979<sup>&#x2217;</sup>; 106 F 0.998<sup>&#x2217;&#x2217;</sup>; 107 P 0.998<sup>&#x2217;&#x2217;</sup>; 108 Y 0.978<sup>&#x2217;</sup>; 109 F 0.974<sup>&#x2217;</sup>; 110 V 0.999<sup>&#x2217;&#x2217;</sup>; 111 S 1.000<sup>&#x2217;&#x2217;</sup>; 112 G 0.999<sup>&#x2217;&#x2217;</sup>; 113 V 1.000<sup>&#x2217;&#x2217;</sup>; 114 L 0.980<sup>&#x2217;</sup>; 115 H 1.000<sup>&#x2217;&#x2217;</sup>; 116 L 1.000<sup>&#x2217;&#x2217;</sup>; 117 I 0.974<sup>&#x2217;</sup>; 119 S 0.998<sup>&#x2217;&#x2217;</sup>; 120 A 1.000<sup>&#x2217;&#x2217;</sup>; 121 V 0.996<sup>&#x2217;&#x2217;</sup>; 122 L 0.995<sup>&#x2217;&#x2217;</sup>; 123 G 0.979<sup>&#x2217;</sup>; 124 F 0.999<sup>&#x2217;&#x2217;</sup>; 125 G 0.998<sup>&#x2217;&#x2217;</sup>; 127 I 0.978<sup>&#x2217;</sup>; 128 Y 0.998<sup>&#x2217;&#x2217;</sup>; 129 H 0.996<sup>&#x2217;&#x2217;</sup>; 130 A 1.000<sup>&#x2217;&#x2217;</sup>; 131 L 0.992<sup>&#x2217;&#x2217;</sup>; 132 L 0.996<sup>&#x2217;&#x2217;</sup>; 133 G 0.982<sup>&#x2217;</sup>; 134 P 0.980<sup>&#x2217;</sup>; 135 E 1.000<sup>&#x2217;&#x2217;</sup>; 136 T 0.997<sup>&#x2217;&#x2217;</sup>; 137 L 0.999<sup>&#x2217;&#x2217;</sup>; 138 E 0.998<sup>&#x2217;&#x2217;</sup>; 139 E 0.974<sup>&#x2217;</sup></td>
</tr>
<tr>
<td valign="top" align="left"><italic>psbJ</italic></td>
<td valign="top" align="left"><italic>P. tremula</italic></td>
<td valign="top" align="left">-196.515</td>
<td valign="top" align="left">-192.834</td>
<td valign="top" align="left">6.66<italic>E</italic>-03</td>
<td valign="top" align="left">20 P 0.981<sup>&#x2217;</sup></td></tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic><sup>&#x2217;</sup>Posterior probability > 95%, <sup>&#x2217;&#x2217;</sup>Posterior probability > 99%. &#x201C;Null&#x201D; and &#x201C;Alternative&#x201D; columns list likelihood values obtained under the null model and alternative model.</italic></attrib>
</table-wrap-foot>
</table-wrap>
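<p>The <italic>p</italic>-values in <bold>Table <xref ref-type="table" rid="T2">2</xref></bold> appear consistent with a likelihood ratio test in which twice the difference between the alternative and null log-likelihoods is compared against a chi-square distribution with one degree of freedom. The Python sketch below reproduces that calculation under this assumption; the exact test settings are not stated in this section, and the function name <monospace>lrt_p_value</monospace> is hypothetical.</p>
<preformat>
```python
import math


def lrt_p_value(lnl_null, lnl_alt):
    """Return (statistic, p-value) of a likelihood ratio test with df = 1.

    Assumes the alternative model fits at least as well as the null,
    so that the statistic is non-negative.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # For one degree of freedom, the chi-square survival function
    # equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Log-likelihood values copied from Table 2 (null, alternative).
EXAMPLES = [
    ("atpE", -578.123, -566.714),
    ("rps7", -445.250, -437.826),
    ("psbJ", -196.515, -192.834),
]

for gene, lnl_null, lnl_alt in EXAMPLES:
    stat, p = lrt_p_value(lnl_null, lnl_alt)
    print(f"{gene}: 2*dlnL = {stat:.3f}, p = {p:.2e}")
```
</preformat>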
</sec>
<sec><title>Chloroplast-Nuclear DNA Transfer</title>
<p>We used LASTZ (<xref ref-type="bibr" rid="B22">Harris, 2007</xref>) to search for NUPTs, including DNA fragments shorter than 200 bp. We identified 571 and 713 NUPTs with total ungapped lengths of ca. 536 and 193 kb in <italic>P. trichocarpa</italic> and <italic>S. purpurea</italic>, respectively (<bold>Figure <xref ref-type="fig" rid="F4">4</xref></bold>, <bold>Table <xref ref-type="table" rid="T3">3</xref></bold>, and Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S7</xref>). This result differs from that of a previous study that used BLAST to search for NUPTs (<xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>). The number of NUPTs in <italic>S. purpurea</italic> is larger than that in <italic>P. trichocarpa</italic>, but the total length of NUPTs in <italic>S. purpurea</italic> is much smaller than that in <italic>P. trichocarpa</italic> (<bold>Table <xref ref-type="table" rid="T3">3</xref></bold>).</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Size distribution of NUPTs in <italic>Populus trichocarpa</italic> and <italic>Salix purpurea</italic>.</p></caption>
<graphic xlink:href="fpls-08-01050-g004.tif"/>
</fig>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Sizes of the chloroplast genomes, the nuclear genomes, and the NUPTs detected by LASTZ.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left">Category</th>
<th valign="top" align="left"><italic>Populus trichocarpa</italic></th>
<th valign="top" align="left"><italic>Salix purpurea</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Nuclear genome size (Mb)</td>
<td valign="top" align="left">481</td>
<td valign="top" align="left">450</td>
</tr>
<tr>
<td valign="top" align="left">cp genome size (bp)</td>
<td valign="top" align="left">157,033</td>
<td valign="top" align="left">155,590</td>
</tr>
<tr>
<td valign="top" align="left">Total number of NUPTs</td>
<td valign="top" align="left">574</td>
<td valign="top" align="left">713</td>
</tr>
<tr>
<td valign="top" align="left">Total length of NUPTs (bp)</td>
<td valign="top" align="left">2,869,374</td>
<td valign="top" align="left">491,553</td>
</tr>
<tr>
<td valign="top" align="left">Total gap length of NUPTs (bp)</td>
<td valign="top" align="left">2,333,068</td>
<td valign="top" align="left">298,700</td>
</tr>
<tr>
<td valign="top" align="left">Query aligned length (bp)</td>
<td valign="top" align="left">536,306</td>
<td valign="top" align="left">192,853</td>
</tr>
<tr>
<td valign="top" align="left">Query gap (bp)</td>
<td valign="top" align="left">1,204,611</td>
<td valign="top" align="left">565,676</td>
</tr>
<tr>
<td valign="top" align="left">Number of NUPTs genes in nuclear gene region</td>
<td valign="top" align="left">314</td>
<td valign="top" align="left">41</td>
</tr>
<tr>
<td valign="top" align="left">Number of NUPTs CDS in nuclear CDS regions</td>
<td valign="top" align="left">233</td>
<td valign="top" align="left">25</td>
</tr>
<tr>
<td valign="top" align="left">Length of cp CDS transferred in nuclear CDS region (bp)/proportion to NUPT length</td>
<td valign="top" align="left">126,219/23.5%</td>
<td valign="top" align="left">8,312/4.3%</td>
</tr>
<tr>
<td valign="top" align="left">Number of transferable genes</td>
<td valign="top" align="left">75</td>
<td valign="top" align="left">31</td>
</tr>
<tr>
<td valign="top" align="left">NUPTs as a proportion of the cp genome</td>
<td valign="top" align="left">341.50%</td>
<td valign="top" align="left">123.90%</td>
</tr>
<tr>
<td valign="top" align="left">NUPTs as a proportion of the nuclear genome</td>
<td valign="top" align="left">0.111%</td>
<td valign="top" align="left">0.043%</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>The number of NUPTs in <italic>Populus trichocarpa</italic> that meet the criteria of <xref ref-type="bibr" rid="B64">Yoshida et al. (2013)</xref> (length no less than 100 bp and identity no less than 90%) is 175, with a corresponding total length of 388,664 bp. In <italic>Salix purpurea</italic>, the number of NUPTs that meet the criteria of <xref ref-type="bibr" rid="B64">Yoshida et al. (2013)</xref> is 141, with a corresponding total length of 47,363 bp</italic>.</attrib>
</table-wrap-foot>
</table-wrap>
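<p>The derived rows of <bold>Table <xref ref-type="table" rid="T3">3</xref></bold> can be recomputed from the raw lengths. The Python sketch below is an illustrative cross-check rather than the original analysis code. It assumes that the aligned length equals the total NUPT length minus the total gap length, and that the proportions are taken relative to the aligned length; the names <monospace>TABLE3</monospace> and <monospace>nupt_summary</monospace> are hypothetical.</p>
<preformat>
```python
# Values copied from Table 3. Lengths are in bp; the nuclear genome
# sizes are converted from Mb to bp.
TABLE3 = {
    "Populus trichocarpa": dict(nuclear=481_000_000, cp=157_033,
                                total=2_869_374, gap=2_333_068, cds=126_219),
    "Salix purpurea": dict(nuclear=450_000_000, cp=155_590,
                           total=491_553, gap=298_700, cds=8_312),
}


def nupt_summary(values):
    """Recompute the derived quantities of Table 3 for one species."""
    # Assumption: aligned (ungapped) length = total length - gap length.
    aligned = values["total"] - values["gap"]
    return {
        "aligned_bp": aligned,
        "pct_of_cp": 100.0 * aligned / values["cp"],
        "pct_of_nuclear": 100.0 * aligned / values["nuclear"],
        "pct_cds": 100.0 * values["cds"] / aligned,
    }


for species, values in TABLE3.items():
    s = nupt_summary(values)
    print(species, s["aligned_bp"],
          f'{s["pct_of_cp"]:.1f}% of cp genome,',
          f'{s["pct_of_nuclear"]:.3f}% of nuclear genome,',
          f'{s["pct_cds"]:.1f}% in nuclear CDS')
```
</preformat>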
<p>The fragmented assembly of the <italic>S. purpurea</italic> genome (2015) might introduce uncertainty into the NUPT search. Therefore, we subsequently discuss plastid-to-nucleus DNA transfer in Salicaceae based on the analysis of <italic>P. trichocarpa</italic>. The most abundant NUPTs in <italic>P. trichocarpa</italic> are around 70&#x2013;199 bp in length (<bold>Figure <xref ref-type="fig" rid="F4">4</xref></bold>). There are 77 NUPTs longer than 1000 bp, and these large NUPTs account for the majority of the total NUPT length (>75%, ca. 410 kb out of 536 kb). Furthermore, the average identity of these large NUPTs is 90.2%, and that of the six largest NUPTs, each longer than 10 kb, is 99.4% (Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S7</xref>), indicating that large NUPTs were integrated into the nucleus relatively recently (see Data Sheet S1 for details). Additionally, all large NUPTs contain large gaps.</p>
<p>Using LASTZ, we identified five aligned regions (&#x223C;4,000&#x2013;16,000 bp in length) and three/four gaps (&#x223C;7,000&#x2013;15,000 bp in length) between the chloroplast and nuclear genomes, arranged in the same order as in the cp genome (<bold>Figure <xref ref-type="fig" rid="F5">5</xref></bold>). We conducted MC sampling to show that this pattern more likely results from insertion of the whole cp genome into the nuclear genome, followed by generation of the gaps through insertion/translocation, than from separate insertions of split cp genome fragments. Previous studies also found that large cp fragments can be inserted into the nuclear genome, based on similarity between cp and nuclear genome sequences detected with sequence alignment tools (e.g., BLAST) and on sequencing of insertion junction products (<xref ref-type="bibr" rid="B65">Yuan et al., 2002</xref>; <xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>). Here, we used LASTZ, a pairwise DNA sequence aligner that was originally designed to align sequences the size of human chromosomes and sequences from different species (<xref ref-type="bibr" rid="B22">Harris, 2007</xref>). It is therefore more suitable for detecting similarity at a large scale, such as between the cp genome and the nuclear genome. After alignment, we found that three large segments of the cp genome are highly similar to three regions in the nuclear genome and, more importantly, that these regions are in the same order in the cp genome and the nuclear genome.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Illustration of the transfer of the whole chloroplast genome to the nucleus in <italic>Populus trichocarpa</italic>. The &#x201C;Aligned sequence&#x201D; shows the aligned chloroplast genome and the corresponding nuclear sequence (NUPT), which is located at sites 1,424,411 to 1,590,305 on chromosome 13 (i.e., fragments 1, 3, 5, 7, 8). The &#x201C;Complete sequence&#x201D; also includes gaps. The fragments of the chloroplast genome that are separated by gaps are numbered; fragments with green numbers are present in the nuclear chromosome, whereas those with blue numbers are not.</p></caption>
<graphic xlink:href="fpls-08-01050-g005.tif"/>
</fig>
<p>We therefore conducted MC sampling to test the hypothesis of &#x201C;whole cp genome insertion&#x201D; against that of &#x201C;split cp genome insertion,&#x201D; an approach that has not previously been used to detect a whole cp genome transfer event. We estimated that the probability of those gaps being produced by &#x201C;split cp genome insertion&#x201D; is less than 1<italic>e</italic>-05, so this pattern is very unlikely to have arisen by chance. Consequently, our simulation analysis supports the scenario in which the whole chloroplast genome was first transferred to the nuclear genome and the gaps were created by subsequent insertion/translocation. Lastly, we found that the nuclear genes containing NUPTs in <italic>P. trichocarpa</italic> that match multiple cp genes carry those cp genes mostly in their original order in the cp genome (Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S8</xref>). Taken together, the observations from our analyses clearly favor the &#x201C;bulk&#x201D; and &#x201C;insertion&#x201D; DNA hypothesis of organelle-to-nucleus horizontal DNA transfer. Our approach can serve as an example for analyzing large-scale transfer/insertion events in other species.</p>
</sec>
</sec>
<sec><title>Discussion</title>
<sec><title>Highly Conserved cp Genomes of Salicaceae</title>
<p>We compared the three <italic>Salix</italic> cp genomes sequenced in this study with the previously published complete cp genomes of 11 Salicaceae species, including eight <italic>Populus</italic> and three <italic>Salix</italic> species (Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S2</xref>). We found that the structure and synteny of the cp genomes of the 14 Salicaceae species are highly conserved (Supplementary Figure <xref ref-type="supplementary-material" rid="SM4">S1</xref>). The lengths of the various parts of the cp genomes (IRs, LSC, SSC, coding sequence and non-coding sequence) are also quite conserved and vary within a small range (Supplementary Figure <xref ref-type="supplementary-material" rid="SM5">S2</xref>). We observed that four genes (i.e., <italic>infA</italic>, <italic>rps16</italic>, <italic>rpl32</italic>, and <italic>ycf68</italic>) have been lost from Salicaceae cp genomes compared with other angiosperms; of these, <italic>infA</italic> and <italic>rpl32</italic> were reported to have been transferred into the nucleus (<xref ref-type="bibr" rid="B36">Millen et al., 2001</xref>; <xref ref-type="bibr" rid="B54">Ueda et al., 2007</xref>). The inverted repeat of the Salicaceae cp genome results in the complete duplication of <italic>rps19</italic>, <italic>rpl2</italic>, <italic>rpl23</italic>, <italic>ycf2</italic>, <italic>ycf15</italic>, <italic>ndhB</italic> and <italic>rps7</italic>, as well as exons 1 and 2 of <italic>rps12</italic>, all four rRNA genes (4.5S, 5S, 16S, and 23S) and seven tRNA genes (<italic>trnI</italic><sup>CAU</sup>, <italic>trnL</italic><sup>CAA</sup>, <italic>trnV</italic><sup>GAC</sup>, <italic>trnI</italic><sup>GAU</sup>, <italic>trnA</italic><sup>UGC</sup>, <italic>trnR</italic><sup>ACG</sup>, and <italic>trnN</italic><sup>GUU</sup>). Both IRs of the Salicaceae cp genome extend 1,695&#x2013;1,907 bp into <italic>ycf1</italic>, and the extent of this overlap differs between the genera <italic>Salix</italic> and <italic>Populus</italic>: 1,747 bp in all <italic>Salix</italic> species and 1,695&#x2013;1,907 bp in <italic>Populus</italic> species. This difference accounts for most of the variation in IR length. Additionally, IRb extends 50&#x2013;51 bp into <italic>rpl22</italic> (Supplementary Figure <xref ref-type="supplementary-material" rid="SM6">S3</xref>).</p>
<p>Similar to other angiosperms, the IR regions (pairwise identity 98.1% for both IRa and IRb) are more conserved in the nine species than the LSC (pairwise identity 94.0%) and SSC (pairwise identity 94.9%) regions. Differences among the cp genomes of the 14 Salicaceae species include genome size, gene losses, the pseudogenization of protein-coding genes, and IR expansion and contraction (<bold>Table <xref ref-type="table" rid="T1">1</xref></bold> and Supplementary Figure <xref ref-type="supplementary-material" rid="SM5">S2</xref>).</p>
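<p>For readers unfamiliar with the statistic, the sketch below shows one common way to compute a mean pairwise identity from a multiple sequence alignment. It is an illustration under stated assumptions, not necessarily the calculation behind the percentages reported here: columns in which both sequences carry a gap are ignored, a gap opposite a base counts as a mismatch, and the function names and the toy alignment are hypothetical.</p>
<preformat><![CDATA[
```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # column is uninformative for this pair
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def mean_pairwise_identity(alignment):
    """Average percent identity over all pairs of aligned sequences."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Invented toy alignment, not real chloroplast data
toy_alignment = ["ACGTACGT-A", "ACGTACGTTA", "ACGAACGT-A"]
print(round(mean_pairwise_identity(toy_alignment), 1))
```
]]></preformat>
<p>Applying the same function separately to the IR, LSC and SSC portions of a whole-genome alignment would give region-wise values that can be compared directly.</p>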
</sec>
<sec><title>Phylogenomic Analyses of Salicaceae</title>
<p>Phylogenetic analysis, based on the complete cp genomes and the non-coding and protein-coding datasets, indicates that Salicaceae s.str., <italic>Populus</italic> and <italic>Salix</italic> s.l. are each resolved as monophyletic, which is largely consistent with previous studies (<xref ref-type="bibr" rid="B12">Davis et al., 2005</xref>; <xref ref-type="bibr" rid="B58">Wurdack and Davis, 2009</xref>; <xref ref-type="bibr" rid="B6">Chen et al., 2010</xref>; <xref ref-type="bibr" rid="B56">Wang et al., 2014</xref>; <xref ref-type="bibr" rid="B57">Wu et al., 2015</xref>) (<bold>Figure <xref ref-type="fig" rid="F2">2</xref></bold> and Supplementary Table <xref ref-type="supplementary-material" rid="SM2">S4</xref>). However, we identified incongruence among the Salicaceae phylogenies inferred from these three datasets (i.e., the whole cp genome, the concatenated coding sequences, and the non-coding sequences of the cp genome). This incongruence might be caused by homoplasy, which could result from convergence, parallelism or reversal. Because the cp genome is inherited maternally as a single unit, the tree incongruences observed in our results are unlikely to be caused by lineage sorting or hybridization/introgression, which are generally used to explain conflicting signals among characters of plant taxa, including Salicaceae (<xref ref-type="bibr" rid="B56">Wang et al., 2014</xref>; <xref ref-type="bibr" rid="B57">Wu et al., 2015</xref>).</p>
<p>Our phylogenetic trees, based on the whole cp genome and on the coding and non-coding sequences of the cp genome, show that all conflicting branches are short (<bold>Figures <xref ref-type="fig" rid="F2">2A,B</xref></bold>), which could be caused by fast species radiation or short stem-lineages (<xref ref-type="bibr" rid="B55">Wagele and Mayer, 2007</xref>). In this case, apomorphies that evolved in the stem-lineage may be rare, and chance similarities that evolved later (non-phylogenetic signal produced by nucleotide substitution processes) can accumulate and dominate in the form of phylogenetic-signal-like patterns (i.e., false and misleading phylogenetic signals). It is difficult to distinguish these kinds of homoplasies from apomorphies, and consequently, this might lead to the wrong phylogenetic tree (<xref ref-type="bibr" rid="B55">Wagele and Mayer, 2007</xref>). Moreover, multiple nucleotide substitutions along long branches may destroy synapomorphies, resulting in the accumulation of homoplasies along long branches and causing distantly related clades to cluster together in a topology (i.e., long-branch attraction) (<xref ref-type="bibr" rid="B55">Wagele and Mayer, 2007</xref>). Therefore, the conflicting topologies in our phylogenomic trees most likely result from incomplete sampling or homoplasy. This is in line with a previous phylogenomic study of bamboo based on cp DNA markers, which concluded that homoplasy accounts for conflicting phylogenetic signals between cp genome datasets (<xref ref-type="bibr" rid="B33">Ma et al., 2014</xref>). Moreover, our analyses show that phylogenetic trees based on genomic data with a large number of informative characters (e.g., cp genomes) should be used with caution when the examined taxa have radiated rapidly or have short branches.</p>
<p>Phylogenomic analysis indicates that the species of <italic>Salix</italic> cluster in a robust monophyletic clade. Two robust subclades were resolved within <italic>Salix</italic>. One subclade includes the species of subgenera <italic>Chamaetia</italic> and <italic>Vetrix</italic>. Two species from subgenus <italic>Vetrix</italic> (<italic>S. suchowensis</italic> and <italic>S. purpurea</italic>) cluster as a monophyletic group, which is sister to a group that includes one species, <italic>S. oreinoma</italic>, from subgenus <italic>Chamaetia</italic>. The other subclade comprises species from subgenus <italic>Salix</italic>: the New World species <italic>S. interior</italic> is sister to a group containing two Old World species, <italic>S. babylonica</italic> and <italic>S. tetrasperma</italic> (<bold>Figure <xref ref-type="fig" rid="F2">2</xref></bold>). The relationships resolved in this study are in line with previous phylogenetic studies of <italic>Salix</italic> (<xref ref-type="bibr" rid="B3">Azuma et al., 2000</xref>; <xref ref-type="bibr" rid="B6">Chen et al., 2010</xref>; <xref ref-type="bibr" rid="B21">Hardig et al., 2010</xref>; <xref ref-type="bibr" rid="B57">Wu et al., 2015</xref>), but we provide evidence from a phylogenomic perspective.</p>
</sec>
<sec><title>Positively Selected Genes of Salicaceae Chloroplast Genomes</title>
<p>During the evolution of a lineage, changing environments (e.g., climate change) impose selective pressures and result in adaptive evolution. Identifying the genes involved in this process (i.e., positively selected genes) is central to understanding the evolutionary pattern of organisms (<xref ref-type="bibr" rid="B62">Yang, 1998</xref>) and to pinpointing specific targets for adaptive studies.</p>
<p>Among the seven significantly positively selected genes in Salicaceae cp genomes, <italic>rps7</italic> shows a frame shift mutation. A single nucleotide insertion at site 102 causes the frame shift and results in extensive changes to the amino acid sequence, which is detected as being under positive selection (<bold>Figure <xref ref-type="fig" rid="F3">3</xref></bold>). This frame shift mutation further introduces an early stop codon and shortens the amino acid sequence by about 42% compared to other <italic>Populus</italic> species. The <italic>rps7</italic> gene encodes ribosomal protein S7 (uS7), which is crucial for the assembly and stability of the ribosome. The S7 protein is an important part of the translation machinery, is universally present in the small subunit of prokaryotic and eukaryotic ribosomes, and might play either a general or a specific regulatory role in translation initiation in the chloroplast (<xref ref-type="bibr" rid="B18">Fargo et al., 2001</xref>). The <italic>rps7</italic> gene of <italic>P. cathayana</italic> is positively selected, although it contains a premature stop codon caused by a frame shift mutation compared with other <italic>Populus</italic> species surveyed in this study (<bold>Figure <xref ref-type="fig" rid="F3">3</xref></bold>). Despite the premature stop codon, this truncated <italic>rps7</italic> gene in <italic>P. cathayana</italic> retains a conserved domain of the uS7 superfamily, as revealed by NCBI-CDD BLAST results (Supplementary Figure <xref ref-type="supplementary-material" rid="SM7">S4</xref>). This suggests that the shortened <italic>rps7</italic> gene may still function normally. Alternatively, even if the truncated cp <italic>rps7</italic> cannot maintain its original function, it is possible that an intact copy of cp <italic>rps7</italic> has been transferred into the nucleus and functions properly from the nuclear genome. Because no complete nuclear genome sequence is available for <italic>P. cathayana</italic>, we searched for the cp <italic>rps7</italic> gene in the nuclear genome of <italic>P. trichocarpa</italic>, for which a high-quality whole nuclear genome sequence is available. We observed that the Potri.013G138900.1 gene on chromosome 13 is identical to cp <italic>rps7</italic> in length and has 100% identity for both gene and coding sequences. This implies that <italic>rps7</italic> may have been functionally transferred into the nucleus and that the cp copy of <italic>rps7</italic> might be freely subject to natural selection.</p>
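<p>The way a single-nucleotide insertion can shift the reading frame and introduce a premature stop codon can be shown with a small worked example. This is a sketch only: the coding sequence below is an invented toy sequence, not <italic>rps7</italic>, the insertion position is arbitrary, the function names are hypothetical, and translation assumes the standard genetic code.</p>
<preformat><![CDATA[
```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order at each position
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_until_stop(cds):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(cds, index, base):
    """Return the sequence with one base inserted before position index."""
    return cds[:index] + base + cds[index:]


toy_cds = "ATGGCTAAACTGAGCTGGTTTGAACGTTAA"  # invented example, not rps7
wild_type = translate_until_stop(toy_cds)
mutant = translate_until_stop(insert_base(toy_cds, 9, "T"))
shortening = 1.0 - len(mutant) / len(wild_type)
print(wild_type, mutant, round(100 * shortening))
```
]]></preformat>
<p>In this toy case every codon downstream of the insertion changes, and a stop codon appears in the new frame before the original one, so the mutant peptide is both altered and shorter than the wild type.</p>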
<p>Similarly, the protein-coding sequences of the <italic>petD</italic> and <italic>psbC</italic> genes in <italic>P. cathayana</italic> are shortened by about 29 and 70%, respectively, compared to other Salicaceae species. Both truncations are caused by a single nucleotide insertion. The <italic>petD</italic> gene encodes subunit IV of the cytochrome b<sub>6</sub>/f complex. It is required for photosynthetic electron transport and hence supports photosynthetic growth. Mutations in the 5&#x2032; UTR or initiation codon can affect its function (<xref ref-type="bibr" rid="B8">Chen et al., 1993</xref>; <xref ref-type="bibr" rid="B43">Sakamoto et al., 1994</xref>; <xref ref-type="bibr" rid="B51">Sturm et al., 1994</xref>). The <italic>psbC</italic> gene encodes one of the components of the core complex of photosystem II. It binds chlorophyll and helps catalyze the primary light-induced photochemical processes of PSII (<xref ref-type="bibr" rid="B41">Rochaix et al., 1989</xref>; <xref ref-type="bibr" rid="B5">Cai et al., 2010</xref>). However, the effects of the shortened coding sequences of <italic>petD</italic> and <italic>psbC</italic> on their function remain unknown, especially for <italic>psbC</italic>, which is shortened by about 70%. Our analysis finds that <italic>petD</italic> and <italic>psbC</italic> have been transferred into the coding region of the nuclear genome three and eight times in <italic>P. trichocarpa</italic>, respectively (Supplementary Tables <xref ref-type="supplementary-material" rid="SM2">S5</xref>, <xref ref-type="supplementary-material" rid="SM2">S6</xref>).</p>
<p>Three of the last few amino acids of <italic>ycf2</italic> in <italic>P. yunnanensis</italic> are detected to be under significant positive selection. These amino acid changes are caused by a frame shift (a single nucleotide insertion) 6 bp upstream of them. Consequently, <italic>ycf2</italic> in <italic>P. yunnanensis</italic> is 21 bp shorter than in other Salicaceae species. The <italic>ycf2</italic> gene encodes a putative ATPase of unknown function. This gene exists in many plants, including non-photosynthetic plants. Previous experiments in tobacco indicate that it plays an essential role in cell survival (<xref ref-type="bibr" rid="B13">Drescher et al., 2000</xref>). Our analysis indicates that <italic>ycf2</italic> has been transferred into the nuclear genome of <italic>P. trichocarpa</italic> 18 times, in every case into a protein-coding region (Supplementary Tables <xref ref-type="supplementary-material" rid="SM2">S5</xref>, <xref ref-type="supplementary-material" rid="SM2">S6</xref>).</p>
<p>We identified two positively selected genes, <italic>ccsA</italic> and <italic>psbJ</italic>, in <italic>P. tremula</italic>. <italic>P. tremula</italic> is mainly distributed in cool temperate regions of Europe and Asia. The protein encoded by <italic>ccsA</italic> is a component of the cytochrome c synthase complex of the membrane-bound System II, and is required during the biogenesis of c-type cytochromes at the step of heme attachment (<xref ref-type="bibr" rid="B60">Xie et al., 1998</xref>). The <italic>psbJ</italic> gene encodes one of the components of the core complex of photosystem II. Experiments in tobacco indicate that plants with a mutated <italic>psbJ</italic> gene are unable to grow photoautotrophically (<xref ref-type="bibr" rid="B19">Hager et al., 2002</xref>). Our analysis shows that <italic>ccsA</italic> has been transferred to the coding region of the nuclear genome once, whereas <italic>psbJ</italic> has not been transferred in <italic>P. trichocarpa</italic> (Supplementary Tables <xref ref-type="supplementary-material" rid="SM2">S5</xref>, <xref ref-type="supplementary-material" rid="SM2">S6</xref>).</p>
<p>All the positively selected genes were detected in terminal species, except <italic>atpE</italic>, which shows significant positive selection at one site in the clade of <italic>Salix</italic> subgenus <italic>Salix</italic> (i.e., <italic>S. babylonica</italic>, <italic>S. tetrasperma</italic>, and <italic>S. interior</italic>) (<bold>Table <xref ref-type="table" rid="T2">2</xref></bold>). The <italic>atpE</italic> gene encodes the &#x03B5; subunit of CF1 of the H<sup>+</sup>-translocating ATP synthase, and functions in part to prevent wasteful ATP hydrolysis by the enzyme (<xref ref-type="bibr" rid="B10">Cruz et al., 1997</xref>). The common ancestors of this clade might have been adapted to temperate conditions and diverged in about the early Oligocene (<xref ref-type="bibr" rid="B57">Wu et al., 2015</xref>). The positively selected <italic>atpE</italic> gene might be related to the adaptive evolution of the common ancestor group of <italic>Salix</italic> subgenus <italic>Salix</italic>.</p>
<p>As mentioned above, most of the positively selected protein-coding genes have been transferred into the nucleus in <italic>P. trichocarpa</italic>. Positive selection is generally regarded as evidence of adaptive evolution, and these positively selected genes might have driven the successful adaptation of the selected taxa and/or lineages. However, the specific functions or effects of these positively selected cp genes in the target species remain unknown, and structural biological studies are needed to clarify the implications of these findings.</p>
</sec>
<sec><title>Chloroplast-Nuclear DNA Transfer</title>
<p>Most chloroplast genes have been transferred to the nucleus and then deleted from the plastome. However, a transferred cp gene is not readily expressed in the nucleus, nor would it be able to give rise to a product equipped with the capability of returning to the chloroplast and ousting its progenitor chloroplast gene. NUPT events have happened repeatedly and are still ongoing during endosymbiotic evolution (<xref ref-type="bibr" rid="B52">Timmis et al., 2004</xref>; <xref ref-type="bibr" rid="B11">Cullis et al., 2009</xref>; <xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>). Although some previous studies investigated the patterns of genomic integration of NUPTs in plant species, the mechanisms of cp-to-nucleus gene transfer are not well understood (<xref ref-type="bibr" rid="B52">Timmis et al., 2004</xref>; <xref ref-type="bibr" rid="B30">Lane, 2011</xref>; <xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>).</p>
<p>The number of NUPTs we identified in <italic>P. trichocarpa</italic> is larger than that reported in a previous study (<xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>), which used NCBI-BLASTN for NUPT identification, whereas we used LASTZ. However, the total length of NUPTs is similar. There is no clear explanation for the variation in the number and amount of NUPTs among plant species, but it may correlate with genome complexity, proportion of repetitive elements, and/or other factors (<xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>).</p>
<p>Consistent with <xref ref-type="bibr" rid="B64">Yoshida et al. (2013)</xref>, our analysis shows that large NUPTs are young. Therefore, these large NUPTs may experience rapid recombination, insertion/deletion, and fragmentation, supporting previous results that the plant nuclear genome is in equilibrium between frequent integration and rapid elimination of plastid DNA (<xref ref-type="bibr" rid="B34">Matsuo et al., 2005</xref>).</p>
<p>There are two main hypotheses to explain the mechanism of organelle-to-nucleus DNA transfer: the &#x201C;bulk DNA&#x201D; (DNA-mediated) view and the &#x201C;cDNA intermediate&#x201D; (RNA-mediated) view (<xref ref-type="bibr" rid="B52">Timmis et al., 2004</xref>). Given the presence of large NUPTs containing both non-coding and coding regions, our results and most previous studies support the &#x201C;DNA-mediated&#x201D; transfer hypothesis (<xref ref-type="bibr" rid="B34">Matsuo et al., 2005</xref>; <xref ref-type="bibr" rid="B35">Michalovova et al., 2013</xref>; <xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>). Despite the frequent occurrence of large NUPTs, e.g., the nearly complete chloroplast fragment in rice chromosome 10 (<xref ref-type="bibr" rid="B65">Yuan et al., 2002</xref>), whether the whole plastid genome can be transferred to the nuclear genome remains largely unknown and has not yet been specifically verified. We observe that the entire <italic>P. trichocarpa</italic> chloroplast genome is aligned to a region on chromosome 13 of the <italic>P. trichocarpa</italic> nuclear genome (Data Sheet S1). Together with the MC sampling, our study provides evidence that the whole chloroplast genome can be horizontally transferred to the nuclear genome.</p>
<p>As discussed above, the whole chloroplast genome can be transferred to the nucleus, so the missing and pseudogenized genes in the Salicaceae chloroplast genome, e.g., <italic>infA</italic>, <italic>ndhA</italic>, <italic>rpl16</italic>, <italic>rpl32</italic>, could have been transferred to the nucleus and may function properly there. Similar cases have been reported previously, involving <italic>rpl32</italic> in <italic>Populus</italic> (<xref ref-type="bibr" rid="B54">Ueda et al., 2007</xref>) and <italic>infA</italic> in some angiosperm species (<xref ref-type="bibr" rid="B36">Millen et al., 2001</xref>). Our analyses and other studies show that organellar DNA horizontal transfer in plants is frequent. These transfers are thought to play an important role in gene and genome evolution in plants, and functional transfer of the chloroplast genes may facilitate the regulation of gene expression (<xref ref-type="bibr" rid="B11">Cullis et al., 2009</xref>; <xref ref-type="bibr" rid="B64">Yoshida et al., 2013</xref>). However, functional horizontal gene transfer from organelle to nucleus is rare, because the transferred coding sequences must acquire gene promoter and terminator sequences for proper transcription in the nucleus, and must also acquire the transit peptides necessary for reimporting the protein into the organelle (<xref ref-type="bibr" rid="B52">Timmis et al., 2004</xref>; <xref ref-type="bibr" rid="B50">Stegemann and Bock, 2006</xref>). Our result provides additional evidence of organelle-to-nucleus horizontal functional gene transfer in <italic>P. trichocarpa</italic>, where the majority of cp coding sequences have been transferred to the nuclear genome, and most of them remain as coding regions in the nuclear genome (Supplementary Tables <xref ref-type="supplementary-material" rid="SM2">S5</xref>, <xref ref-type="supplementary-material" rid="SM2">S6</xref>).
However, further investigation is needed to determine whether these plastid coding sequences in the nuclear genome can function properly. Because functional genes transferred from chloroplast to nucleus must acquire promoter and terminator sequences (<xref ref-type="bibr" rid="B15">Eckardt, 2006</xref>), it is necessary to identify the regulatory motifs of these transferred genes. Our discovery that large cp genome segments are transferred into the nuclear genome implies that the transfer of the coding genes might be accompanied by the transfer of their promoter and terminator sequences, which could provide materials for those cp genes to function properly in the nuclear genome. However, further experiments (e.g., western blots and genetic screens) and analyses are needed to confirm the function of the genes transferred from the chloroplast to the nucleus. Furthermore, the identification of the whole cp genome transfer event in <italic>P. trichocarpa</italic> is based on the analysis of a single genome. It remains unclear whether this event is common or even fixed in <italic>P. trichocarpa</italic>. Therefore, sequencing the genomes of more individuals is needed to further verify the whole cp genome transfer event in <italic>P. trichocarpa</italic>.</p>
</sec>
</sec>
<sec><title>Author Contributions</title>
<p>JC and YY designed research. YH and JC performed research. JC, JW, and YH analyzed data. JC, YH, JW, YY, and CF wrote the paper. All authors read and approved the final manuscript.</p>
</sec>
<sec><title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This study is supported by grants from the National Natural Science Foundation of China (NSFC 31590820, 31590823 to Hang Sun, 31270271, 31670198 to JC, 31560062 to YH); the Science and Technology Research Program of Kunming Institute of Botany, the Chinese Academy of Sciences, Grant No. KIB2016005; and the Youth Innovation Promotion Association, the Chinese Academy of Sciences. CF is supported by a start-up fund from Wayne State University.</p>
</fn>
</fn-group>
<sec sec-type="supplementary material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="http://journal.frontiersin.org/article/10.3389/fpls.2017.01050/full#supplementary-material">http://journal.frontiersin.org/article/10.3389/fpls.2017.01050/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table_1.xls" id="SM1" mimetype="application/vnd.ms-excel application/vnd.ms-excel.sheet.binary.macroEnabled.12" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_2.pdf" id="SM2" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Data_Sheet_1.xls" id="SM3" mimetype="application/vnd.ms-excel application/vnd.ms-excel.sheet.binary.macroEnabled.12" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_1.PDF" id="SM4" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_2.PDF" id="SM5" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_3.PDF" id="SM6" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_4.PDF" id="SM7" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abdollahzadeh</surname> <given-names>A.</given-names></name> <name><surname>Osaloo</surname> <given-names>S. K.</given-names></name> <name><surname>Maassoumi</surname> <given-names>A.</given-names></name></person-group> (<year>2011</year>). <article-title>Molecular phylogeny of the genus <italic>Salix</italic> (Salicaceae) with an emphasize to its species in Iran.</article-title> <source><italic>Iran. J. Bot.</italic></source> <volume>17</volume> <fpage>244</fpage>&#x2013;<lpage>253</lpage>.</citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Argus</surname> <given-names>G. W.</given-names></name> <name><surname>Eckenwalder</surname> <given-names>J. E.</given-names></name> <name><surname>Kiger</surname> <given-names>R. W.</given-names></name></person-group> <collab>Flora of North America</collab>. (<year>2010</year>). &#x201C;<article-title>Salicaceae</article-title>,&#x201D; in <source><italic>Flora of North America</italic>,</source> <role>ed.</role> <person-group person-group-type="editor"><name><surname>Flora of North America Editorial</surname> <given-names>Committee</given-names></name></person-group> (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>).</citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Azuma</surname> <given-names>T.</given-names></name> <name><surname>Kajita</surname> <given-names>T.</given-names></name> <name><surname>Yokoyama</surname> <given-names>J.</given-names></name> <name><surname>Ohashi</surname> <given-names>H.</given-names></name></person-group> (<year>2000</year>). <article-title>Phylogenetic relationships of <italic>Salix</italic> (Salicaceae) based on rbcL sequence data.</article-title> <source><italic>Am. J. Bot.</italic></source> <volume>87</volume> <fpage>67</fpage>&#x2013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.2307/2656686</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Birky</surname> <given-names>C. W.</given-names></name></person-group> (<year>1995</year>). <article-title>Uniparental inheritance of mitochondrial and chloroplast genes - mechanisms and evolution.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>92</volume> <fpage>11331</fpage>&#x2013;<lpage>11338</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.92.25.11331</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cai</surname> <given-names>W. H.</given-names></name> <name><surname>Ma</surname> <given-names>J. F.</given-names></name> <name><surname>Chi</surname> <given-names>W.</given-names></name> <name><surname>Zou</surname> <given-names>M. J.</given-names></name> <name><surname>Guo</surname> <given-names>J. K.</given-names></name> <name><surname>Lu</surname> <given-names>C. M.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>Cooperation of LPA3 and LPA2 is essential for photosystem II assembly in Arabidopsis.</article-title> <source><italic>Plant Physiol.</italic></source> <volume>154</volume> <fpage>109</fpage>&#x2013;<lpage>120</lpage>. <pub-id pub-id-type="doi">10.1104/pp.110.159558</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J. H.</given-names></name> <name><surname>Sun</surname> <given-names>H.</given-names></name> <name><surname>Wen</surname> <given-names>J.</given-names></name> <name><surname>Yang</surname> <given-names>Y. P.</given-names></name></person-group> (<year>2010</year>). <article-title>Molecular phylogeny of <italic>Salix</italic> L. (Salicaceae) inferred from three chloroplast datasets and its systematic implications.</article-title> <source><italic>Taxon</italic></source> <volume>59</volume> <fpage>29</fpage>&#x2013;<lpage>37</lpage>.</citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J. H.</given-names></name> <name><surname>Sun</surname> <given-names>H.</given-names></name> <name><surname>Yang</surname> <given-names>Y. P.</given-names></name></person-group> (<year>2008</year>). <article-title>Comparative morphology of leaf epidermis of <italic>Salix</italic> (Salicaceae) with special emphasis on sections Lindleyanae and Retusae.</article-title> <source><italic>Bot. J. Linn. Soc.</italic></source> <volume>157</volume> <fpage>311</fpage>&#x2013;<lpage>322</lpage>. <pub-id pub-id-type="doi">10.1111/j.1095-8339.2008.00809.x</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>X. M.</given-names></name> <name><surname>Kindle</surname> <given-names>K.</given-names></name> <name><surname>Stern</surname> <given-names>D.</given-names></name></person-group> (<year>1993</year>). <article-title>Initiation codon mutations in the chlamydomonas chloroplast petD gene result in temperature-sensitive photosynthetic growth.</article-title> <source><italic>EMBO J.</italic></source> <volume>12</volume> <fpage>3627</fpage>&#x2013;<lpage>3635</lpage>.</citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chumley</surname> <given-names>T. W.</given-names></name> <name><surname>Palmer</surname> <given-names>J. D.</given-names></name> <name><surname>Mower</surname> <given-names>J. P.</given-names></name> <name><surname>Fourcade</surname> <given-names>H. M.</given-names></name> <name><surname>Calie</surname> <given-names>P. J.</given-names></name> <name><surname>Boore</surname> <given-names>J. L.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>The complete chloroplast genome sequence of <italic>Pelargonium x hortorum</italic>: organization and evolution of the largest and most highly rearranged chloroplast genome of land plants.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>23</volume> <fpage>2175</fpage>&#x2013;<lpage>2190</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msl089</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cruz</surname> <given-names>J. A.</given-names></name> <name><surname>Radkowski</surname> <given-names>C. A.</given-names></name> <name><surname>McCarty</surname> <given-names>R. E.</given-names></name></person-group> (<year>1997</year>). <article-title>Functional consequences of deletions of the N terminus of the epsilon subunit of the chloroplast ATP synthase.</article-title> <source><italic>Plant Physiol.</italic></source> <volume>113</volume> <fpage>1185</fpage>&#x2013;<lpage>1192</lpage>. <pub-id pub-id-type="doi">10.1104/pp.113.4.1185</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cullis</surname> <given-names>C. A.</given-names></name> <name><surname>Vorster</surname> <given-names>B. J.</given-names></name> <name><surname>Van Der Vyver</surname> <given-names>C.</given-names></name> <name><surname>Kunert</surname> <given-names>K. J.</given-names></name></person-group> (<year>2009</year>). <article-title>Transfer of genetic material between the chloroplast and nucleus: how is it related to stress in plants?</article-title> <source><italic>Ann. Bot.</italic></source> <volume>103</volume> <fpage>625</fpage>&#x2013;<lpage>633</lpage>. <pub-id pub-id-type="doi">10.1093/aob/mcn173</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davis</surname> <given-names>C. C.</given-names></name> <name><surname>Webb</surname> <given-names>C. O.</given-names></name> <name><surname>Wurdack</surname> <given-names>K. J.</given-names></name> <name><surname>Jaramillo</surname> <given-names>C. A.</given-names></name> <name><surname>Donoghue</surname> <given-names>M. J.</given-names></name></person-group> (<year>2005</year>). <article-title>Explosive radiation of Malpighiales supports a mid-Cretaceous origin of modern tropical rain forests.</article-title> <source><italic>Am. Nat.</italic></source> <volume>165</volume> <fpage>E36</fpage>&#x2013;<lpage>E65</lpage>. <pub-id pub-id-type="doi">10.1086/428296</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Drescher</surname> <given-names>A.</given-names></name> <name><surname>Ruf</surname> <given-names>S.</given-names></name> <name><surname>Calsa</surname> <given-names>T.</given-names></name> <name><surname>Carrer</surname> <given-names>H.</given-names></name> <name><surname>Bock</surname> <given-names>R.</given-names></name></person-group> (<year>2000</year>). <article-title>The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes.</article-title> <source><italic>Plant J.</italic></source> <volume>22</volume> <fpage>97</fpage>&#x2013;<lpage>104</lpage>. <pub-id pub-id-type="doi">10.1046/j.1365-313x.2000.00722.x</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dyall</surname> <given-names>S. D.</given-names></name> <name><surname>Brown</surname> <given-names>M. T.</given-names></name> <name><surname>Johnson</surname> <given-names>P. J.</given-names></name></person-group> (<year>2004</year>). <article-title>Ancient invasions: from endosymbionts to organelles.</article-title> <source><italic>Science</italic></source> <volume>304</volume> <fpage>253</fpage>&#x2013;<lpage>257</lpage>. <pub-id pub-id-type="doi">10.1126/science.1094884</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eckardt</surname> <given-names>N. A.</given-names></name></person-group> (<year>2006</year>). <article-title>Genomic hopscotch: gene transfer from plastid to nucleus.</article-title> <source><italic>Plant Cell</italic></source> <volume>18</volume> <fpage>2865</fpage>&#x2013;<lpage>2867</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.106.049031</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eckenwalder</surname> <given-names>J. E.</given-names></name></person-group> (<year>1996</year>). &#x201C;<article-title>Systematics and evolution of <italic>Populus</italic></article-title>,&#x201D; in <source><italic>Biology of Populus and its Implications for Management and Conservation</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Stettler</surname> <given-names>R. F.</given-names></name> <name><surname>Bradshaw</surname> <given-names>H. D.</given-names></name> <name><surname>Heilman</surname> <given-names>P. E.</given-names></name> <name><surname>Hinckley</surname> <given-names>T. M.</given-names></name></person-group> (<publisher-loc>Ottawa, ON</publisher-loc>: <publisher-name>Canadian Government Publishing</publisher-name>), <fpage>7</fpage>&#x2013;<lpage>32</lpage>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fang</surname> <given-names>Z.</given-names></name> <name><surname>Zhao</surname> <given-names>S.</given-names></name> <name><surname>Skvortsov</surname> <given-names>A.</given-names></name></person-group> (<year>1999</year>). &#x201C;<article-title>Salicaceae</article-title>,&#x201D; in <source><italic>Flora of China</italic></source> <volume>Vol. 4</volume> <role>eds</role> <person-group person-group-type="editor"><name><surname>Wu</surname> <given-names>Z. Y.</given-names></name> <name><surname>Raven</surname> <given-names>P. H.</given-names></name> <name><surname>Hong</surname> <given-names>D. Y.</given-names></name></person-group> (<publisher-loc>Beijing</publisher-loc>: <publisher-name>Science Press &#x0026; Missouri Botanical Garden Press</publisher-name>), <fpage>139</fpage>&#x2013;<lpage>274</lpage>.</citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fargo</surname> <given-names>D. C.</given-names></name> <name><surname>Boynton</surname> <given-names>J. E.</given-names></name> <name><surname>Gillham</surname> <given-names>N. W.</given-names></name></person-group> (<year>2001</year>). <article-title>Chloroplast ribosomal protein S7 of Chlamydomonas binds to chloroplast mRNA leader sequences and may be involved in translation initiation.</article-title> <source><italic>Plant Cell</italic></source> <volume>13</volume> <fpage>207</fpage>&#x2013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.13.1.207</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hager</surname> <given-names>M.</given-names></name> <name><surname>Hermann</surname> <given-names>M.</given-names></name> <name><surname>Biehler</surname> <given-names>A.</given-names></name> <name><surname>Krieger-Liszkay</surname> <given-names>A.</given-names></name> <name><surname>Bock</surname> <given-names>R.</given-names></name></person-group> (<year>2002</year>). <article-title>Lack of the small plastid-encoded PsbJ polypeptide results in a defective water-splitting apparatus of photosystem II, reduced photosystem I levels, and hypersensitivity to light.</article-title> <source><italic>J. Biol. Chem.</italic></source> <volume>277</volume> <fpage>14031</fpage>&#x2013;<lpage>14039</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M112053200</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hamzeh</surname> <given-names>M.</given-names></name> <name><surname>Dayanandan</surname> <given-names>S.</given-names></name></person-group> (<year>2004</year>). <article-title>Phylogeny of <italic>Populus</italic> (Salicaceae) based on nucleotide sequences of chloroplast trnT-trnF region and nuclear rDNA.</article-title> <source><italic>Am. J. Bot.</italic></source> <volume>91</volume> <fpage>1398</fpage>&#x2013;<lpage>1408</lpage>. <pub-id pub-id-type="doi">10.3732/ajb.91.9.1398</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hardig</surname> <given-names>T.</given-names></name> <name><surname>Anttila</surname> <given-names>C.</given-names></name> <name><surname>Brunsfeld</surname> <given-names>S.</given-names></name></person-group> (<year>2010</year>). <article-title>A phylogenetic analysis of <italic>Salix</italic> (Salicaceae) based on matK and ribosomal DNA sequence data.</article-title> <source><italic>J. Bot.</italic></source> <volume>2010</volume>:<issue>197696</issue>. <pub-id pub-id-type="doi">10.1155/2010/197696</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harris</surname> <given-names>R.</given-names></name></person-group> (<year>2007</year>). <source><italic>Improved Pairwise Alignment of Genomic DNA.</italic></source> <publisher-name>Ph.D. thesis, Pennsylvania State University, State College, PA</publisher-name>.</citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hope</surname> <given-names>A. C. A.</given-names></name></person-group> (<year>1968</year>). <article-title>A simplified Monte Carlo significance test procedure.</article-title> <source><italic>J. R. Stat. Soc. Ser. B</italic></source> <volume>30</volume> <fpage>582</fpage>&#x2013;<lpage>598</lpage>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huson</surname> <given-names>D. H.</given-names></name> <name><surname>Scornavacca</surname> <given-names>C.</given-names></name></person-group> (<year>2012</year>). <article-title>Dendroscope 3: an interactive tool for rooted phylogenetic trees and networks.</article-title> <source><italic>Syst. Biol.</italic></source> <volume>61</volume> <fpage>1061</fpage>&#x2013;<lpage>1067</lpage>. <pub-id pub-id-type="doi">10.1093/sysbio/sys062</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jansen</surname> <given-names>R. K.</given-names></name> <name><surname>Cai</surname> <given-names>Z.</given-names></name> <name><surname>Raubeson</surname> <given-names>L. A.</given-names></name> <name><surname>Daniell</surname> <given-names>H.</given-names></name> <name><surname>dePamphilis</surname> <given-names>C. W.</given-names></name> <name><surname>Leebens-Mack</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2007</year>). <article-title>Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>104</volume> <fpage>19369</fpage>&#x2013;<lpage>19374</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0709121104</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jansen</surname> <given-names>R. K.</given-names></name> <name><surname>Raubeson</surname> <given-names>L. A.</given-names></name> <name><surname>Boore</surname> <given-names>J. L.</given-names></name> <name><surname>dePamphilis</surname> <given-names>C. W.</given-names></name> <name><surname>Chumley</surname> <given-names>T. W.</given-names></name> <name><surname>Haberle</surname> <given-names>R. C.</given-names></name><etal/></person-group> (<year>2005</year>). <article-title>Methods for obtaining and analyzing whole chloroplast genome sequences.</article-title> <source><italic>Methods Enzymol.</italic></source> <volume>395</volume> <fpage>348</fpage>&#x2013;<lpage>384</lpage>. <pub-id pub-id-type="doi">10.1016/S0076-6879(05)95020-9</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Karp</surname> <given-names>A.</given-names></name> <name><surname>Shield</surname> <given-names>I.</given-names></name></person-group> (<year>2008</year>). <article-title>Bioenergy from plants and the sustainable yield challenge.</article-title> <source><italic>New Phytol.</italic></source> <volume>179</volume> <fpage>15</fpage>&#x2013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-8137.2008.02432.x</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kearse</surname> <given-names>M.</given-names></name> <name><surname>Moir</surname> <given-names>R.</given-names></name> <name><surname>Wilson</surname> <given-names>A.</given-names></name> <name><surname>Stones-Havas</surname> <given-names>S.</given-names></name> <name><surname>Cheung</surname> <given-names>M.</given-names></name> <name><surname>Sturrock</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.</article-title> <source><italic>Bioinformatics</italic></source> <volume>28</volume> <fpage>1647</fpage>&#x2013;<lpage>1649</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts199</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kent</surname> <given-names>W. J.</given-names></name> <name><surname>Baertsch</surname> <given-names>R.</given-names></name> <name><surname>Hinrichs</surname> <given-names>A.</given-names></name> <name><surname>Miller</surname> <given-names>W.</given-names></name> <name><surname>Haussler</surname> <given-names>D.</given-names></name></person-group> (<year>2003</year>). <article-title>Evolution&#x2019;s cauldron: duplication, deletion, and rearrangement in the mouse and human genomes.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>100</volume> <fpage>11484</fpage>&#x2013;<lpage>11489</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1932072100</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lane</surname> <given-names>N.</given-names></name></person-group> (<year>2011</year>). <article-title>Plastids, genomes, and the probability of gene transfer.</article-title> <source><italic>Genome Biol. Evol.</italic></source> <volume>3</volume> <fpage>372</fpage>&#x2013;<lpage>374</lpage>. <pub-id pub-id-type="doi">10.1093/gbe/evr003</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leskinen</surname> <given-names>E.</given-names></name> <name><surname>Alstrom-Rapaport</surname> <given-names>C.</given-names></name></person-group> (<year>1999</year>). <article-title>Molecular phylogeny of Salicaceae and closely related Flacourtiaceae: evidence from 5.8 S, ITS 1 and ITS 2 of the rDNA.</article-title> <source><italic>Plant Syst. Evol.</italic></source> <volume>215</volume> <fpage>209</fpage>&#x2013;<lpage>227</lpage>. <pub-id pub-id-type="doi">10.1007/BF00984656</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname> <given-names>R.</given-names></name> <name><surname>Liu</surname> <given-names>B.</given-names></name> <name><surname>Xie</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name> <name><surname>Huang</surname> <given-names>W.</given-names></name> <name><surname>Yuan</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.</article-title> <source><italic>Gigascience</italic></source> <volume>1</volume>:<issue>18</issue>. <pub-id pub-id-type="doi">10.1186/2047-217X-1-18</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ma</surname> <given-names>P. F.</given-names></name> <name><surname>Zhang</surname> <given-names>Y. X.</given-names></name> <name><surname>Zeng</surname> <given-names>C. X.</given-names></name> <name><surname>Guo</surname> <given-names>Z. H.</given-names></name> <name><surname>Li</surname> <given-names>D. Z.</given-names></name></person-group> (<year>2014</year>). <article-title>Chloroplast phylogenomic analyses resolve deep-level relationships of an intractable bamboo tribe Arundinarieae (Poaceae).</article-title> <source><italic>Syst. Biol.</italic></source> <volume>63</volume> <fpage>933</fpage>&#x2013;<lpage>950</lpage>. <pub-id pub-id-type="doi">10.1093/sysbio/syu054</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Matsuo</surname> <given-names>M.</given-names></name> <name><surname>Ito</surname> <given-names>Y.</given-names></name> <name><surname>Yamauchi</surname> <given-names>R.</given-names></name> <name><surname>Obokata</surname> <given-names>J.</given-names></name></person-group> (<year>2005</year>). <article-title>The rice nuclear genome continuously integrates, shuffles, and eliminates the chloroplast genome to cause chloroplast-nuclear DNA flux.</article-title> <source><italic>Plant Cell</italic></source> <volume>17</volume> <fpage>665</fpage>&#x2013;<lpage>675</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.104.027706</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Michalovova</surname> <given-names>M.</given-names></name> <name><surname>Vyskot</surname> <given-names>B.</given-names></name> <name><surname>Kejnovsky</surname> <given-names>E.</given-names></name></person-group> (<year>2013</year>). <article-title>Analysis of plastid and mitochondrial DNA insertions in the nucleus (NUPTs and NUMTs) of six plant species: size, relative age and chromosomal localization.</article-title> <source><italic>Heredity</italic></source> <volume>111</volume> <fpage>314</fpage>&#x2013;<lpage>320</lpage>. <pub-id pub-id-type="doi">10.1038/hdy.2013.51</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Millen</surname> <given-names>R. S.</given-names></name> <name><surname>Olmstead</surname> <given-names>R. G.</given-names></name> <name><surname>Adams</surname> <given-names>K. L.</given-names></name> <name><surname>Palmer</surname> <given-names>J. D.</given-names></name> <name><surname>Lao</surname> <given-names>N. T.</given-names></name> <name><surname>Heggie</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2001</year>). <article-title>Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus.</article-title> <source><italic>Plant Cell</italic></source> <volume>13</volume> <fpage>645</fpage>&#x2013;<lpage>658</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.13.3.645</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moore</surname> <given-names>M. J.</given-names></name> <name><surname>Dhingra</surname> <given-names>A.</given-names></name> <name><surname>Soltis</surname> <given-names>P. S.</given-names></name> <name><surname>Shaw</surname> <given-names>R.</given-names></name> <name><surname>Farmerie</surname> <given-names>W. G.</given-names></name> <name><surname>Folta</surname> <given-names>K. M.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>Rapid and accurate pyrosequencing of angiosperm plastid genomes.</article-title> <source><italic>BMC Plant Biol.</italic></source> <volume>6</volume>:<issue>17</issue>. <pub-id pub-id-type="doi">10.1186/1471-2229-6-17</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moore</surname> <given-names>M. J.</given-names></name> <name><surname>Soltis</surname> <given-names>P. S.</given-names></name> <name><surname>Bell</surname> <given-names>C. D.</given-names></name> <name><surname>Burleigh</surname> <given-names>J. G.</given-names></name> <name><surname>Soltis</surname> <given-names>D. E.</given-names></name></person-group> (<year>2010</year>). <article-title>Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>107</volume> <fpage>4623</fpage>&#x2013;<lpage>4628</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0907801107</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ohashi</surname> <given-names>H.</given-names></name></person-group> (<year>2001</year>). <article-title>Salicaceae of Japan.</article-title> <source><italic>Sci. Rep. Tohoku Univ.</italic></source> <volume>40</volume> <fpage>269</fpage>&#x2013;<lpage>396</lpage>.</citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rochaix</surname> <given-names>J. D.</given-names></name> <name><surname>Kuchka</surname> <given-names>M.</given-names></name> <name><surname>Mayfield</surname> <given-names>S.</given-names></name> <name><surname>Schirmer-Rahire</surname> <given-names>M.</given-names></name> <name><surname>Girard-Bascou</surname> <given-names>J.</given-names></name> <name><surname>Bennoun</surname> <given-names>P.</given-names></name></person-group> (<year>1989</year>). <article-title>Nuclear and chloroplast mutations affect the synthesis or stability of the chloroplast psbC gene product in <italic>Chlamydomonas reinhardtii</italic>.</article-title> <source><italic>EMBO J.</italic></source> <volume>8</volume> <fpage>1013</fpage>&#x2013;<lpage>1021</lpage>.</citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ronquist</surname> <given-names>F.</given-names></name> <name><surname>Teslenko</surname> <given-names>M.</given-names></name> <name><surname>van der Mark</surname> <given-names>P.</given-names></name> <name><surname>Ayres</surname> <given-names>D. L.</given-names></name> <name><surname>Darling</surname> <given-names>A.</given-names></name> <name><surname>Hohna</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space.</article-title> <source><italic>Syst. Biol.</italic></source> <volume>61</volume> <fpage>539</fpage>&#x2013;<lpage>542</lpage>. <pub-id pub-id-type="doi">10.1093/sysbio/sys029</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sakamoto</surname> <given-names>W.</given-names></name> <name><surname>Chen</surname> <given-names>X. M.</given-names></name> <name><surname>Kindle</surname> <given-names>K. L.</given-names></name> <name><surname>Stern</surname> <given-names>D. B.</given-names></name></person-group> (<year>1994</year>). <article-title>Function of the <italic>Chlamydomonas reinhardtii</italic> petD 5&#x2019; untranslated region in regulating the accumulation of subunit IV of the cytochrome b<sub>6</sub>/f complex.</article-title> <source><italic>Plant J.</italic></source> <volume>6</volume> <fpage>503</fpage>&#x2013;<lpage>512</lpage>. <pub-id pub-id-type="doi">10.1046/j.1365-313X.1994.6040503.x</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schattner</surname> <given-names>P.</given-names></name> <name><surname>Brooks</surname> <given-names>A. N.</given-names></name> <name><surname>Lowe</surname> <given-names>T. M.</given-names></name></person-group> (<year>2005</year>). <article-title>The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>33</volume> <fpage>W686</fpage>&#x2013;<lpage>W689</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gki366</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwartz</surname> <given-names>S.</given-names></name> <name><surname>Kent</surname> <given-names>W. J.</given-names></name> <name><surname>Smit</surname> <given-names>A.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Baertsch</surname> <given-names>R.</given-names></name> <name><surname>Hardison</surname> <given-names>R. C.</given-names></name><etal/></person-group> (<year>2003</year>). <article-title>Human-mouse alignments with BLASTZ.</article-title> <source><italic>Genome Res.</italic></source> <volume>13</volume> <fpage>103</fpage>&#x2013;<lpage>107</lpage>. <pub-id pub-id-type="doi">10.1101/gr.809403</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shendure</surname> <given-names>J.</given-names></name> <name><surname>Ji</surname> <given-names>H. L.</given-names></name></person-group> (<year>2008</year>). <article-title>Next-generation DNA sequencing.</article-title> <source><italic>Nat. Biotechnol.</italic></source> <volume>26</volume> <fpage>1135</fpage>&#x2013;<lpage>1145</lpage>. <pub-id pub-id-type="doi">10.1038/nbt1486</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Skvortsov</surname> <given-names>A. K.</given-names></name></person-group> (<year>1999</year>). <source><italic>Willows of Russia and Adjacent Countries.</italic></source> <publisher-loc>Joensuu</publisher-loc>: <publisher-name>University of Joensuu</publisher-name>.</citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stamatakis</surname> <given-names>A.</given-names></name></person-group> (<year>2006</year>). <article-title>RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.</article-title> <source><italic>Bioinformatics</italic></source> <volume>22</volume> <fpage>2688</fpage>&#x2013;<lpage>2690</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btl446</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stegemann</surname> <given-names>S.</given-names></name> <name><surname>Bock</surname> <given-names>R.</given-names></name></person-group> (<year>2006</year>). <article-title>Experimental reconstruction of functional gene transfer from the tobacco plastid genome to the nucleus.</article-title> <source><italic>Plant Cell</italic></source> <volume>18</volume> <fpage>2869</fpage>&#x2013;<lpage>2878</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.106.046466</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sturm</surname> <given-names>N. R.</given-names></name> <name><surname>Kuras</surname> <given-names>R.</given-names></name> <name><surname>Buschlen</surname> <given-names>S.</given-names></name> <name><surname>Sakamoto</surname> <given-names>W.</given-names></name> <name><surname>Kindle</surname> <given-names>K. L.</given-names></name> <name><surname>Stern</surname> <given-names>D. B.</given-names></name><etal/></person-group> (<year>1994</year>). <article-title>The petD gene is transcribed by functionally redundant promoters in <italic>Chlamydomonas reinhardtii</italic> chloroplasts.</article-title> <source><italic>Mol. Cell. Biol.</italic></source> <volume>14</volume> <fpage>6171</fpage>&#x2013;<lpage>6179</lpage>. <pub-id pub-id-type="doi">10.1128/MCB.14.9.6171</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Timmis</surname> <given-names>J. N.</given-names></name> <name><surname>Ayliffe</surname> <given-names>M. A.</given-names></name> <name><surname>Huang</surname> <given-names>C. Y.</given-names></name> <name><surname>Martin</surname> <given-names>W.</given-names></name></person-group> (<year>2004</year>). <article-title>Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes.</article-title> <source><italic>Nat. Rev. Genet.</italic></source> <volume>5</volume> <fpage>123</fpage>&#x2013;<lpage>135</lpage>. <pub-id pub-id-type="doi">10.1038/nrg1271</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tuskan</surname> <given-names>G. A.</given-names></name> <name><surname>DiFazio</surname> <given-names>S.</given-names></name> <name><surname>Jansson</surname> <given-names>S.</given-names></name> <name><surname>Bohlmann</surname> <given-names>J.</given-names></name> <name><surname>Grigoriev</surname> <given-names>I.</given-names></name> <name><surname>Hellsten</surname> <given-names>U.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>The genome of black cottonwood, <italic>Populus trichocarpa</italic> (Torr. &#x0026; Gray).</article-title> <source><italic>Science</italic></source> <volume>313</volume> <fpage>1596</fpage>&#x2013;<lpage>1604</lpage>. <pub-id pub-id-type="doi">10.1126/science.1128691</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ueda</surname> <given-names>M.</given-names></name> <name><surname>Fujimoto</surname> <given-names>M.</given-names></name> <name><surname>Arimura</surname> <given-names>S.</given-names></name> <name><surname>Murata</surname> <given-names>J.</given-names></name> <name><surname>Tsutsumi</surname> <given-names>N.</given-names></name> <name><surname>Kadowaki</surname> <given-names>K.</given-names></name></person-group> (<year>2007</year>). <article-title>Loss of the rpl32 gene from the chloroplast genome and subsequent acquisition of a preexisting transit peptide within the nuclear gene in <italic>Populus</italic>.</article-title> <source><italic>Gene</italic></source> <volume>402</volume> <fpage>51</fpage>&#x2013;<lpage>56</lpage>. <pub-id pub-id-type="doi">10.1016/j.gene.2007.07.019</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagele</surname> <given-names>J. W.</given-names></name> <name><surname>Mayer</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>Visualizing differences in phylogenetic information content of alignments and distinction of three classes of long-branch effects.</article-title> <source><italic>BMC Evol. Biol.</italic></source> <volume>7</volume>:<issue>147</issue>. <pub-id pub-id-type="doi">10.1186/1471-2148-7-147</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Z. S.</given-names></name> <name><surname>Du</surname> <given-names>S. H.</given-names></name> <name><surname>Dayanandan</surname> <given-names>S.</given-names></name> <name><surname>Wang</surname> <given-names>D. S.</given-names></name> <name><surname>Zeng</surname> <given-names>Y. F.</given-names></name> <name><surname>Zhang</surname> <given-names>J. G.</given-names></name></person-group> (<year>2014</year>). <article-title>Phylogeny reconstruction and hybrid analysis of <italic>Populus</italic> (Salicaceae) based on nucleotide sequences of multiple single-copy nuclear genes and plastid fragments.</article-title> <source><italic>PLoS ONE</italic></source> <volume>9</volume>:<issue>e103645</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0103645</pub-id></citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>J.</given-names></name> <name><surname>Nyman</surname> <given-names>T.</given-names></name> <name><surname>Wang</surname> <given-names>D.-C.</given-names></name> <name><surname>Argus</surname> <given-names>G. W.</given-names></name> <name><surname>Yang</surname> <given-names>Y.-P.</given-names></name> <name><surname>Chen</surname> <given-names>J.-H.</given-names></name></person-group> (<year>2015</year>). <article-title>Phylogeny of <italic>Salix</italic> subgenus <italic>Salix</italic> s.l. (Salicaceae): delimitation, biogeography, and reticulate evolution.</article-title> <source><italic>BMC Evol. Biol.</italic></source> <volume>15</volume>:<issue>31</issue>. <pub-id pub-id-type="doi">10.1186/s12862-015-0311-7</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wurdack</surname> <given-names>K. J.</given-names></name> <name><surname>Davis</surname> <given-names>C. C.</given-names></name></person-group> (<year>2009</year>). <article-title>Malpighiales phylogenetics: gaining ground on one of the most recalcitrant clades in the angiosperm tree of life.</article-title> <source><italic>Am. J. Bot.</italic></source> <volume>96</volume> <fpage>1551</fpage>&#x2013;<lpage>1570</lpage>. <pub-id pub-id-type="doi">10.3732/ajb.0800207</pub-id></citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wyman</surname> <given-names>S. K.</given-names></name> <name><surname>Jansen</surname> <given-names>R. K.</given-names></name> <name><surname>Boore</surname> <given-names>J. L.</given-names></name></person-group> (<year>2004</year>). <article-title>Automatic annotation of organellar genomes with DOGMA.</article-title> <source><italic>Bioinformatics</italic></source> <volume>20</volume> <fpage>3252</fpage>&#x2013;<lpage>3255</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bth352</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xie</surname> <given-names>Z. Y.</given-names></name> <name><surname>Culler</surname> <given-names>D.</given-names></name> <name><surname>Dreyfuss</surname> <given-names>B. W.</given-names></name> <name><surname>Kuras</surname> <given-names>R.</given-names></name> <name><surname>Wollman</surname> <given-names>F. A.</given-names></name> <name><surname>Girard-Bascou</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>1998</year>). <article-title>Genetic analysis of chloroplast c-type cytochrome assembly in <italic>Chlamydomonas reinhardtii</italic>: one chloroplast locus and at least four nuclear loci are required for heme attachment.</article-title> <source><italic>Genetics</italic></source> <volume>148</volume> <fpage>681</fpage>&#x2013;<lpage>692</lpage>.</citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>X. W.</given-names></name> <name><surname>Liu</surname> <given-names>G. M.</given-names></name> <name><surname>Yin</surname> <given-names>Y. X.</given-names></name> <name><surname>Chen</surname> <given-names>K. F.</given-names></name> <name><surname>Yun</surname> <given-names>Q. Z.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>The complete chloroplast genome sequence of date palm (<italic>Phoenix dactylifera</italic> L.).</article-title> <source><italic>PLoS ONE</italic></source> <volume>5</volume>:<issue>e12762</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0012762</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Z. H.</given-names></name></person-group> (<year>1998</year>). <article-title>Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>15</volume> <fpage>568</fpage>&#x2013;<lpage>573</lpage>. <pub-id pub-id-type="doi">10.1093/oxfordjournals.molbev.a025957</pub-id></citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Z. H.</given-names></name></person-group> (<year>2007</year>). <article-title>PAML 4: phylogenetic analysis by maximum likelihood.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>24</volume> <fpage>1586</fpage>&#x2013;<lpage>1591</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msm088</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yoshida</surname> <given-names>T.</given-names></name> <name><surname>Furihata</surname> <given-names>H. Y.</given-names></name> <name><surname>Kawabe</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>Patterns of genomic integration of nuclear chloroplast DNA fragments in plant species.</article-title> <source><italic>DNA Res.</italic></source> <volume>21</volume> <fpage>127</fpage>&#x2013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1093/dnares/dst045</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yuan</surname> <given-names>Q.</given-names></name> <name><surname>Hill</surname> <given-names>J.</given-names></name> <name><surname>Hsiao</surname> <given-names>J.</given-names></name> <name><surname>Moffat</surname> <given-names>K.</given-names></name> <name><surname>Ouyang</surname> <given-names>S.</given-names></name> <name><surname>Cheng</surname> <given-names>Z.</given-names></name><etal/></person-group> (<year>2002</year>). <article-title>Genome sequencing of a 239-kb region of rice chromosome 10L reveals a high frequency of gene duplication and a large chloroplast DNA insertion.</article-title> <source><italic>Mol. Genet. Genomics</italic></source> <volume>267</volume> <fpage>713</fpage>&#x2013;<lpage>720</lpage>. <pub-id pub-id-type="doi">10.1007/s00438-002-0706-1</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y. J.</given-names></name> <name><surname>Ma</surname> <given-names>P. F.</given-names></name> <name><surname>Li</surname> <given-names>D. Z.</given-names></name></person-group> (<year>2011</year>). <article-title>High-throughput sequencing of six bamboo chloroplast genomes: phylogenetic implications for temperate woody bamboos (Poaceae: Bambusoideae).</article-title> <source><italic>PLoS ONE</italic></source> <volume>6</volume>:<issue>e20596</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0020596</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn01"><label>1</label><p><ext-link ext-link-type="uri" xlink:href="http://blast.ncbi.nlm.nih.gov/">http://blast.ncbi.nlm.nih.gov/</ext-link></p></fn>
<fn id="fn02"><label>2</label><p><ext-link ext-link-type="uri" xlink:href="https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Ptrichocarpa">https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Ptrichocarpa</ext-link></p></fn>
<fn id="fn03"><label>3</label><p><ext-link ext-link-type="uri" xlink:href="https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Spurpurea">https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Spurpurea</ext-link></p></fn>
</fn-group>
</back>
</article>