Three genome-based phylogeny of Cupressaceae s.l.: Further evidence for the evolution of gymnosperms and Southern Hemisphere biogeography

Zu-Yu Yang a,b, Jin-Hua Ran a, Xiao-Quan Wang a,*

a State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
b Graduate University of the Chinese Academy of Sciences, Beijing 100039, China
* Corresponding author. Address: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, 20 Nanxincun, Xiangshan, Beijing 100093, China. Fax: +86 10 62590843. E-mail address: [email protected] (X.-Q. Wang).

Molecular Phylogenetics and Evolution 64 (2012) 452–470
Contents lists available at SciVerse ScienceDirect
Journal homepage: www.elsevier.com/locate/ympev
1055-7903/$ - see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ympev.2012.05.004

Article history: Received 1 April 2011; Revised 1 May 2012; Accepted 2 May 2012; Available online 18 May 2012

Keywords: LEAFY (LFY); NEEDLY (NLY); Allopolyploid origin; Ancient hybridization; Gymnosperm; Gondwana

Abstract

Phylogenetic information is essential to interpret the evolution of species. While DNA sequences from different genomes have been widely utilized in phylogenetic reconstruction, it is still difficult to use nuclear genes to reconstruct phylogenies of plant groups with large genomes and complex gene families, such as gymnosperms. Here, we use two single-copy nuclear genes, together with chloroplast and mitochondrial genes, to reconstruct the phylogeny of the ecologically-important conifer family Cupressaceae s.l., based on a complete sampling of its 32 genera. The different gene trees generated are highly congruent in topology, supporting the basal position of Cunninghamia and the seven-subfamily classification, and the estimated divergence times based on different datasets correspond well with each other and with the oldest fossil record. These results imply that we have obtained the species phylogeny of Cupressaceae s.l. In addition, possible origins of all three polyploid conifers were investigated, and a hybrid origin was suggested for Cupressus, Fitzroya and Sequoia. Moreover, we found that the biogeographic history of Cupressaceae s.l. is associated with the separation between Laurasia and Gondwana and the further break-up of the latter. Our study also provides new evidence for the gymnosperm phylogeny.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

Reconstructing plant phylogenies using sequences from independent nuclear loci and different genomic compartments has been increasingly popular due to the growing awareness that relying on a single data set may result in insufficient phylogenetic resolution or misleading inferences (Maddison, 1997; Wendel and Doyle, 1998). Phylogenetic congruence among different genomic compartments could strongly suggest that the gene trees are also congruent with the species phylogeny (Wang et al., 2000). On the other hand, molecular dating has proved very efficient in estimating evolutionary divergence times of diverse taxa (e.g., Wang et al., 2000; Sanderson, 2002; Knapp et al., 2005; Barker et al., 2007; Sauquet et al., 2009), although there are still some controversies regarding mainly the appropriateness of the selected model, calibration procedure, effect of long branches, and degree of congruence between time estimates and the fossil record (e.g., Kumar, 2005; Magallón and Sanderson, 2005; Rutschmann et al., 2007; Inoue et al., 2010; Magallón, 2010). It would be more convincing if divergence time estimates are congruent among different genes and consistent with the fossil record.

In the past decade, low-copy nuclear genes have been widely utilized to improve the resolution and robustness of plant phylogenetic reconstruction at various taxonomic levels (e.g., Wang et al., 2000; Sang, 2002; Peng and Wang, 2008). However, this use is limited by the problems associated with the complex evolutionary dynamics of nuclear genes, such as gene paralogy, recombination, lineage sorting, and lateral gene transfer (Small et al., 2004). This limitation is particularly notable for gymnosperms due to the large nuclear genomes and complex gene families (Kinlaw and Neale, 1997; Murray, 1998; Leitch et al., 2001; Ahuja and Neale, 2005), as well as the unavailability of complete genome sequences so far. In contrast to other low-copy nuclear genes with a high rate of birth and death evolution, the use of sister genes from ancient gene duplication could minimize these potential problems when both copies exist in the studied taxa.

Cupressaceae s.l., including Cupressaceae s.s. and traditional Taxodiaceae without Sciadopitys, is an important component of forests, and comprises 32 genera and more than 130 species (Farjón, 2005; Adams et al., 2009; Debreczy et al., 2009). Among them, only four genera, i.e., Callitris, Cupressus, Hesperocyparis (a New World genus separated from Cupressus) (Adams et al., 2009) and Juniperus, have more than 10 species, and as many as 19 genera are monotypic. Cupressaceae s.s. was first separated from Taxodiaceae by Pilger



(1926), but afterwards the morphological, anatomical, embryological, immunological, and cladistic studies (Eckenwalder, 1976; Hart, 1987; Price and Lowenstein, 1989; Farjón, 2005; Schulz and Stützel, 2007) as well as molecular investigations (Brunsfeld et al., 1994; Tsumura et al., 1995; Chaw et al., 1997, 2000; Stefanović et al., 1998; Gadek et al., 2000; Kusumi et al., 2000; Quinn et al., 2002; Rydin et al., 2002; Schmidt and Schneider-Poetsch, 2002; Rai et al., 2008) consistently support a merger of the two families. For the infra-familial classification of Cupressaceae s.l., Gadek et al. (2000) divided the family into seven subfamilies based on morphological and cpDNA evidence, which include Cunninghamioideae, Taiwanioideae, Athrotaxidoideae, Sequoioideae, Taxodioideae, Callitroideae and Cupressoideae. However, Farjón (2005) did not recognize the subfamily Callitroideae that occurs in the Southern Hemisphere, and treated Cupressaceae s.s. as a subfamily (Cupressoideae) rather than two subfamilies.

All previous molecular phylogenies of Cupressaceae (s.l. or s.s.) were reconstructed based on chloroplast DNA (cpDNA) markers (Tsumura et al., 1995; Gadek and Quinn, 1993; Brunsfeld et al., 1994; Gadek et al., 2000; Kusumi et al., 2000), although 4–10 genera of the family were sampled in several other studies using nuclear genes (Chaw et al., 1997; Stefanović et al., 1998; Kusumi et al., 2002; Rydin et al., 2002). In addition, the published cpDNA phylogenies comprise only 12–22 genera of Cupressaceae s.l. (Gadek and Quinn, 1993; Brunsfeld et al., 1994; Tsumura et al., 1995; Kusumi et al., 2000; Quinn et al., 2002) except that 31 genera were sampled by Gadek et al. (2000), and the intergeneric relationships, especially within Cupressaceae s.s., were poorly resolved in the rbcL gene trees (Gadek and Quinn, 1993; Brunsfeld et al., 1994; Gadek et al., 2000). In the study of Gadek et al. (2000), five genera occurring in the Southern Hemisphere (Actinostrobus, Austrocedrus, Fitzroya, Pilgerodendron and Papuacedrus) and two genera in the Northern Hemisphere [Cupressus s.s. and Xanthocyparis, a new genus from northern Vietnam (Farjón et al., 2002)] were not included in the combined matK + rbcL gene analysis. In particular, after diagnosing the inadequacies of the matK + rbcL analysis, the authors favored the matK + non-molecular analysis and used it as the basis for their classification. Furthermore, the chloroplast genome behaves as a single locus (Doyle, 1992), and can only provide genetic information of one parent given its predominantly paternal inheritance in Cupressaceae s.l. (reviewed by Mogensen, 1996). Thus, analysis of multiple genes from different genomic compartments, especially the nuclear genome, and extensive sampling would be necessary to clarify the intergeneric relationships within Cupressaceae s.l., since hybridization has played a major role in the development of plant species diversity (Soltis and Soltis, 2009).

Compared to the abundance of polyploids in angiosperms, conifers are mainly diploids with scattered natural polyploids only occurring in Cupressaceae s.l., such as the tetraploid Fitzroya cupressoides (Hair, 1968), and the hexaploid Sequoia sempervirens (Stebbins, 1948). Cupressaceae s.l. is also the only coniferous family with a virtually worldwide distribution, being represented in all continents except Antarctica. According to the fossil record and molecular divergence time estimates, the present distribution of the traditional Taxodiaceae is generally interpreted as a relic from a much more widespread and common occurrence in the past, while the split between the two clades of Cupressaceae s.s. (Northern and Southern Hemispheres) could be dated back to the separation of Laurasia and Gondwana (Li, 1953; Miller, 1977; Li and Yang, 2002). In particular, many genera of the Southern Hemisphere clade of Cupressaceae s.s. are confined to a single continent. Therefore, a robust phylogenetic reconstruction of Cupressaceae s.l., especially the intergeneric relationships, may also shed light on the origin and evolution of the rare natural polyploids of conifers and provide further evidence for the break-up history of Gondwana.

The LEAFY (LFY) gene that encodes a transcription factor involved in regulating cell division and arrangement or floral meristem identity occurs in all land plants (Frohlich and Meyerowitz, 1997; Maizel et al., 2005; Tanahashi et al., 2005; Moyroud et al., 2010). Although this gene exists as a single-copy in most diploid angiosperms, its sister gene NEEDLY (NLY) that originated from a duplication event in the common ancestor of seed plants still remains in gymnosperms (Frohlich and Meyerowitz, 1997; Mouradov et al., 1998; Maizel et al., 2005; Vazquez-Lobo et al., 2007; Shiokawa et al., 2008). Thus, the duplicated sister genes LFY and NLY are very suitable to be used as nuclear gene markers for the phylogenetic reconstruction of Cupressaceae s.l.

Recently, the LFY gene, especially its second intron, has been widely used to reconstruct the phylogeny of many angiosperm groups (e.g., Oh and Potter, 2003; Grob et al., 2004; Kim et al., 2008), and several genera of gymnosperms such as Gnetum (Won and Renner, 2006), Thuja (Peng and Wang, 2008), and Pseudotsuga (Wei et al., 2010). Also, the NLY gene has been used to resolve the interspecific relationships of Cupressus (Little, 2006). Moreover, there is a rich fossil record of Cupressaceae s.l. (as summarized in Florin, 1963; Miller, 1977; Farjón, 2005), which is very helpful for estimating the divergence times of different lineages within the family.

In the present study, we use the two nuclear genes LFY and NLY, coupled with the chloroplast matK and mitochondrial rps3 genes, to reconstruct the phylogeny of Cupressaceae s.l. based on a complete sampling of its 32 genera. Then, we discuss the evolution of LFY and NLY in gymnosperms, and possible hybrid origins of Cupressus, Fitzroya and Sequoia. In addition, the biogeographical history of Cupressaceae s.l., in particular Cupressaceae s.s. in the Southern Hemisphere, is investigated with the help of molecular dating, the fossil record and geological evidence for the break-up of Gondwana.

2. Materials and methods

2.1. Taxon sampling

All the recognized 32 genera (totaling 45 species) of Cupressaceae s.l. (Farjón, 2005; Adams et al., 2009; Debreczy et al., 2009) were sampled. We used the names Callitropsis nootkatensis and Xanthocyparis vietnamensis following Debreczy et al. (2009). Taxus cuspidata var. nana and Cephalotaxus sinensis were chosen as outgroups to study the intergeneric relationships of Cupressaceae s.l. according to the sister relationship between Taxaceae–Cephalotaxaceae and Cupressaceae s.l. (Quinn et al., 2002). To investigate evolution of the LFY and NLY genes in gymnosperms, we also sampled the other families of conifers (except Phyllocladaceae), Zamia furfuracea, Ginkgo biloba and Gnetales. The LFY and NLY sequences of Zamia furfuracea, Ginkgo biloba, Gnetales, Pinus radiata, Picea abies, Podocarpus matudae var. reichei and Taxus globosa were downloaded from GenBank. All the above-mentioned samples were used in the DNA analysis. To preliminarily explore whether the LFY and NLY genes amplified from genomic DNA are functional, we further chose seven easily accessible species from three families (Araucariaceae, Cupressaceae s.l., Pinaceae) for RNA extraction and RT-PCR analysis. Moreover, to determine whether each of the LFY and NLY genes exists as a single locus, four species from different lineages of conifers that represent both diploid and polyploid species were selected for the Southern blot analysis. The origins of materials and voucher specimens are shown in Table S1.

2.2. DNA and RNA extraction, PCR and RT-PCR amplification, cloning and sequencing

Total DNA was extracted from silica gel dried leaves using either the modified CTAB method (Rogers and Bendich, 1985) or the

DNAsecure Plant Kit (Tiangen, Beijing, China). Young leaves of Araucaria heterophylla, Juniperus chinensis, Metasequoia glyptostroboides, Pinus armandii, Platycladus orientalis, Sequoia sempervirens and Thuja occidentalis (different individuals from those for DNA extraction except Pinus armandii) were collected for RNA extraction. Total RNA was isolated and the first-strand cDNA was produced following Ran et al. (2010a). For polymerase chain reaction, we used total DNA or cDNA as template. The LFY gene was amplified with the forward primer LFYE1F3 (Peng and Wang, 2008) or LFYE1F4 and reverse primer LFYE3R3 or LFYE3R4 (Peng and Wang, 2008), and the NLY gene with NLYE1F1 or NLYE1F2 (forward) and NLYE3R3 (reverse) (Fig. S1 and Table S2). To obtain the matK gene sequence of each species from the same individual as used in the other genes, also for the accuracy of sequences, we amplified this gene with primers matKF and matKR (Kusumi et al., 2000) rather than downloading sequences from GenBank. PCR amplification was carried out in a volume of 25 µl containing 100–200 ng DNA or 1–2 µl cDNA template for nuclear genes and 5–50 ng of DNA template for matK, 6.25 pmol of each primer, 0.2 mM of each dNTP, 2 mM MgCl2, and 0.75 U of ExTaq DNA polymerase (Takara Biotechnology Co., Ltd., Dalian, China). Amplification was conducted in a Tgradient thermal cycler (Biometra, Göttingen, Germany) or an Eppendorf Mastercycler (Eppendorf Scientific, Westbury, NY, USA). PCR cycles were as follows: one cycle of 4 min at 94 °C, four cycles of 1 min at 94 °C, 30 s at 55 °C (for matK and LFY) or 58 °C (for NLY), and 1.5–6.0 min at 72 °C, followed by 35 cycles of 30 s at 94 °C, 30 s at 53 °C (for matK and LFY) or 56 °C (for NLY), and 1.5–6.0 min at 72 °C, with a final extension step for 10 min at 72 °C. The amplification of the mitochondrial rps3 gene followed Ran et al. (2010a), using the same primer pair rps19-F + rpl16-R.
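The two-phase cycling scheme described above can be written down as a small data structure, which makes the gene-specific annealing temperatures and the total block time easy to check. This is a minimal sketch: the temperatures, hold times and cycle counts come from the text, while the tuple layout and the function names `build_program` and `program_minutes` are illustrative assumptions and not part of any thermal cycler software.

```python
# Sketch only: temperatures, hold times and cycle counts are taken from the
# text; the data layout and both function names are illustrative assumptions.

def build_program(gene, extension_min):
    """Return the cycling program as a list of (repeats, steps) stages.

    Each step is (label, temperature in deg C, hold time in minutes).
    """
    if gene not in ("matK", "LFY", "NLY"):
        raise ValueError(f"no cycling conditions given for {gene!r}")
    if not 1.5 <= extension_min <= 6.0:
        raise ValueError("extension time must be within 1.5-6.0 min")
    # NLY anneals 3 deg C higher than matK and LFY in both phases.
    first_anneal = 58 if gene == "NLY" else 55
    second_anneal = 56 if gene == "NLY" else 53
    return [
        (1, [("initial denaturation", 94, 4.0)]),
        (4, [("denaturation", 94, 1.0),
             ("annealing", first_anneal, 0.5),
             ("extension", 72, extension_min)]),
        (35, [("denaturation", 94, 0.5),
              ("annealing", second_anneal, 0.5),
              ("extension", 72, extension_min)]),
        (1, [("final extension", 72, 10.0)]),
    ]


def program_minutes(program):
    """Total hold time in minutes, ignoring temperature ramping."""
    return sum(repeats * sum(minutes for _, _, minutes in steps)
               for repeats, steps in program)


if __name__ == "__main__":
    for gene in ("matK", "LFY", "NLY"):
        print(gene, program_minutes(build_program(gene, 1.5)), "min")
```

The extension time is left as a parameter because the text gives only a range (1.5–6.0 min), presumably chosen according to amplicon length.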

PCR products were separated by 1.2% agarose gel electrophoresis and purified using the TIANgel Midi Purification Kit (Tiangen, Beijing, China). For the nuclear genes, the purified PCR products were first sequenced directly with the PCR primers and two internal primers to cover all exons. Then, we cloned all the PCR products with the pGEM-T Easy Vector System II (Promega, Madison, USA). Ten clones with the correct insertion, determined by digestion with EcoR I, were picked for each species and screened by sequencing with the primer T7. All distinct clones were further sequenced with the primer SP6 and internal primers (Table S2). For the cytoplasmic genes, the purified PCR products were directly sequenced. However, the matK gene of Juniperus chinensis, Libocedrus plumosa, Xanthocyparis vietnamensis, and Araucaria heterophylla has polymorphic sites in the sequencing chromatogram, so these samples were also cloned as described above.

Sequencing was performed using the DYEnamic ET Terminator Kit (Amersham Pharmacia Biotech) or the BigDye Terminator v3.1 Cycle Sequencing Kit. The sequencing products were separated on a MegaBACE 1000 automatic sequencer (Amersham Biosciences, Buckinghamshire, UK) or a 96-capillary 3730XL DNA analyzer (Applied Biosystems, Foster City, CA, USA). The sequences reported in this study are deposited in GenBank under accessions HQ245712–HQ245782 (NLY), HQ245783–HQ245805 and JN226115 (rps3), HQ245806–245873 (LFY), and HQ245874–HQ245921 (matK) (Table S1).

    2.3. Southern blot analysis

Southern blotting was used to detect the number of loci of the two nuclear genes LFY and NLY in the four conifer species Cunninghamia lanceolata, Sequoia sempervirens, Taxus cuspidata var. nana and Pinus armandii. Approximately 20–40 µg of genomic DNA was digested with restriction enzymes that do not have recognition sites in the probe sequence, separated on a 0.8% agarose gel, and transferred to a nylon Hybond N+ membrane using the VacuGene XL Vacuum Blotting System (Amersham Biosciences, Buckinghamshire, UK). For both genes, the probes were designed to cover exon 1 (for P. armandii) or exon 2 (for S. sempervirens and C. lanceolata) or exons 2–3 including intron 2 (for T. cuspidata var. nana), which were amplified from species-specific clones and labeled with alkaline phosphatase using the Gene Images AlkPhos Direct Labeling and Detection System (GE Healthcare, Buckinghamshire, UK). The membranes were hybridized and washed at 56–65 °C, following the protocol provided by the manufacturer. The hybridized signals were detected with the CDP-Star detection reagent (GE Healthcare, Buckinghamshire, UK), and visualized by a GeneGnome BioImaging System (Syngene, Cambridge, UK) or exposure to Kodak X-Omat BT Film at room temperature for 24 h.

    2.4. Sequence analysis

Sequence alignments were made with CLUSTAL X (Thompson et al., 1997) and refined manually. The variable sites and variability of conspecific clones of LFY and NLY were calculated by MEGA version 3 (Kumar et al., 2004) and BioEdit software (Hall, 1999), respectively. The unalignable regions in LFY and rps3 and introns of the two nuclear genes that cannot be reliably aligned among different families of gymnosperms, even within Cupressaceae s.l., were excluded manually from the analyses at or above the family level. The aligned sequences were tested for substitutional saturation by a maximum likelihood method (Griffiths, 1997).
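As a rough illustration of what counting variable sites in an alignment involves, the sketch below scans aligned sequences column by column. It is not the MEGA or BioEdit implementation; the function name `count_variable_sites` and the choice to treat gaps, `?` and `N` as missing data are assumptions made for this example.

```python
# Illustrative sketch, not the MEGA/BioEdit procedure: a column counts as
# variable when it holds at least two distinct non-missing states.

def count_variable_sites(alignment, missing="-?N"):
    """Count variable columns in a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    variable = 0
    for column in zip(*alignment):
        # Gaps and ambiguous bases are ignored (assumed to be missing data).
        states = {base.upper() for base in column if base.upper() not in missing}
        if len(states) > 1:
            variable += 1
    return variable


if __name__ == "__main__":
    toy_alignment = ["ACGT-", "ACGA-", "ATGAN"]  # made-up sequences
    print(count_variable_sites(toy_alignment))
```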

To assess congruence between the four gene data sets, LFY, NLY, matK and rps3, in Cupressaceae s.l., we used the incongruence length difference test (ILD) (Farris et al., 1994) as implemented in PAUP 4.0b10 (Swofford, 2002) in a pair-wise fashion. In the first round of tests, all sequences (45 species) were included, and when a species harbors two distinct LFY or NLY clones, the other genes were duplicated and randomly paired with them. To identify sequences and species responsible for incongruence between data partitions, the ILD test was further conducted after excluding sequences from all species of which clones do not form monophyletic groups. When the data sets were reduced to include 39 species, the result showed that all genes except rps3 are not significantly incongruent (Table 1). However, the 39-species data set does not include the two genera Xanthocyparis and Fitzroya. Then, to include all genera of Cupressaceae s.l. in the combined analysis, we added Xanthocyparis vietnamensis and Fitzroya cupressoides into the 39-species data set, and tried all different combinations of the four clones of the two species (Xanthocyparis-NLY-1, Xanthocyparis-NLY-11, Fitzroya-LFY-1, and Fitzroya-LFY-7) for the ILD test (41 SpeciesA: Xanthocyparis-NLY-1 + Fitzroya-LFY-7; 41 SpeciesB: Xanthocyparis-NLY-1 + Fitzroya-LFY-1; 41 SpeciesC: Xanthocyparis-NLY-11 + Fitzroya-LFY-7; 41 SpeciesD: Xanthocyparis-NLY-11 + Fitzroya-LFY-1). The ILD test on the four 41-species data sets did not find significant incongruence among the three genes LFY, NLY and matK (Table 1), and preliminary phylogenetic analyses found that only the relationships of Xanthocyparis, Hesperocyparis and Callitropsis are different when different NLY clones of Xanthocyparis were used (trees not shown, and see Section 4). Finally, we randomly chose the 41 SpeciesA data set for the combined analysis (LFY + NLY, and LFY + NLY + matK). Although the ILD test showed that rps3 is significantly incongruent with other genes, we also tried to combine the four genes (LFY + NLY + matK + rps3) for analysis.
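The four 41-species data sets are simply every pairing of one Xanthocyparis NLY clone with one Fitzroya LFY clone. A short sketch of that enumeration follows; the clone names and the A to D labels are from the text, whereas the function name `forty_one_species_sets` and the dictionary layout are assumptions for illustration.

```python
from itertools import product

# Clone names as given in the text; the list order is chosen so that the
# generated pairings come out in the A-D order used for the ILD tests.
XANTHOCYPARIS_NLY = ["Xanthocyparis-NLY-1", "Xanthocyparis-NLY-11"]
FITZROYA_LFY = ["Fitzroya-LFY-7", "Fitzroya-LFY-1"]


def forty_one_species_sets():
    """Map each 41-species data set label to the two clones added to the
    39-species data set (hypothetical helper, for illustration only)."""
    pairings = product(XANTHOCYPARIS_NLY, FITZROYA_LFY)
    return {f"41 Species{label}": pair for label, pair in zip("ABCD", pairings)}


if __name__ == "__main__":
    for name, (nly_clone, lfy_clone) in forty_one_species_sets().items():
        print(f"{name}: 39 species + {nly_clone} + {lfy_clone}")
```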

    2.5. Phylogenetic analysis

    Phylogenetic analyses were conducted on seven data matrices,including two of gymnosperms (LFY and NLY, Zamia furfuracea asoutgroup), and ve of Cupressaceae s.l., i.e., LFY + NLY, matK, rps3,

    s and Evolution 64 (2012) 452470LFY + NLY +matK, and LFY + NLY +matK + rps3. According to theML test (Grifths, 1997), none of the data sets was saturated withnucleotide substitutions, thus all phylogenetic analyses were based

  • rps3 vs. LFY 0.001 0.001 0.001rps3 vs. NLY 0.029 0.087 0.023

    rm a

    eticon the complete alignments except the exclusion of introns of thetwo nuclear genes and the unalignable regions in LFY and rps3.

    To construct phylogenetic trees, maximum parsimony (MP),maximum likelihood (ML) and Bayesian inference (BI) wereperformed in PAUP 4.0b10, PHYML version 2.4.4 (Guindon andGascuel, 2003) and MrBayes 3.1.2 (Ronquist and Huelsenbeck,2003), respectively. Indels in all datamatrices were treated asmiss-ing data. The MP analyses used heuristic searches with 1000 ran-dom addition sequence replicates, tree-bisection-reconnection(TBR) branch-swapping, andMulTrees option on, except that amax-imumof 2000 treeswere saved per round for the rps3 alignment. Allcharacter states were treated as unordered and equally weighted.The relative robustness of the clades found in theMP treeswas eval-uated by bootstrap analyses (Felsenstein, 1985) based on 1000 rep-licates, using the same options as above except that a maximum of2000 trees were saved per round for the LFY and NLY alignments,and 50% majority rule consensus was used. In ML and BI analyses,the evolutionary models were optimized in Modeltest 3.7 (Posadaand Crandall, 1998) and MrModeltest 2.2 (Nylander, 2004) usingAkaike Information Criterion (AIC), respectively. The best modelsand the model settings for analyses are shown in Table 2. For theML analysis in PHYML version 2.4.4, the GTR model was used formatK and rps3 due to the unavailability of the best-t model TVM.ML parameters were optimized with a BIONJ tree as a starting point(Gascuel, 1997), and support values for nodes on the ML tree wereestimated with 1000 bootstrap replicates. For the Bayesian infer-ence, two independent runs of Markov chain Monte Carlo (MCMC)were conducted simultaneously, with each including one cold and

    rps3 vs. matK 0.001 0.001 0.001

    45 Species: including all species and sequences.43 Species: excluding Xanthocyparis and Fitzroya.39 Species: excluding all non-monophyletic species (clones of a species do not fo41 SpeciesA: 39 species + Xanthocyparis-NLY-1 + Fitzroya-LFY-7.41 SpeciesB: 39 species + Xanthocyparis-NLY-1 + Fitzroya-LFY-1.41 SpeciesC: 39 Species + Xanthocyparis-NLY-11 + Fitzroya-LFY-7.41 SpeciesD: 39 species + Xanthocyparis-NLY-11 + Fitzroya -LFY-1.Table 1P values from the incongruence length difference (ILD) test.

    Datasets P values

    45 Species 43 Species 39 Species

    LFY vs. NLY 0.043 0.142 0.267LFY vs. matK 0.003 0.014 0.087NLY vs. matK 0.022 0.110 0.105

    Z.-Y. Yang et al. /Molecular Phylogenthree incrementally heated chains that started randomly in theparameters space (Temp = 0.20, Swapfreq = 1, Nswaps = 1) and ranfor 1000,000 generations. Every 100 generations were sampled,and the rst 30% of samples were discarded as burn-in accordingto the analysis by Tracer v1.5 (Rambaut and Drummond, 2007). A50% majority-rule consensus tree was generated based on the treessampled after generation 300,000. In addition, we repeated the BIanalysis four times for each data set to conrm the results.

    2.6. Network analysis

    To investigate the possible hybridization events in the evolutionof Cupressaceae s.l., the Neighbor-Net method as implemented inSplitstree 4.11.3 (Huson and Bryant, 2006) was ued to reconstructreticulate networks based on sequences of the combined nucleargenes (LFY + NLY), matK, and rps3, respectively. For distance calcu-lations, we excluded indels and used the best-t or the secondbest-t model (GTR) selected by AIC in Modeltest (Table 2). Therelative robustness of the clades was estimated by performing1000 bootstrap replicates, based on which a 95% condence net-work was constructed for each data set.

    2.7. Molecular dating analysis

    The molecular dating analysis was conducted on matK, LFY, NLYand combined LFY + NLY +matK + rps3, respectively (the rps3 genewas not independently used due to its great length variation andfast evolution in conifer II). The ML tree topologies of Cupressaceaes.l. used for analysis were generated from the simplied data sets,which include one species from each genus and a single clone fromeach species that were randomly chosen (all distinct NLY clones ofXanthocyparis and LFY clones of Fitzroya were included, since theydo not cluster into a single clade), and Araucaria heterophylla wasused as the outgroup. For absolute ages, we relied on the followingcalibrations. The root node, the split between Cupressaceae s.l. andTaxaceaeCephalotaxaceae, was constrained to minimally 192 MAbased on Paleotaxus in the lower Jurassic (Taylor and Taylor, 1993)as used in Magalln and Sanderson (2005), and maximally237 MA based on Parasciadopitys from the Middle Triassic ofAntarctica, a close relative of Sciadopitys (Sciadopityaceae) (Yaoet al., 1997). According to the earliest fossil record, the most recentcommon ancestors (MRCA) of the ve clades, i.e., SequoiaMetasequoiaSequoiadendron, GlyptostrobusTaxodium, LibocedrusPilgerodendron, DiselmaFitzroyaWiddringtonia and JuniperusCupressusHesperocyparis, were constrained to the minimal age of 140 MA(Sequoia in early Cretaceous) (Penny, 1947; Ma et al., 2005),


    99 MA (Glyptostrobus in Cenomanian, upper Cretaceous) (Miller, 1977; Aulenback and LePage, 1998), 61.7 MA (Libocedrus in early-middle Paleocene) (Pole, 1998), 95 MA (Widdringtonia in Cretaceous) (McIver, 2001), and 33.9 MA (Juniperus dated back to the Eocene/Oligocene boundary) (Kvacek, 2002), respectively.
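    The calibration scheme described above can be summarized as a small data structure. The following is a minimal Python sketch, not part of the original analysis: the class name `Calibration` and the helper `violated` are hypothetical, while the clade names and ages are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Calibration:
    """Fossil-based age constraint on one node (ages in MA)."""
    node: str
    min_age: float
    max_age: Optional[float] = None  # None: no upper bound stated in the text

    def admits(self, age: float) -> bool:
        """Return True if an estimated age satisfies the constraint."""
        if age < self.min_age:
            return False
        return self.max_age is None or age <= self.max_age


# Node labels and ages as given in the text; only the root has a maximum age.
CALIBRATIONS: List[Calibration] = [
    Calibration("root: Cupressaceae s.l. / Taxaceae-Cephalotaxaceae", 192.0, 237.0),
    Calibration("Sequoia-Metasequoia-Sequoiadendron", 140.0),
    Calibration("Glyptostrobus-Taxodium", 99.0),
    Calibration("Libocedrus-Pilgerodendron", 61.7),
    Calibration("Diselma-Fitzroya-Widdringtonia", 95.0),
    Calibration("Juniperus-Cupressus-Hesperocyparis", 33.9),
]


def violated(estimates: Dict[str, float]) -> List[str]:
    """List calibrated nodes whose estimated age breaks its constraint."""
    by_node = {c.node: c for c in CALIBRATIONS}
    return [name for name, age in estimates.items()
            if name in by_node and not by_node[name].admits(age)]
```

    A helper such as `violated` could be used to confirm that a set of estimated node ages respects every fossil constraint before the ages are interpreted.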

    Rate constancy across lineages was examined for each dataset using a likelihood ratio test (LRT) (Felsenstein, 1988), in which likelihood scores of the trees with and without an enforced molecular clock were compared. Because the clock assumption was rejected in every case (all P < 0.001), we used both Bayesian and penalized-likelihood (PL) methods to estimate the divergence times for comparison, since different dating methods may generate different results based on the same data set. The Bayesian approach was performed with PAML/multidivtime (Thorne et al., 1998; Yang, 2007), following the step-by-step manual, version 1.5 (Rutschmann, 2005). Prior gamma distributions on parameters of the relaxed clock model were as follows: the mean of the prior distribution for the root age (rttm) was set to 2 time units (one unit = 100 MA) based on the split time between Cupressaceae s.l. and Taxaceae–Cephalotaxaceae (237–192 MA as mentioned above); the standard deviation of this prior (rttmsd) was also set to 2; the mean and SD of the prior distribution

    Table 2. Sequence evolution models best fit to each data set as determined in Modeltest 3.7 and MrModeltest 2.2 using the Akaike Information Criterion (AIC), and model settings for analyses (ML columns). The second best-fit model is shown in brackets.

    Data set | Best model for ML | ML model settings
    LFY | GTR + I + G | GTR; Lset nst = 6; Rates = gamma; Shape = 1.7877; Pinvar = 0.3421
    NLY | GTR + I + G | GTR; Lset nst = 6; Rates = gamma; Shape = 1.3047; Pinvar = 0.2861
    LFY + NLY | GTR + G | GTR; Lset nst = 6; Rates = gamma; Shape = 0.4382
    matK | TVM + G (GTR + G) | GTR; Lset nst = 6; Rates = gamma, estimated
    LFY + NLY + matK | GTR + I + G | GTR; Lset nst = 6; Rates = gamma; Shape = 1.0148; Pinvar = 0.2344
    rps3 | TVM + I + G (GTR + I + G) | GTR; Lset nst = 6; Rates = gamma, Pinvar, estimated
    LFY + NLY + matK + rps3 | (GTR + I + G) | GTR; Lset nst = 6; Rates = gamma; Shape = 0.8272; Pinvar = 0.2805

    for the ingroup root rate (rtrate and rtratesd) were both set to 0.05; the brownmean and brownsd were both set to 0.5 following the manual's recommendation that the brownmean multiplied by rttm be about 1–2. Markov chains in multidivtime were run for 1,000,000 generations, sampling every 100th generation for a total of 10,000 trees, with a burn-in of the first 3000 trees.
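    The likelihood ratio test of rate constancy described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the log-likelihood values and the number of taxa are hypothetical placeholders, and the degrees of freedom (number of taxa minus 2) are the conventional choice for this test, which the text does not state explicitly.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with y = x / 2.
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                   # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))        # Q(1/2, y)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def clock_lrt(lnl_clock: float, lnl_free: float, n_taxa: int):
    """Compare a clock-constrained tree with an unconstrained one."""
    stat = 2.0 * (lnl_free - lnl_clock)   # LRT statistic
    df = n_taxa - 2                       # conventional df (assumption)
    return stat, df, chi2_sf(stat, df)


# Hypothetical log-likelihoods and taxon count, for illustration only.
stat, df, p = clock_lrt(lnl_clock=-10250.0, lnl_free=-10180.0, n_taxa=33)
print(f"LRT statistic = {stat:.1f}, df = {df}, P = {p:.3g}")
```

    A P value below the chosen threshold (0.001 in the text) rejects the strict clock and motivates relaxed-clock methods such as multidivtime and PL.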

    The PL analysis (Sanderson, 2002) under the TN algorithm was implemented in the program r8s. The topologies with branch lengths generated by PHYML version 2.4.4 (Guindon and Gascuel, 2003) were used for time estimation. Optimal values of smoothing were found to be 1800, 1800, 1300 and 1000 for LFY, NLY, matK and combined LFY + NLY + matK + rps3, respectively, using the statistical cross-validation method (cvstart = 0, cvinc = 0.125, cvnum = 32). The credibility intervals were estimated using the profile command based on 100 topologically constrained trees with branch lengths, which were obtained from 100 bootstrap-resampled data sets generated by PAUP 4.0b10.
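    The cross-validation settings quoted above define a grid of candidate smoothing values. The sketch below assumes, as we understand the r8s options, that cvstart and cvinc are given on a log10 scale, so that the k-th candidate is 10^(cvstart + k × cvinc); the function names are hypothetical. Under that assumption the reported optima are rounded grid points.

```python
def smoothing_grid(cvstart: float, cvinc: float, cvnum: int) -> list:
    """Candidate smoothing values, assuming a log10-spaced grid."""
    return [10 ** (cvstart + k * cvinc) for k in range(cvnum)]


def nearest_grid_value(grid: list, reported: float) -> float:
    """Grid point closest to a (rounded) reported smoothing value."""
    return min(grid, key=lambda s: abs(s - reported))


grid = smoothing_grid(cvstart=0.0, cvinc=0.125, cvnum=32)

# Smoothing optima reported in the text for the four data sets.
for reported in (1800, 1300, 1000):
    print(reported, "->", round(nearest_grid_value(grid, reported), 1))
```

    On this reading, the values 1800, 1300 and 1000 correspond to exponents 3.25, 3.125 and 3.0 on the grid, which spans 1 to about 7500.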

    2.8. Biogeographic analysis

    The traditional Taxodiaceae originated before the separation of Gondwana and Laurasia according to the divergence time estimation, and nearly all of its genera are monotypic and relictual. Moreover, the biogeographical study of some genera of Cupressaceae s.s. widely distributed in the Northern Hemisphere, such as Juniperus, Cupressus and Thuja, needs a dense species sampling and the integration of complex geological and climatic information (e.g., Peng and Wang, 2008). Therefore, this study focused on the biogeography of Cupressaceae s.s., especially its genera distributed in the Southern Hemisphere. In the analysis, we used the same data sets

    Table 2 (continued). Best-fit models and model settings for BI.

    Data set | Best model for BI | BI model settings
    LFY | GTR + I + G | Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1);
    NLY | GTR + I + G | Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1);
    LFY + NLY | GTR + G | Lset nst = 6; Rates = gamma; Prset statefreqpr = dirichlet(1,1,1,1);
    matK | GTR + G | Lset nst = 6; Rates = gamma; Prset statefreqpr = dirichlet(1,1,1,1);
    LFY + NLY + matK | GTR + G | Lset nst = 6; Rates = gamma; Prset statefreqpr = dirichlet(1,1,1,1);
    rps3 | GTR + I + G | Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1);
    LFY + NLY + matK + rps3 | GTR + I + G | Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1);

    as used in the molecular dating, but the traditional Taxodiaceae was excluded. Nine geographical areas were defined in our analysis: (A) Australia including Tasmania, (B) South America, (C) New Caledonia, (D) New Zealand, (E) New Guinea and Moluccas, (F) Africa, (G) Asia, (H) North America, and (I) Europe. Ancestral range patterns were inferred by a statistical dispersal-vicariance analysis based on the BI topology constructed from each of the four simplified data matrices LFY, NLY, matK and combined LFY + NLY + matK + rps3 (each genus represented by one sequence), using S-DIVA, which reconstructs ancestral ranges while accounting for both phylogenetic uncertainty and multiple solutions in DIVA optimization (Yu et al., 2010). For the estimation of node support values and the final topology, the BEAST v1.7.1 package (http://beast.bio.ed.ac.uk/Main_Page, Drummond and Rambaut, 2007) was used to generate the trees in the following three steps: (1) The BEAST file was generated with BEAUti by setting 1,000,000 generations and sampling every 1000 generations. (2) One thousand trees were obtained from the BEAST analysis, based on which a BI topology (the final topology) was generated in TreeAnnotator by discarding the first 300 trees as burn-in. (3) The obtained 1000 trees, the final topology and a file of taxon distribution were imported into S-DIVA, and the ancestral area reconstruction was performed by setting a maximum of seven areas at each node.
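    To give a sense of the size of the search space implied by the settings above, the following sketch counts how many distinct ancestral ranges are possible in principle when nine areas are defined and at most seven may be combined at a node. It is a back-of-the-envelope illustration with hypothetical function names, not a description of how S-DIVA works internally; the area codes are those defined in the text.

```python
from math import comb

# Area codes as defined in the text.
AREAS = {
    "A": "Australia including Tasmania",
    "B": "South America",
    "C": "New Caledonia",
    "D": "New Zealand",
    "E": "New Guinea and Moluccas",
    "F": "Africa",
    "G": "Asia",
    "H": "North America",
    "I": "Europe",
}
MAX_AREAS = 7  # maximum number of areas allowed at each node


def n_candidate_ranges(n_areas: int, max_areas: int) -> int:
    """Number of non-empty area combinations with at most max_areas areas."""
    return sum(comb(n_areas, k) for k in range(1, max_areas + 1))


def trees_after_burnin(n_sampled: int, n_burnin: int) -> int:
    """Trees retained for the summary topology after discarding burn-in."""
    return n_sampled - n_burnin


print(n_candidate_ranges(len(AREAS), MAX_AREAS))   # candidate ranges per node
print(trees_after_burnin(1000, 300))               # trees summarized in TreeAnnotator
```

    With nine areas, the cap of seven excludes only the nine eight-area ranges and the single nine-area range from the 511 non-empty combinations.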

    3. Results

    3.1. Sequence characterization

    All PCR products of LFY and NLY showed a single band after electrophoresis in 1.2% agarose gel except that Diselma archeri


    had another LFY band of a much smaller size, which was confirmed by cloning to be a LFY pseudogene with a large deletion between intron 1 and intron 2 and thus was excluded from further analysis. Based on direct sequencing and cloning, no more than two distinct clones of LFY and NLY occurred in the same individual, and most species did not show clone polymorphism. If two distinct clones are present in the same species, conspecific clones share the highest similarity except for LFY of Fitzroya cupressoides, and NLY of Cupressus funebris, Juniperus chinensis, and Xanthocyparis vietnamensis. All the species harboring two distinct clones of LFY and NLY and the variability of conspecific clones are shown in Table 3. By RT-PCR, we obtained cDNA sequences of both genes from Juniperus chinensis, Metasequoia glyptostroboides, Platycladus orientalis, Thuja occidentalis, Sequoia sempervirens and Pinus armandii, and LFY from Araucaria heterophylla. These cDNAs are identical to the exon sequences of the genomic DNA obtained from the same species except for 1–2 site mutations in some species, which could be attributed to different individuals used for DNA and RNA extraction or PCR errors. The LFY and NLY variable sites among the five genera Callitropsis, Cupressus, Hesperocyparis, Juniperus and Xanthocyparis and among the three genera Metasequoia, Sequoia and Sequoiadendron are shown in Fig. S2.

    Table 3. The species harboring two distinct clones of LFY and NLY and the variability of conspecific clones.

    Gene | Species | Two conspecific clones compared | DNA variable sites (exon) | AA variable sites | DNA variable sites (intron) | Shared similarity, full length (%)
    LFY | Cunninghamia lanceolata | 3 vs. 5 | 4 | 2 | 14 | 99
    LFY | Cupressus atlantica | 4 vs. 7 | 4 | 2 | 8 | 99
    LFY | Fitzroya cupressoides (4×) | 1 vs. 7 | 23 | 10 | 52 | 94
    LFY | Hesperocyparis bakeri | 1 vs. 7 | 8 | 2 | 26 | 98
    LFY | Hesperocyparis macrocarpa | 1 vs. 5 | 11 | 10 | 28 | 99
    LFY | Juniperus saltillensis | 3 vs. 9 | 6 | 3 | 10 | 99
    LFY | Sequoia sempervirens (6×) | 1 vs. 3 | 9 | 2 | 29 | 98
    LFY | Xanthocyparis vietnamensis | 5 vs. 6 | 0 | 0 | 12 | 99.5
    NLY | Cupressus dupreziana | 1 vs. 8 | 3 | 1 | 52 | 98
    NLY | Cupressus funebris | 1 vs. 3 | 11 | 3 | 67 | 96
    NLY | Fitzroya cupressoides (4×) | 2 vs. 3 | 3 | 0 | 17 | 99
    NLY | Hesperocyparis bakeri | 1 vs. 3 | 3 | 2 | 17 | 99
    NLY | Hesperocyparis macrocarpa | 2 vs. 6 | 2 | 1 | 27 | 99
    NLY | Juniperus chinensis (4×) | 1 vs. 5 | 16 | 6 | 104 | 97
    NLY | Juniperus saltillensis | 1 vs. 3 | 6 | 2 | 46 | 98
    NLY | Sequoia sempervirens (6×) | 1 vs. 8 | 2 | 1 | 20 | 98
    NLY | Thuja plicata | 1 vs. 4 | 7 | 3 | 33 | 99
    NLY | Widdringtonia nodiflora | 2 vs. 3 | 2 | 0 | 14 | 99
    NLY | Xanthocyparis vietnamensis | 1 vs. 11 | 9 | 5 | 88 | 97

    The obtained sequence of LFY varied from 2170 bp (Araucaria heterophylla) to 3201 bp (Austrocedrus chilensis), containing three exons (partial exon 1, exon 2 and partial exon 3) and two introns. The exon length of this gene ranged from 1078 bp (Thujopsis dolabrata) to 1123 bp (Cedrus deodara), except for the short sequences (889 bp) obtained from Metasequoia glyptostroboides and Sequoiadendron giganteum with the primer pair LFYE1F4 + LFYE3R4. After excluding some unalignable regions, the alignment of LFY exons for analysis was 1168 bp, of which 628 were variable and 540 were parsimony informative. The PCR-amplified NLY gene also contained three exons and two introns (partial exons 1 and 3), with a total length ranging from 3116 bp (Austrocedrus chilensis) to 4335 bp (Cephalotaxus sinensis). The sequence alignment of its exons for phylogenetic analyses was 1063 bp (578 variable sites, 407 parsimony-informative sites).

    Direct sequencing of the matK gene failed in Juniperus chinensis, Libocedrus plumosa, Xanthocyparis vietnamensis and Araucaria heterophylla, and matK pseudogenes with predicted early stop codons were found after cloning and sequencing. Phylogenetic analysis showed that the conspecific clones clustered together with high support values (trees not shown), and thus the pseudogenes were excluded from further analysis. The sequence alignment of the matK region was 1577 bp (containing 173 bp of the trnK intron and 1404 bp of the matK gene), of which 657 nucleotide sites were variable and 414 were parsimony-informative. The aligned sequence of the mitochondrial rps3 region, including the rps3 gene and partial sequences of its two flanking genes rps19 and rpl16, was 2815 bp (excluding two highly variable regions that could not be reliably aligned), which included 531 variable sites and 232 parsimony-informative characters.
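    The alignment statistics reported in this section can be put on a common scale by expressing variable and parsimony-informative sites as percentages of alignment length. The short sketch below does this with the counts given in the text; the dictionary layout and function name are illustrative only.

```python
# (alignment length in bp, variable sites, parsimony-informative sites),
# as reported in the text for each marker.
ALIGNMENTS = {
    "LFY exons": (1168, 628, 540),
    "NLY exons": (1063, 578, 407),
    "matK region": (1577, 657, 414),
    "rps3 region": (2815, 531, 232),
}


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


for name, (length, variable, informative) in ALIGNMENTS.items():
    print(f"{name}: {percent(variable, length):.1f}% variable, "
          f"{percent(informative, length):.1f}% parsimony-informative")

# Marker with the lowest proportion of parsimony-informative sites.
least_informative = min(
    ALIGNMENTS, key=lambda n: percent(ALIGNMENTS[n][2], ALIGNMENTS[n][0])
)
print("Lowest proportion of informative sites:", least_informative)
```

    On these numbers the two nuclear genes carry the highest proportion of parsimony-informative sites and the mitochondrial rps3 region the lowest.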

    3.2. Southern blotting

    Southern blotting showed that both LFY and NLY have a single locus in conifers according to our study on three diploid (Cunninghamia lanceolata, Pinus armandii, Taxus cuspidata var. nana) and one hexaploid (Sequoia sempervirens) species. In each restriction digest of genomic DNA, only one band was detected when the LFY and NLY probes were used, respectively, except that two signals of LFY were observed in the Xba I digest of C. lanceolata (Figs. 1 and S3). We found that Xba I has no recognition site in the probe sequence or the LFY region of C. lanceolata covered by the probe, and one recognition site downstream of the probe region, at the same position (intron 2) in the two distinct LFY clones (alleles) obtained from this species. Thus, the two LFY signals of C. lanceolata, which differ greatly in molecular weight, could be caused by another Xba I recognition site upstream of LFY that is heterozygous between its two alleles. To verify this inference, we further blotted the genomic DNA of C. lanceolata digested by EcoR V and Hind III, respectively, with a LFY probe (covering exon 2–exon 3) amplified from this species, and observed a single signal from each digestion (Fig. S3). The above evidence suggests that the LFY gene also exists as a single locus in C. lanceolata.

    3.3. Phylogenetic analysis

    The MP analyses generated 56 equally most-parsimonious trees (tree length = 2119, CI = 0.479, RI = 0.8021, including uninformative characters) for the LFY gene, and 144 (tree length = 1598, CI = 0.5463, RI = 0.7857) for the NLY gene. Except for the position of Welwitschia in NLY trees and some poorly resolved clades, the MP and BI trees generated are almost identical to the ML trees for both LFY and NLY (Fig. 2). According to the NLY gene, Welwitschia was clustered with Zamia in the MP trees, but with Pinaceae in ML and BI trees, although the support values are very low. Also, except for a small difference in the position of Podocarpaceae, the tree topologies of the two sister genes LFY and NLY are highly congruent in revealing the inter- and intra-family relationships of conifers. For example, Pinaceae is monophyletic and sister to the other conifers, among which Cephalotaxaceae–Taxaceae is sister to Cupressaceae s.l. (Fig. 2). In particular, the seven subfamilies of Cupressaceae s.l. recognized by Gadek et al. (2000) are highly supported. In addition, the distinct clones from the same species form a strongly supported monophyletic clade, except for LFY of Hesperocyparis bakeri, Hesperocyparis macrocarpa and Fitzroya, and NLY of Xanthocyparis, Cupressus dupreziana, Hesperocyparis macrocarpa and Thuja plicata (Figs. 2 and S4). In the LFY tree, Diselma was nested within the Fitzroya cupressoides clones with high support values, and in the NLY tree, the clone Xanthocyparis vietnamensis-11 forms a sister relationship with the Hesperocyparis–Callitropsis–X. vietnamensis-1 clade (Figs. 2B and S4B).

    Fig. 1. Southern blot hybridization of the genomic DNAs of Sequoia sempervirens and Cunninghamia lanceolata with the Sequoia-LFY probe (A) and the Sequoia-NLY probe (B) at 56 °C. Numbers on the right indicate DNA molecular weight marker.

    The parsimony analysis of the combined nuclear genes generated 16 equally most-parsimonious trees (tree length = 1368, CI = 0.6768, RI = 0.8531), which are nearly identical to the ML and BI trees except that Fokienia and Chamaecyparis have basal positions in Cupressoideae on the MP tree. The ML tree, with bootstrap percentages of MP and ML as well as Bayesian posterior probabilities, is shown in Fig. S5, in which the relationships within Sequoioideae, positions of Austrocedrus and Papuacedrus, and the relationships among Calocedrus, Tetraclinis and Platycladus–Microbiota are still poorly resolved, as in the separate nuclear gene trees (Fig. 2).

    Based on the matK gene, 16 equally most-parsimonious trees (tree length = 1232, CI = 0.6972, RI = 0.8399) were obtained, which are topologically identical to the ML and BI trees (see Fig. 3, the ML topology). Compared to the combined nuclear gene tree (Fig. S5), positions of three genera, i.e., Athrotaxis, Austrocedrus and Calocedrus, are different, but none of them are strongly supported (Fig. 3). Even though the mitochondrial rps3 gene has the lowest level of phylogenetic signal among the data sets, it also provided a relatively good resolution for some intergeneric relationships of Cupressaceae s.l. (Fig. 4).

    The parsimony analysis of the combined LFY + NLY + matK generated six equally most-parsimonious trees (tree length = 2627, CI = 0.6829, RI = 0.8399), which are highly congruent with the ML and BI trees in topology except for some poorly resolved clades (see Fig. S6, the ML topology). The intergeneric relationships of Cupressaceae s.l. revealed by the combined LFY + NLY + matK are the same as those by the combined LFY + NLY (Fig. S5), and support values for some clades increased. Based on the combined LFY + NLY + matK + rps3, we also obtained six equally most-parsimonious trees (tree length = 3390, CI = 0.7068, RI = 0.8379) that are topologically identical to the ML and BI trees (Fig. 5, the ML topology). The intergeneric relationships of Cupressaceae s.l. revealed by the combined LFY + NLY + matK + rps3 (Fig. 5) are nearly the same as those by the combined LFY + NLY + matK (Fig. S6) except for the relationship between Thuja + Thujopsis and Fokienia + Chamaecyparis.
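    For readers less familiar with the parsimony statistics quoted in this section, the consistency index (CI) and retention index (RI) have the following standard ensemble definitions, which the text does not restate. The symbols are introduced here only for illustration: $s$ is the observed tree length, $m$ the minimum number of steps the characters could require on any tree, and $g$ the maximum number of steps they could require on any tree.

```latex
\mathrm{CI} = \frac{m}{s}, \qquad \mathrm{RI} = \frac{g - s}{g - m}
```

    Both indices equal 1 in the absence of homoplasy and decrease as homoplasy increases, so the higher values for the combined data sets indicate relatively less homoplasy per step than in the single-gene LFY and NLY analyses.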

    3.4. Network analysis

    Three confidence networks (at the 95% confidence level) of Cupressaceae s.l. were constructed based on sequences of the combined LFY + NLY, matK and rps3 genes, respectively. In the combined LFY + NLY network, two reticulations were found (Fig. 6). That is, Sequoia and Cupressus appeared to be recombinants between Metasequoia and Sequoiadendron, and between Juniperus and the Hesperocyparis–Callitropsis–Xanthocyparis clade, respectively. In addition, reticulation events could have occurred in the basal groups of the Northern Hemisphere Cupressaceae s.s., i.e., among Fokienia–Chamaecyparis, Thuja and Thujopsis (Fig. 6). In contrast, no reticulation was found in the networks based on the cytoplasmic genes matK and rps3 (Fig. S7).

    3.5. Divergence time estimation

    The estimated divergence times based on different data sets correspond well with each other when topologies are consistent, although the standard errors estimated by Multidivtime (Bayesian method) are a little larger than those by PL in r8s (Table 4). The Multidivtime estimates based on the combined LFY + NLY + matK + rps3 are shown in Fig. 7. The results of molecular dating showed that all genera of the traditional Taxodiaceae originated in the Jurassic or lower Cretaceous and most genera of Cupressaceae s.s. diverged in the upper Cretaceous or Tertiary, which is highly congruent with the earliest fossil record (Fig. 7).
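    The comparison of standard errors between the two dating methods can be illustrated with a few entries from Table 4. The sketch below uses the combined-data-set values for nodes 2 to 6 and assumes that each table cell gives an estimate followed by its standard error (the separator symbol is lost in this copy of the table); the variable names are illustrative.

```python
from statistics import mean

# (node, Multidivtime SE, PL SE) in MA for the combined
# LFY + NLY + matK + rps3 data set, nodes 2-6 of Table 4.
# Assumption: the second number in each table cell is the standard error.
COMBINED_SE = [
    (2, 12.91, 2.54),
    (3, 11.37, 2.54),
    (4, 10.62, 2.33),
    (5, 7.48, 3.56),
    (6, 8.77, 2.40),
]

mean_bayes = mean(se_bayes for _, se_bayes, _ in COMBINED_SE)
mean_pl = mean(se_pl for _, _, se_pl in COMBINED_SE)
bayes_always_larger = all(se_bayes > se_pl for _, se_bayes, se_pl in COMBINED_SE)

print(f"mean SE, Multidivtime: {mean_bayes:.2f} MA")
print(f"mean SE, PL:           {mean_pl:.2f} MA")
print("Multidivtime SE larger at every node:", bayes_always_larger)
```

    For these five nodes the Multidivtime standard errors exceed the PL ones throughout, in line with the pattern described in the text.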

    3.6. Biogeographic analysis

    The results of ancestral range reconstruction for Cupressaceae s.s. are shown in Fig. 8A (based on the combined LFY + NLY + matK + rps3), which were compared with the modern distribution of its genera and the break-up history of Gondwana (Fig. 8B, modified from Sanmartin and Ronquist, 2004). Since several alternative ancestral patterns were suggested for each node, we only focused on the ancestral distributions with the highest relative probabilities for the nodes (Fig. 8A, nodes 1, 5, 6, 7, 8 and 9) consistently resolved in different gene trees (Figs. 2, 3 and 5). Although the nodes are congruent based on different data sets (LFY, NLY, matK, combined LFY + NLY + matK + rps3), the S-DIVA method suggested different ancestral ranges for some nodes (Table 5).

    Fig. 2. ML trees of gymnosperms constructed from LFY (A) and NLY (B). Numbers associated with branches are bootstrap percentages of MP and ML greater than 50%. Bold lines indicate Bayesian posterior probabilities greater than 0.90. The letter c and numbers following species names denote sequences from cDNA and clone numbers, respectively.

    4. Discussion

    4.1. High congruence among different gene trees and the phylogenies of Cupressaceae s.l. and gymnosperms

    To date, cpDNA and nrDNA are still the most widely used molecular markers for studying phylogenetic relationships of gymnosperms (e.g., Chaw et al., 1997; Cheng et al., 2000; Quinn et al., 2002; Rydin et al., 2002; Little, 2006; Rai et al., 2008; Lin et al., 2010). The use of single or low copy nuclear gene markers may provide more phylogenetic information and be very helpful to test the reliability of phylogenies based on cpDNA and nrDNA, but it is very difficult in gymnosperms, which are remarkable for a large nuclear genome and highly complex gene families (Kinlaw and Neale, 1997; Murray, 1998; Leitch et al., 2001). In particular, most single copy nuclear genes reported in angiosperms exist as gene families in gymnosperms (Kinlaw and Neale, 1997). Thus, distinguishing orthologs from paralogs is the first and most important step in using nuclear genes for gymnosperm phylogenetic reconstruction.

    In this study, although a LFY pseudogene was found in Diselma archeri, direct sequencing and cloning of both genomic DNA and

    cDNA, together with Southern blotting (Figs. 1 and S3), strongly suggest that both LFY and NLY exist as single copy in conifers, even in the polyploid species. None of the studied species harbors more than two distinct clones of LFY or NLY, and the results of our Southern blot analyses (Figs. 1 and S3) are consistent with the previous finding that LFY has a single locus in Gnetum parvifolium (Shindo et al., 2001), Pinus radiata (Mellerowicz et al., 1998) and Pinus caribaea var. caribaea (Dornelas and Rodriguez, 2005). Therefore, the two sister genes LFY and NLY are promising candidate markers

    for studying phylogenetic and evolutionary relationships of gymnosperms, in addition to the use of cytoplasmic DNA.

    Fig. 3. The ML tree of Cupressaceae s.l. inferred from the chloroplast matK gene. Numbers associated with branches are bootstrap percentages of MP and ML greater than 50% and Bayesian posterior probabilities greater than 0.90, respectively. Numbers following species names denote the clone numbers.

    Despite different inheritance pathways of nuclear and organelle genomes in the cypress family, the phylogenetic trees generated from the nuclear LFY and NLY, chloroplast matK and mitochondrial rps3 genes are highly congruent in topology except for several poorly resolved clades in the rps3 tree (Figs. 2–4). Our results corroborate several important findings by previous morphological, anatomical and cpDNA analyses of Cupressaceae s.l. (Brunsfeld

    et al., 1994; Gadek et al., 2000; Kusumi et al., 2000; Quinn et al., 2002; Schulz and Stutzel, 2007), including the basal position of Cunninghamia, the close relationships among Metasequoia, Sequoia and Sequoiadendron and among Cryptomeria, Glyptostrobus and Taxodium, and Cupressaceae s.s. as a monophyletic group derived from the traditional Taxodiaceae and sister to the Cryptomeria–Glyptostrobus–Taxodium clade. In particular, the three genome-based phylogeny (Figs. 2–5, S5 and S6) strongly supports the classification of Cupressaceae s.l. into seven subfamilies by Gadek et al. (2000) or six subfamilies by Farjón (2005). The six subfamilies recognized by Farjón (2005) include Cunninghamioideae, Taiwanioideae, Athrotaxidoideae, Sequoioideae containing Metasequoia, Sequoia and Sequoiadendron, Taxodioideae comprising Cryptomeria, Glyptostrobus and Taxodium, and Cupressoideae containing all the remaining genera (Cupressaceae s.s.). In contrast, as the only difference from Farjón (2005), Gadek et al. (2000) recognized two subfamilies in Cupressaceae s.s., i.e., Cupressoideae and Callitroideae occurring in the Northern and Southern Hemispheres, respectively, based on both morphological and cpDNA evidence. In addition, our study resolved six subclades in Cupressaceae s.s., i.e., Thuja–Thujopsis, Fokienia–Chamaecyparis, Platycladus–Microbiota, Cupressus–Juniperus–Hesperocyparis–Callitropsis–Xanthocyparis, Libocedrus–Pilgerodendron, and Actinostrobus–Callitris–Neocallitropsis–Diselma–Fitzroya–Widdringtonia, and they are strongly supported in the nuclear, chloroplast and combined gene trees (Figs.

    2–5, S5 and S6). Therefore, the high topological congruence among different gene trees implies that we may have recovered the species phylogeny of Cupressaceae s.l.

    Fig. 4. The ML tree of Cupressaceae s.l. inferred from the mitochondrial rps3 gene. The branch length of outgroups is not shown to scale because of the great distance between outgroup and ingroup. Numbers associated with branches are bootstrap percentages of MP and ML greater than 50% and Bayesian posterior probabilities greater than 0.90, respectively.

    However, there are still some discrepancies among the different gene trees (Figs. 2–5), which include the systematic positions of Athrotaxis, Papuacedrus and Tetraclinis, the relationships within Sequoioideae, between Fokienia–Chamaecyparis and Thuja–Thujopsis, and among Cupressus, Juniperus and Hesperocyparis–Callitropsis–Xanthocyparis, as well as the intrageneric relationships of Cupressus and Hesperocyparis. These discrepancies could be attributed to insufficient resolution of the molecular markers, historical hybridization, incomplete lineage sorting or ancient radiation (Maddison, 1997; Wendel and Doyle, 1998; Whitfield and Lockhart, 2007), which need further studies.

    The relationships among Cupressus, Juniperus, Hesperocyparis, Callitropsis and Xanthocyparis have been discussed in previous studies (Little et al., 2004; Little, 2006; Adams et al., 2009). A close relationship between Xanthocyparis vietnamensis and Callitropsis nootkatensis was suggested by some previous morphological studies (Farjón, 2005; Debreczy et al., 2009) and molecular evidence from nrDNA ITS and NLY intron 2 (Little et al., 2004; Little, 2006). This relationship is weakly supported by some gene trees yielded in the present study (Figs. 2 and S5). It may be worth investigating whether X. vietnamensis originated from hybridization, considering the occurrence of two distinct NLY gene clones in this

    species (Figs. 2B and S4B) and its inconsistent positions in different gene trees (Figs. 2–5). According to the nuclear genes LFY and NLY as well as the chloroplast matK gene, three monophyletic groups (Cupressus, Juniperus, and Hesperocyparis–Callitropsis–Xanthocyparis) are strongly supported (Figs. 2 and 3). However, relationships of the three subclades are incongruent among different data sets (Fig. S8). Cupressus is strongly supported as sister to Juniperus by the two single-copy nuclear genes (LFY and NLY), while, in contrast, it forms a sister group with the Hesperocyparis–Callitropsis–Xanthocyparis clade in both cpDNA (matK, Fig. 3; petN-psbM, Fig. S8G) and nrDNA ITS (Fig. S8A) trees. It is interesting that Cupressus shows similarity to both Juniperus and the Hesperocyparis–Callitropsis–Xanthocyparis clade in the nucleotide sequence of LFY and NLY (Fig. S2). Moreover, the network analysis indicates that Cupressus has evolved from a recombination between the two clades (Fig. 6). Based on all the above evidence, we infer that Cupressus might have originated through hybridization between Juniperus

    and the ancestor of Hesperocyparis–Callitropsis–Xanthocyparis. The tetraploid Fitzroya and the hexaploid Sequoia also have discordant positions in different gene trees, which will be discussed later.

    Fig. 5. The ML tree of Cupressaceae s.l. generated from the combined LFY + NLY + matK + rps3 genes. Numbers associated with branches are bootstrap percentages of MP and ML greater than 50% and Bayesian posterior probabilities greater than 0.90, respectively.

    To improve the understanding of evolutionary relationships within the gymnosperms, great efforts have been undertaken in the past two decades (e.g., Chaw et al., 1997, 2000; Gugerli et al., 2001; Quinn et al., 2002; Rydin et al., 2002; Hajibabaei et al., 2006; Rai et al., 2008; Ran et al., 2010a). There have been great debates on the position of Gnetales (see review in Ran et al., 2010a); a sister relationship between it and Pinaceae or conifers is supported by most molecular phylogenetic analyses (e.g., Soltis et al., 1999; Chaw et al., 1997, 2000; Donoghue and Doyle, 2000; Gugerli et al., 2001; Hajibabaei et al., 2006; McCoy et al., 2008; Rai et al., 2008; Ran et al., 2010a). In this study, we found that the pair of ancient duplicated genes LFY and NLY are very informative in reconstructing the phylogeny of gymnosperms. The two gene trees have nearly identical topologies such as the basal positions of Araucariaceae and Podocarpaceae in Conifer II (non-Pinaceae conifers) and the sister relationship between Taxaceae + Cephalotaxaceae and Cupressaceae s.l. (Fig. 2), which are in accordance with the results of most previous molecular studies (e.g., Chaw et al., 1997; Gugerli et al., 2001; Quinn et al., 2002; Rydin et al., 2002; Rai et al., 2008; Ran et al., 2010a). The phylogenetic position of Welwitschia is not consistent among the ML, MP and BI trees of the NLY gene as mentioned previously, and thus still needs more studies to resolve (Fig. 2B), since we failed to amplify this gene from the other two genera of Gnetales. Nevertheless, the LFY gene phylogeny strongly

    supports Gnetales as a sister group of conifers and the sister relationship between Pinaceae and Conifer II (Fig. 2A). Furthermore, introns of LFY and NLY can be well aligned among closely related gymnospermous genera and have good inter- and intrageneric resolution (Fig. S4; Little, 2006; Won and Renner, 2006; Peng and Wang, 2008; Wei et al., 2010). Thus, these two genes could be used to investigate the evolutionary relationships of gymnosperms at various taxonomic levels in the future. In particular, the LFY gene may have the potential to become a nuclear DNA barcode of land plants (Ran et al., 2010b).

    Fig. 6. A confidence reticulate network constructed based on the combined LFY and NLY gene sequences of Cupressaceae s.l. (at the 95% confidence level).

    Table 4. Estimated divergence times of the consistent nodes among different gene trees of Cupressaceae s.l. All ages in MA.

    Node (a) | Calibration | matK Multidivtime | matK PL | LFY Multidivtime | LFY PL | NLY Multidivtime | NLY PL | LFY + NLY + matK + rps3 Multidivtime | LFY + NLY + matK + rps3 PL
    1 | Min = 192, Max = 237 | 230.76 ± 5.65 | 237 | 228.91 ± 7.23 | 237 | 226.57 ± 8.88 | 237 | 233.45 ± 3.25 | 237
    2 | | 220.10 ± 9.79 | 198.59 ± 3.85 | 220.75 ± 10.01 | 208.90 ± 4.40 | 217.99 ± 11.18 | 198.12 ± 6.19 | 198.82 ± 12.91 | 197.66 ± 2.54
    3 | | 211.47 ± 10.35 | 187.83 ± 4.19 | 213.32 ± 11.24 | 197.64 ± 4.96 | 211.56 ± 11.98 | 188.92 ± 7.01 | 191.89 ± 11.37 | 187.51 ± 2.54
    4 | | 201.14 ± 10.51 | 178.68 ± 3.90 | 206.00 ± 12.12 | 186.73 ± 5.86 | 204.24 ± 12.74 | 181.87 ± 6.69 | 183.80 ± 10.62 | 180.36 ± 2.33
    5 | Min = 140 | 153.48 ± 10.09 | 140.06 ± 3.24 | 163.74 ± 16.87 | 160.38 ± 11.83 | 177.04 ± 18.76 | 163.55 ± 13.96 | 147.18 ± 7.48 | 142.04 ± 3.56
    6 | | 183.32 ± 11.60 | 163.18 ± 4.11 | 187.14 ± 14.09 | 163.16 ± 7.42 | 188.28 ± 14.25 | 161.49 ± 7.78 | 161.75 ± 8.77 | 162.48 ± 2.40
    7 | | 148.91 ± 13.29 | 136.28 ± 7.49 | 125.45 ± 14.53 | 127.84 ± 14.49 | 127.96 ± 16.57 | 113.15 ± 13.04 | 139.62 ± 8.43 | 127.05 ± 5.83
    8 | Min = 99 | 116.66 ± 13.08 | 109.27 ± 10.54 | 112.21 ± 10.76 | 111.90 ± 12.68 | 112.06 ± 11.81 | 99.18 ± 1.19 | 111.69 ± 9.18 | 101.75 ± 5.06
    9 | | 165.49 ± 11.79 | 149.17 ± 4.31 | 168.76 ± 16.32 | 143.02 ± 8.69 | 178.16 ± 14.96 | 154.75 ± 8.55 | 152.63 ± 7.75 | 149.94 ± 2.53
    10 | | 137.77 ± 10.72 | 126.47 ± 3.86 | 145.91 ± 17.47 | 119.81 ± 10.28 | 153.66 ± 17.35 | 122.20 ± 9.56 | 126.30 ± 5.71 | 122.74 ± 2.37
    11 | Min = 61.7 | 71.43 ± 8.30 | 62.39 ± 2.45 | 78.11 ± 13.66 | 82.07 ± 13.55 | 75.99 ± 12.52 | 62.03 ± 2.10 | 68.87 ± 5.69 | 61.7
    12 | | 115.37 ± 8.46 | 107.92 ± 3.15 | 112.52 ± 10.83 | 95.95 ± 14.58 | 132.84 ± 16.66 | 102.72 ± 12.86 | 107.06 ± 3.59 | 105.31 ± 1.61
    13 | Min = 95 | 100.10 ± 4.84 | 95.00 | 103.40 ± 8.12 | 95.00 | 106.50 ± 10.50 | 95.00 | 97.46 ± 2.34 | 95.00
    14 | | 49.34 ± 15.61 | 49.39 ± 22.41 | 63.70 ± 16.92 | 57.80 ± 13.05 | 60.74 ± 21.28 | 48.53 ± 30.95 | 72.23 ± 6.89 | 66.92 ± 6.34
    15 | | 46.27 ± 12.75 | 45.46 ± 6.75 | 51.70 ± 17.01 | 49.58 ± 14.49 | 40.27 ± 20.26 | 27.77 ± 11.24 | 63.38 ± 10.36 | 37.87 ± 5.21
    16 | | 149.95 ± 13.01 | 139.22 ± 6.09 | 143.28 ± 18.28 | 133.97 ± 10.09 | 130.65 ± 22.63 | 118.49 ± 11.04 | 143.91 ± 7.10 | 133.00 ± 4.74
    17 | | 102.55 ± 20.71 | 101.13 ± 13.70 | 65.83 ± 25.16 | 64.01 ± 40.09 | 73.86 ± 26.09 | 82.60 ± 30.65 | 111.93 ± 11.45 | 97.40 ± 8.79
    18 | | 140.10 ± 14.26 | 130.88 ± 7.52 | 128.87 ± 18.78 | 122.59 ± 10.64 | 119.92 ± 22.28 | 111.76 ± 11.81 | 139.38 ± 7.39 | 127.54 ± 5.25
    19 | | 91.45 ± 21.07 | 75.61 ± 15.31 | 91.82 ± 24.65 | 107.47 ± 14.40 | 78.78 ± 26.09 | 90.53 ± 19.99 | 106.93 ± 11.45 | 80.40 ± 8.88
    20 | | 100.67 ± 18.05 | 93.62 ± 8.98 | 88.59 ± 18.68 | 80.54 ± 10.40 | 93.77 ± 22.39 | 78.47 ± 11.31 | 99.81 ± 11.19 | 85.42 ± 6.20
    21 | | 39.74 ± 16.73 | 33.96 ± 8.62 | 45.54 ± 15.66 | 40.13 ± 9.03 | 41.61 ± 18.75 | 32.51 ± 11.26 | 57.42 ± 12.56 | 34.22 ± 5.50
    22 | Min = 33.9 | 65.17 ± 17.41 | 56.14 ± 7.80 | 46.13 ± 11.05 | 35.19 ± 2.60 | 55.60 ± 16.92 | 38.31 ± 6.56 | 47.15 ± 8.98 | 36.89 ± 3.32
    23 | | 47.01 ± 15.09 | 47.43 ± 7.28 | 32.73 ± 10.76 | 26.44 ± 5.37 | 42.11 ± 15.24 | 31.31 ± 7.64 | 39.92 ± 9.03 | 29.07 ± 3.87

    (a) Node numbers correspond to those in Fig. 7, except that node 18 is the stem group of Thuja and Thujopsis in the LFY gene tree.

    4.2. Evolution of the LFY and NLY genes with implications for the origin and evolution of polyploids in gymnosperms

    Polyploidy, as a common model of evolution and speciation (Soltis et al., 2010), is particularly prominent in plants (Soltis and Soltis, 2009; Van de Peer et al., 2009; Wood et al., 2009; Ainouche and Jenczewski, 2010). Genome sequences and genome-scale data, coupled with cytogenetic and phylogenetic databases, have further shown that polyploidy is far more prevalent than expected (Cui et al., 2006; Meyers and Levin, 2006; Doyle et al., 2008; Hegarty and Hiscock, 2008; Wood et al., 2009; Soltis et al., 2010), and that 47–100% of flowering plants and most extant ferns might be derived from ancient polyploidy (Wood et al., 2009). However, compared to other vascular plants, many fewer natural polyploids have been reported from gymnosperms, and all natural polyploid conifers, including the two tetraploids Fitzroya cupressoides and Juniperus chinensis 'Pfitzeriana' and the hexaploid Sequoia sempervirens, belong to Cupressaceae s.l. (Khoshoo, 1959; Ahuja, 2005). Thus, the rarity of polyploids in Cupressaceae s.l. implies a polyploidization pathway different from that of angiosperms, which might provide additional clues for understanding plant polyploidization.

    As discussed earlier, both LFY and NLY generally exist as a single locus in gymnosperms. Although the functions of the two sister genes in gymnosperms are still poorly understood, comparative spatiotemporal patterns of their expression in the three conifer genera Picea, Podocarpus and Taxus suggest a functional divergence between them. That is, they can be expressed simultaneously in a single reproductive axis, initially overlapping but later in mutually exclusive primordia and/or groups of developing cells in both female and male structures (Vazquez-Lobo et al., 2007). In angiosperms, the LFY gene regulates expression of the MADS-box genes responsible for floral-meristem identity (Soltis et al., 2002), and also occurs as a single copy, although two paralogs of LFY have been found in recent polyploids (Bomblies and Doebley, 2005; Esumi et al., 2005). While LFY has been successfully used in investigating the origin and evolution of some hybrid species of seed plants (Oh and Potter, 2003; Wei et al., 2010), allopolyploid speciation (Kim et al., 2008), and reticulate evolution (Peng and Wang, 2008), its sister gene NLY has also shown good resolution for interspecific relationships of Cupressus (Little, 2006). In particular, the origin of Pseudotsuga wilsoniana from interspecific hybridization was successfully revealed by the distribution of two distinct LFY gene types in this species and the phylogenetic tree of this gene (Wei et al., 2010).

    Fig. 7. The combined gene (LFY + NLY + matK + rps3) topology showing divergence times of Cupressaceae s.l. estimated by Multidivtime associated with the earliest fossil record of each genus. The fossil and modern distributions of the Southern Hemisphere Cupressaceae s.s. are indicated on the right. A: Australia, T: Tasmania, NC: New Caledonia, NZ: New Zealand, AF: Africa, NG: New Guinea, SA: Southern America, NA: Northern America, M: Moluccas.

    Fig. 8. Ancestral range reconstruction for Cupressaceae s.s. based on the combined LFY + NLY + matK + rps3 genes compared with the modern distribution of its genera (A) and the break-up history of Gondwana (B) modified from Sanmartin and Ronquist (2004) (only four genera of Cupressoideae are shown). Pie charts at nodes show probabilities of alternative ancestral ranges obtained by S-DIVA. The divergence times shown for some nodes are the same as those in Fig. 7. A: Australia including Tasmania; B: South America; C: New Caledonia; D: New Zealand; E: New Guinea and Moluccas; F: Africa; G: Asia; H: North America; I: Europe.

    Table 5
    The highest relative probabilities of ancestral ranges for the consistent nodes among different gene trees of the Southern Hemisphere Cupressaceae s.s.

    Datasets                   Ancestral range of each node (a)
                               1        6     7    8   9
    matK                       ABCDEFG  ACF   ABF  AB  AC
    LFY                        ABCDEFG  ABCF  BF   B   AC
    NLY                        ABEG     A     ABF  AB  A
    LFY + NLY + matK + rps3    ABCDEFG  ABCF  ABF  AB  C

    a Node numbers and definition of geographical area correspond to those in Fig. 8.
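    The ancestral ranges reported in Table 5 can be compared across datasets programmatically. The following is a minimal sketch, not the S-DIVA analysis itself: it simply stores each range as a string of the area codes defined in the Fig. 8 caption and intersects them per node. The names TABLE5, AREAS, shared_areas and describe are hypothetical and introduced only for illustration, and the plain set intersection is an assumed, deliberately crude summary of agreement.

```python
# Ancestral ranges from Table 5, keyed by dataset and then by node number.
# Each range is a string of area codes (see AREAS below).
TABLE5 = {
    "matK": {1: "ABCDEFG", 6: "ACF", 7: "ABF", 8: "AB", 9: "AC"},
    "LFY": {1: "ABCDEFG", 6: "ABCF", 7: "BF", 8: "B", 9: "AC"},
    "NLY": {1: "ABEG", 6: "A", 7: "ABF", 8: "AB", 9: "A"},
    "LFY+NLY+matK+rps3": {1: "ABCDEFG", 6: "ABCF", 7: "ABF", 8: "AB", 9: "C"},
}

# Area codes as defined in the Fig. 8 caption.
AREAS = {
    "A": "Australia including Tasmania",
    "B": "South America",
    "C": "New Caledonia",
    "D": "New Zealand",
    "E": "New Guinea and Moluccas",
    "F": "Africa",
    "G": "Asia",
    "H": "North America",
    "I": "Europe",
}


def shared_areas(table, node):
    """Return the area codes found in the range of every dataset at `node`."""
    ranges = [set(by_node[node]) for by_node in table.values()]
    return set.intersection(*ranges)


def describe(codes):
    """Translate area codes into area names, sorted by code."""
    return [AREAS[code] for code in sorted(codes)]


if __name__ == "__main__":
    for node in (1, 6, 7, 8, 9):
        print(node, describe(shared_areas(TABLE5, node)))
```

    An empty result for a node means only that no single area appears in all four reconstructions; it says nothing about the relative probabilities that S-DIVA assigns to alternative ranges.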

    Sequoia sempervirens is a hexaploid (2n = 6x = 66) occurring in southwest Oregon and northwest California. Based on morphological and cytological studies, several hypotheses have been proposed for its origin (Stebbins, 1948; Li, 1987, 1988; Ahuja and Neale, 2002; Ahuja, 2005). Stebbins (1948) inferred that Sequoia originated as an allopolyploid by hybridization between Metasequoia and some probably extinct taxodiaceous plant. Li (1987, 1988) suggested Metasequoia and Sequoiadendron or ancestors of the two genera as the parental species of Sequoia. However, Ahuja and Neale (2002) thought that Sequoia could be an autohexaploid, autoallohexaploid or segmental allohexaploid. In the present study, Sequoia clusters with Metasequoia glyptostroboides (a species historically widespread but currently confined to Sichuan and Hubei, China) in the LFY tree (Fig. 2A) but with Sequoiadendron giganteum (distributed in the Sierra Nevada Mountains of California, USA) in the NLY tree (Fig. 2B), which is congruent with the sequence characters of the two nuclear genes (Fig. S2). Network analysis based on the combined nuclear genes strongly supports that Sequoia originated from a recombination between Metasequoia and Sequoiadendron. Based on previous cpDNA phylogenies (Brunsfeld et al., 1994; Tsumura et al., 1995; Kusumi et al., 2000) and the present chloroplast matK and mitochondrial rps3 gene trees (Figs. 3 and 4), Sequoia is sister to Sequoiadendron. Because cytoplasmic inheritance is predominantly paternal in Cupressaceae s.l. (reviewed by Mogensen, 1996), the above evidence implies that Sequoiadendron is the paternal ancestor of Sequoia if this genus really originated by hybridization between Metasequoia and Sequoiadendron or ancestors of the two genera, a hypothesis suggested by Li (1987, 1988). The hybrid origin of Sequoia is also supported by the fossil record of Metasequoia in Northern America (Stockey et al., 2001; Farjon, 2005). Unfortunately, clues to the origin of Sequoia may have been blurred over its long evolutionary history, given that the earliest fossil record of this genus dates back to the early Cretaceous (Penny, 1947; Ma et al., 2005; Farjon, 2005). Nevertheless, the inconsistent relationships among Metasequoia, Sequoia and Sequoiadendron revealed by different data sets (Figs. 2–4) could be an important sign of reticulate evolution among the three genera, even though it is difficult to deduce how and when Sequoia originated.
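    The conflict among gene trees described above can be tabulated in a few lines. The sketch below is only an illustration of the bookkeeping, assuming the sister-group placements of Sequoia exactly as stated in the text; SEQUOIA_SISTER, group_by_sister and is_incongruent are hypothetical names. It flags disagreement among trees but cannot by itself distinguish hybridization from other causes of incongruence, such as incomplete lineage sorting.

```python
from collections import defaultdict

# Sister group of Sequoia in each gene tree, as reported in the text:
# gene -> (genome, sister genus).
SEQUOIA_SISTER = {
    "LFY": ("nuclear", "Metasequoia"),
    "NLY": ("nuclear", "Sequoiadendron"),
    "matK": ("chloroplast", "Sequoiadendron"),
    "rps3": ("mitochondrial", "Sequoiadendron"),
}


def group_by_sister(placements):
    """Group gene names by the sister genus they recover."""
    groups = defaultdict(list)
    for gene, (_genome, sister) in placements.items():
        groups[sister].append(gene)
    return dict(groups)


def is_incongruent(placements):
    """Return True if the gene trees recover more than one sister genus."""
    return len({sister for _genome, sister in placements.values()}) > 1


if __name__ == "__main__":
    for sister, genes in group_by_sister(SEQUOIA_SISTER).items():
        print(f"{sister}: {', '.join(genes)}")
    print("incongruent:", is_incongruent(SEQUOIA_SISTER))
```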

    Interestingly, the variability of conspecific clones of LFY or NLY is higher in the two tetraploids Juniperus chinensis 'Pfitzeriana' and Fitzroya cupressoides than in the diploid species. Juniperus chinensis 'Pfitzeriana' has 22 pairs of chromosomes in the meiotic cells (Sax and Sax, 1933) and its origin by hybridization between J. chinensis and J. sabina was suggested by RAPDs (Le Duc et al., 1999). Two distinct NLY clones were obtained from genomic DNA of this cultivar, which differ at 16 sites in a total exon length of 1009 bp (Table 3). Both gene members might be functional, since both were also found in the cDNA. Fitzroya cupressoides is endemic to temperate forests in southern South America, with a chromosome number of 2n = 44 (Hair, 1968). This species harbors two distinct LFY sequences that differ at 23 sites in a total exon length of 1081 bp (share 94% identity), while its two NLY clones share high similarity and are sister to each other (Table 3; Figs. 2 and S4). Given that the two LFY clones of F. cupressoides do not form a sister relationship, the close relationship between Fitzroya and Diselma (Figs. 2–4 and S4c), and the divergence time estimation (Table 4), we infer that the monotypic genus Fitzroya might have originated from hybridization between Diselma and an extinct group, and that F. cupressoides could be an allotetraploid. However, it is difficult to distinguish between ancient allopolyploid and autopolyploid speciation based only on a couple of genes. More studies are still needed to investigate the evolutionary history and stabilization process of these interesting coniferous polyploids.
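    The chromosome counts quoted for the three polyploid conifers can be checked against their ploidy levels with simple arithmetic. The sketch below assumes the counts and ploidy levels given in the text (2n = 6x = 66 for Sequoia sempervirens, 2n = 44 for the tetraploid Fitzroya cupressoides, and 22 meiotic pairs, hence 2n = 44, for the tetraploid Juniperus chinensis 'Pfitzeriana'); base_number and POLYPLOIDS are hypothetical names used only for illustration.

```python
def base_number(somatic_count, ploidy):
    """Return the base chromosome number x, given 2n and the ploidy level."""
    if somatic_count % ploidy != 0:
        raise ValueError("2n is not a whole multiple of the ploidy level")
    return somatic_count // ploidy


# (2n, ploidy level) as reported in the text.
POLYPLOIDS = {
    "Sequoia sempervirens": (66, 6),
    "Fitzroya cupressoides": (44, 4),
    # 22 pairs in meiotic cells corresponds to 2n = 44.
    "Juniperus chinensis 'Pfitzeriana'": (2 * 22, 4),
}

if __name__ == "__main__":
    for taxon, (two_n, ploidy) in POLYPLOIDS.items():
        print(f"{taxon}: 2n = {two_n}, {ploidy}x, x = {base_number(two_n, ploidy)}")
```

    Under these figures all three taxa reduce to the same base number, which is what one would expect if the reported ploidy levels are mutually consistent.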

    Diploidization is important to the stabilization of neopolyploids (Ramsey and Schemske, 2002). During this process, the preservation of duplicated gene copies appears to be nonrandom, with some genes being duplicated and reduplicated whereas others are iteratively returned to singleton status (Blanc and Wolfe, 2004). In the present study, we did not find redundant copies of LFY or NLY in conifers, not even in the hexaploid Sequoia, although the two genes originated from an ancient gene duplication. However, two distinct copies are maintained in LFY of Fitzroya cupressoides and NLY of Juniperus chinensis, which might represent distant alleles descended from the putative parents of the tetraploid species. The maintenance of different genes in different groups could have been caused by genetic drift and diploidization of the polyploids, during which some loci retain contributions from both parents and others retain alleles from only one parent (Wolfe, 2001; Doyle et al., 2008). Given the key regulatory function of LFY and NLY in the development of reproductive organs in gymnosperms (Vazquez-Lobo et al., 2007), the two sister genes will very likely return to singleton status following genome/gene duplication.

    4.3. Divergence times and biogeography of Cupressaceae s.s.: Further evidence for Southern Hemisphere biogeography

    Southern Hemisphere biogeography has drawn tremendous interest from biologists and geologists (Sanmartin and Ronquist, 2004; Knapp et al., 2005; Barker et al., 2007; Upchurch, 2008), and is considered a typical vicariance scenario responsible for the transoceanic disjunctions of biota that developed by the sequential breakup of the Gondwana supercontinent during the last 165 Myr (McLoughlin, 2001; Sanmartin and Ronquist, 2004).

    However, based on molecular estimates and more accurate paleogeographic reconstruction, recent studies indicate that dispersal has also played an important role in shaping these biogeographical patterns (Givnish and Renner, 2004; Sanmartin and Ronquist, 2004). In particular, unlike the situation in animals, biogeographical histories of plants, especially angiosperm groups such as Nothofagus, Proteaceae and Restionaceae, show less evidence for, or are only partially consistent with, the timing of the Gondwanan breakup (Linder et al., 2003; Knapp et al., 2005; Barker et al., 2007). That is, a more complicated Southern Hemisphere biogeography is being revealed by integrating evidence from molecular dating, paleontology and ecology, challenging the classic Gondwana paradigm (Crisp et al., 2011).

    Biogeographical reconstruction of the ancient conifer family Cupressaceae s.l., especially Cupressaceae s.s., could shed some light on Southern Hemisphere biogeography and the history of Gondwana. The molecular dating by Li and Yang (2002) indicated that the divergence of major lineages of Taxodiaceae occurred in the Jurassic, and that the Northern and Southern Hemisphere clades of Cupressaceae s.s. (Cupressoideae and Callitroideae) diverged at least 124 MA. Unfortunately, in their study, most genera of Cupressaceae s.s. were excluded from the molecular clock analysis due to substitution rate heterogeneity. In the present study, we used multiple calibration points (Fig. 7), considering the great impact of fossil calibration on posterior time estimates (Inoue et al., 2010), and performed the relaxed molecular clock analysis for Cupressaceae s.l. based on nuclear, chloroplast and combined gene data sets. Notably, the results from different data sets correspond well with each other (Table 4). Moreover, the estimated times of unconstrained nodes correspond well with the oldest fossil record (Fig. 7). Here we use the results of the Bayesian analysis (Multidivtime) for further discussion, since this method, compared with the PL method implemented in r8s, provides a powerful framework for integrating fossil information (Inoue et al., 2010).
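    As one illustration of how the Table 4 estimates can be compared across data sets, the sketch below takes the Multidivtime values reported for node 9 (the split between Callitroideae and Cupressoideae) and checks whether each mean falls within the Jurassic. The period boundaries used here (about 201 and 145 MA) are an assumption taken from commonly used time scales and vary slightly between versions; NODE9, JURASSIC, in_period and spread_of_means are hypothetical names. The second value of each pair is carried along as reported, without any assumption about how it was derived.

```python
# Approximate start and end of the Jurassic in MA (assumed values).
JURASSIC = (201.0, 145.0)

# Multidivtime estimates for node 9 as reported in the text: (mean, +/- value).
NODE9 = {
    "matK": (165.49, 11.79),
    "LFY": (168.76, 16.32),
    "NLY": (178.16, 14.96),
    "LFY+NLY+matK+rps3": (152.63, 7.75),
}


def in_period(age, period):
    """Return True if `age` (MA) lies between the period's start and end."""
    older, younger = period
    return younger <= age <= older


def spread_of_means(estimates):
    """Return the difference between the oldest and youngest mean estimate."""
    means = [mean for mean, _err in estimates.values()]
    return max(means) - min(means)


if __name__ == "__main__":
    for dataset, (mean, err) in NODE9.items():
        print(f"{dataset}: {mean} +/- {err} MA, Jurassic: {in_period(mean, JURASSIC)}")
    print(f"spread of means: {spread_of_means(NODE9):.2f} Myr")
```

    Comparing only the means is a deliberate simplification; a fuller comparison would use the credibility intervals from the dating analysis itself.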

    The divergence times of all main lineages of the Taxodiaceae, including Cunninghamia (node 2), Taiwania (node 3), Sequoioideae (node 4), and Taxodioideae (node 7), can be dated back to the Jurassic or even earlier (Fig. 7 and Table 4), which is highly consistent with the fossil record (summarized in Miller, 1977; Farjon, 2005). The congruence also supports the idea that the present distribution of this traditional family is a relic of a much more widespread occurrence in the past (Miller, 1977; Li and Yang, 2002; Farjon, 2005). The split between Callitroideae and Cupressoideae, node 9 (Fig. 7), can also be dated back to the Jurassic (165.49 ± 11.79, 168.76 ± 16.32, 178.16 ± 14.96 and 152.63 ± 7.75 MA based on matK, LFY, NLY and combined LFY + NLY + matK + rps3, respectively, Table 4), providing strong evidence for the separation of the two subfamilies by the split of Laurasia and Gondwana (Li, 1953; Li and Yang, 2002) that is supported by the ancestral range reconstruction (Fig. 8, node 1; Table 5).

    The Greater Cape, one of the global biodiversity hotspots, has been identified as a combination of ancient species repository and hot-bed of recent radiation in flora (Warren and Hawkins, 2006; Verboom et al., 2009). Widdringtonia is endemic to Southern Africa, and extends from the Cape to Malawi. Based on its oldest fossil record in North America (McIver, 2001) and the divergence between it and the Fitzroya–Diselma clade that was dated back to 65 MA, Warren and Hawkins (2006) supported the hypothesis of McIver (2001) that Widdringtonia originated in Laurasia in the early Cretaceous and later migrated to Africa. However, this hypothesis is very unlikely to be true considering our findings that all genera of Cupressaceae s.s. occurring in the Southern Hemisphere comprise a strongly supported clade in all gene trees (Figs. 2–5). Widdringtonia forms a clade with the two monotypic genera Fitzroya and Diselma, which are endemic to the Southern Andes and Tasmania, respectively, and the clade is sister to the Australian clade comprising Callitris, Actinostrobus and Neocallitropsis (Figs. 2, 3 and 5). These phylogenetic relationships and biogeographic histories of the six genera (Fig. 8A, node 6) are generally congruent with the break-up history of Gondwana (Fig. 8B). That is, the separation of East and West Gondwana at 165–130 MA (McLoughlin, 2001) led to the divergence between the two clades Callitris–Actinostrobus–Neocallitropsis and Widdringtonia–Fitzroya (Diselma will be discussed later) (Fig. 7, node 12) that can be dated back to the early Cretaceous (matK, 115.37 ± 8.46 MA; LFY, 112.52 ± 10.83 MA; NLY, 132.84 ± 16.66 MA; combined LFY + NLY + matK + rps3, 107.06 ± 3.59 MA, Table 4). Also, the fact that the split between Widdringtonia and Fitzroya–Diselma occurred at least 95 MA according to the oldest fossil of Widdringtonia from the Tuscaloosa Formation of Alabama (McIver, 2001) is generally consistent with the final separation of Africa from Southern America at about 105 MA (McLoughlin, 2001). The above evidence, coupled with ancestral range reconstruction (Fig. 8, nodes 6 and 7; Table 5), may suggest that vicariance is mainly responsible for this biogeographic pattern.

    Of great interest is the grouping of the Tasmanian Diselma with Fitzroya (Figs. 2 and 3). As discussed earlier, the tetraploid Fitzroya could have originated by hybridization with Diselma as one parent. It would be reasonable to see the sister relationship between the two genera if Diselma was the paternal ancestor, given the predominantly paternal cytoplasmic inheritance in Cupressaceae s.l. (reviewed by Mogensen, 1996). That is, the maternal ancestor of Fitzroya migrated into Tasmania from South America and hybridized with Diselma, giving rise to the putative allotetraploid Fitzroya, which migrated back to South America by the connection between Australia and South America through Antarctica (52–35 MA) (McLoughlin, 2001; Sanmartin and Ronquist, 2004). This inference is supported by the ancestral range reconstruction (Fig. 8, node 8), the estimate of the divergence time between Fitzroya and Diselma (Fig. 7, node 14) (matK, 49.34 ± 15.61 MA; LFY, 63.70 ± 16.92 MA; NLY, 60.74 ± 21.28 MA; combined LFY + NLY + matK + rps3, 72.23 ± 6.89 MA), and the rich fossils of Fitzroya found in Tasmania (Hill and Whang, 1996; Hill and Paull, 2003; Paull and Hill, 2010). However, it cannot be completely ruled out that the current distributions of the three genera Widdringtonia, Fitzroya and Diselma have resulted from long distance dispersal and extinction of some of their close relatives, if Widdringtonia was distributed in Australia or its neighboring islands. Also, it is more parsimonious that both parents of Fitzroya originally occurred in Tasmania and hybridized, and then the generated tetraploid Fitzroya migrated into South America.

    One may wonder why the fossil of Widdringtonia was found in North America (McIver, 2001). A low-latitude connection between Brazil and equatorial Africa might have been maintained until 119–105 MA due to translational movement of the continents along the Guinea Fracture Zone (McLoughlin, 2001), while North America and South America became connected in the middle Cretaceous (100 MA) and then separated in the early Eocene (Hay et al., 1999). These connections made it possible for Widdringtonia to disperse from Africa to North America in the Cretaceous.

    The divergence times of New Zealand and New Caledonian Libocedrus, South American Pilgerodendron, New Guinean Papuacedrus and South American Austrocedrus were dated back to the Cretaceous, even early Cretaceous (Fig. 7 and Table 4), suggesting their origins before the separation of those islands or continents from Australia (McLoughlin, 2001). These genera might have had a wider distribution in the past according to the fossil record (Hill and Brodribb, 1999; Farjon, 2005; Fig. 7).

    Based on robust phylogenetic reconstruction, molecular dating, ancestral range reconstruction and the fossil record, our study provides some evidence for the biogeographic history of Cupressaceae s.l. and generally supports some hypotheses about the evolutionary history of continents, such as the separation of Gondwana from Laurasia in the Jurassic and the break-up process of Gondwana (McLoughlin, 2001). However, our inferences are very preliminary due to the limitation of sampling at the genus level, many extinction events, and the very complicated biogeographic and geological history in the Southern Hemisphere (McLoughlin, 2001; Sanmartin and Ronquist, 2004; Crisp et al., 2011). More samples at the species level are needed in future biogeographic studies of the Southern Hemisphere Cupressaceae s.s., especially those species occurring in Australia, New Caledonia and New Zealand.

    Acknowledgments

    We are indebted to Profs. Christopher Quinn (Royal Botanic Gardens of Australia), Robert P. Adams (Baylor University, USA), and Peter Hollingsworth (Royal Botanic Garden Edinburgh) for their great help in sampling the genera of Cupressaceae s.s. endemic to the Southern Hemisphere and America. We thank Drs. Dan Peng and Qiao-Ping Xiang (Institute of Botany, Chinese Academy of Sciences), Drs. Maurizio Rossetto and Carolyn Porter (Botanic Gardens Trust in Sydney), Dr. Shou-Zhou Zhang (Shenzhen FairyLake Botanical Garden, China), and the Royal Botanic Garden, Kew (UK) for providing some samples for DNA analysis; Dr. Kenneth H. Wolfe for suggestions on polyploid evolution; Dr. Hui Gao for assistance with lab work; Drs. Qiang Zhang and Fu-Sheng Yang for help in molecular dating and biogeographic analyses; Ms. Wan-Qing Jin and Rong-Hua Liang for their assistance in DNA sequencing. We also thank the handling editor and the two anonymous reviewers for their insightful comments and suggestions on the manuscript. This work was supported by the National Natural Science Foundation of China (Grant Nos. 31170197, 30730010, 30425028) and the Chinese Academy of Sciences (the 100-Talent Project).

    Appendix A. Supplementary material

    Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2012.05.004.

    References

    Adams, R.P., Bartel, J.A., Price, R.A., 2009. A new genus, Hesperocyparis, for the cypresses of the Western Hemisphere (Cupressaceae). Phytologia 91, 160–185.

    Ahuja, M.R., 2005. Polyploidy in gymnosperms: revisited. Silvae Genet. 54, 59–69.

    Ahuja, M.R., Neale, D.B., 2002. Origins of polyploidy in coast redwood (Sequoia sempervirens (D. Don) Endl.) and relationship of coast redwood to other genera of Taxodiaceae. Silvae Genet. 51, 93–100.

    Ahuja, M.R., Neale, D.B., 2005. Evolution of genome size in conifers. Silvae Genet. 54, 126–137.

    Ainouche, M.L., Jenczewski, E., 2010. Focus on polyploidy. New Phytol. 186, 1–4.

    Aulenback, K.R., LePage, B.A., 1998. Taxodium wallisii sp. nov.: first occurrence of Taxodium from the Upper Cretaceous. Int. J. Plant Sci. 159, 367–390.

    Barker, N.P., Weston, P.H., Rutschmann, F., Sauquet, H., 2007. Molecular dating of the Gondwanan plant family Proteaceae is only partially congruent with the timing of the break-up of Gondwana. J. Biogeogr. 34, 2012–2027.

    Blanc, G., Wolfe, K.H., 2004. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16, 1679–1691.

    Bomblies, K., Doebley, J.F., 2005. Molecular evolution of FLORICAULA/LEAFY orthologs in the Andropogoneae (Poaceae). Mol. Biol. Evol. 22, 1082–1094.

    Brunsfeld, S.J., Soltis, P.S., Soltis, D.E., Gadek, P.A., Quinn, C.J., Strenge, D.D., Ranker, T.A., 1994. Phylogenetic relationships among the genera of Taxodiaceae and Cupressaceae: evidence from rbcL sequences. Syst. Bot. 19, 253–262.

    Chaw, S.M., Zharkikh, A., Sung, H.M., Lau, T.C., Li, W.H., 1997. Molecular phylogeny of extant gymnosperms and seed plant evolution: analysis of nuclear 18S rRNA sequences. Mol. Biol. Evol. 14, 56–68.

    Chaw, S.M., Parkinson, C.L., Cheng, Y.C., Vincent, T.M., Palmer, J.D., 2000. Seed plant phylogeny inferred from all three plant genomes: monophyly of extant gymnosperms and origin of Gnetales from conifers. Proc. Natl. Acad. Sci. USA 97, 4086–4091.

    Cheng, Y.C., Nicolson, R.G., Tripp, K., Chaw, S.M., 2000. Phylogeny of Taxaceae and Cephalotaxaceae genera inferred from chloroplast matK gene and nuclear rDNA ITS region. Mol. Phylogenet. Evol. 14, 353–365.

    Crisp, M.D., Trewick, S.A., Cook, L.G., 2011. Hypothesis testing in biogeography. Trends Ecol. Evol. 26, 66–72.

    Cui, L.Y., Wall, P.K., Leebens-Mack, J.H., Lindsay, B.G., Soltis, D.E., Doyle, J.J., Soltis, P.S., Carlson, J.E., Arumuganathan, K., Barakat, A., Albert, V.A., Ma, H., dePamphilis, C.W., 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res. 16, 738–749.

    Debreczy, Z., Musial, K., Price, R.A., Rácz, I., 2009. Relationships and nomenclatural status of the nootka cypress (Callitropsis nootkatensis, Cupressaceae). Phytologia 91, 140–158.

    Donoghue, M.J., Doyle, J.A., 2000. Seed plant phylogeny: demise of the anthophyte hypothesis? Curr. Biol. 10, R106–R109.

    Dornelas, M.C., Rodriguez, A.P.M., 2005. A FLORICAULA/LEAFY gene homolog is preferentially expressed in developing female cones of the tropical pine Pinus caribaea var. caribaea. Genet. Mol. Biol. 28, 299–307.

    Doyle, J.J., 1992. Gene trees and species trees: molecular systematics as one-character taxonomy. Syst. Bot. 17, 144–163.

    Doyle, J.J., Flagel, L.E., Paterson, A.H., Rapp, R.A., Soltis, D.E., Soltis, P.S., Wendel, J.F., 2008. Evolutionary genetics of genome merger and doubling in plants. Annu. Rev. Genet. 42, 443–461.

    Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214.

    Eckenwalder, J.E., 1976. Re-evaluation of Cupressaceae and Taxodiaceae: a proposed merger. Madroño 23, 237–256.

    Esumi, T., Tao, R., Yonemori, K., 2005. Isolation of LEAFY and TERMINAL FLOWER 1 homologues from six fruit tree species in the subfamily Maloideae of the Rosaceae. Sex. Plant Reprod. 17, 277–287.

    Farjon, A., 2005. A Bibliography of Cupressaceae and Sciadopitys. Royal Botanic Gardens, Kew.

    Farjon, A., Hiep, N.T., Harder, D.K., Loc, P.K., Averyanov, L., 2002. A new genus and species in Cupressaceae (Coniferales) from northern Vietnam, Xanthocyparis vietnamensis. Novon 12, 179–189.

    Farris, J.S., Kallersjo, M., Kluge, A.G., Bult, C., 1994. Testing significance of incongruence. Cladistics 10, 315–319.

    Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

    Felsenstein, J., 1988. Phylogenies from molecular sequences: inference and reliability. Annu. Rev. Genet. 22, 521–565.

    Florin, R., 1963. The distribution of conifer and taxad genera in time and space. Acta Hort. Berg. 20, 121–312.

    Frohlich, M.W., Meyerowitz, E.M., 1997. The search for flower homeotic gene homologs in basal angiosperms and Gnetales: a potential new source of data on the evolutionary origin of flowers. Int. J. Plant Sci. 158, S131–S142.

    Gadek, P.A., Quinn, C.J., 1993. An analysis of relationships within the Cupressaceae sensu stricto based on rbcL sequences. Ann. Mo. Bot. Gard. 80, 581–586.

    Gadek, P.A., Alpers, D.L., Heslewood, M.M., Quinn, C.J., 2000. Relationships within Cupressaceae sensu lato: a combined morphological and molecular approach. Am. J. Bot. 87, 1044–1057.

    Gascuel, O., 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14, 685–695.

    Givnish, T.J., Renner, S.S., 2004. Tropical intercontinental disjunctions: Gondwana breakup, immigration from the boreotropics, and transoceanic dispersal. Int. J. Plant Sci. 165 (Suppl. 4), S1–S6.

    Griffiths, C.S., 1997. Correlation of functional domains and rates of nucleotide substitution in cytochrome b. Mol. Phylogenet. Evol. 7, 352–365.

    Grob, G.B.J., Gravendeel, B., Eurlings, M.C.M., 2004. Potential phylogenetic utility of the nuclear FLORICAULA/LEAFY second intron: comparison with three chloroplast DNA regions in Amorphophallus (Araceae). Mol. Phylogenet. Evol. 30, 13–23.

    Gugerli, F., Sperisen, C., Buchler, U., Brunner, L., Brodbeck, S., Palmer, J.D., Qiu, Y.L., 2001. The evolutionary split of Pinaceae from other conifers: evidence from an intron loss and a multigene phylogeny. Mol. Phylogenet. Evol. 21, 167–175.

    Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.

    Hair, J.B., 1968. The chromosomes of the Cupressaceae I. Tetraclineae and Actinostrobeae (Callitroideae). New Zeal. J. Bot. 6, 277–284.

    Hajibabaei, M., Xia, J., Drouin, G., 2006. Seed plant phylogeny: gnetophytes are derived conifers and a sister group to Pinaceae. Mol. Phylogenet. Evol. 40, 208–217.

    Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids Symp. Ser. 40, 95–98.

    Hart, J.A., 1987. A cladistic analysis of conifers: preliminary results. J. Arnold Arboretum 68, 269–307.

    Hay, W.W., DeConto, R.M., Wold, C.N., Wilson, K.M., Voigt, S., Schulz, M., Wold-Rossby, A., Dullo, W.C., Ronov, A.B., Balukhovsky, A.N., Soeding, E., 1999. Alternative global Cretaceous paleogeography. In: Evolution of Cretaceous ocean-climate systems. Special Paper 332. Geological Society of America, pp. 1–47.

    Hegarty, M.J., Hiscock, S.J., 2008. Genomic clues to the evolutionary success of polyploid plants. Curr. Biol. 18, R435–R444.

    Hill, R.S., Brodribb, T.J., 1999. Turner review no. 2. Southern conifers in time and space. Aust. J. Bot. 47, 639–696.

    Hill, R.S., Paull, R., 2003. Fitzroya (Cupressaceae) macrofossils from Cenozoic sediments in Tasmania, Australia. Rev. Palaeobot. Palynol. 126, 145–152.

    Hill, R.S., Whang, S.S., 1996. A new species of Fitzroya (Cupressaceae) from Oligocene sediments in north-western Tasmania. Aust. Syst. Bot. 9, 867–875.

    Huson, D.H., Bryant, D., 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267.

    Inoue, J., Donoghue, P.C.J., Yang, Z.H., 2010. The impact of the representation of fossil calibrations on Bayesian estimation of species divergence times. Syst. Biol. 59, 74–89.

    Khoshoo, T.N., 1959. Polyploidy in gymnosperms. Evolution 13, 24–39.

    Kim, S.T., Sultan, S.E., Donoghue, M.J., 2008. Allopolyploid speciation in Persicaria (Polygonaceae): insights from a low-copy nuclear region. Proc. Natl. Acad. Sci. USA 105, 12370–12375.

    Kinlaw, C.S., Neale, D.B., 1997. Complex gene families in pine genomes. Trends Plant Sci. 2, 356–359.

    Knapp, M., Stockler, K., Havell, D., Delsuc, F., Sebastiani, F., Lockhart, P.J., 2005. Relaxed molecular clock provides evidence for long-distance dispersal of Nothofagus (southern beech). PLoS Biol. 3, 38–43.

    Kumar, S., 2005. Molecular clocks: four decades of evolution. Nat. Rev. Genet. 6, 654–662.

    Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5, 150–163.

    Kusumi, J., Tsumura, Y., Yoshimaru, H., Tachida, H., 2000. Phylogenetic relationships in Taxodiaceae and Cupressaceae sensu stricto based on matK gene, chlL gene, trnL-trnF IGS region, and trnL intron sequences. Am. J. Bot. 87, 1480–1488.

    Kusumi, J., Tsumura, Y., Yoshimaru, H., Tachida, H., 2002. Molecular evolution of nuclear genes in Cupressaceae, a group of conifer trees. Mol. Biol. Evol. 19, 736–747.

    Kvacek, Z., 2002. A new juniper from the Palaeogene of Central Europe. Fedd. Repert. 113, 492–502.

    Le Duc, A., Adams, R.P., Zhong, M., 1999. Using random amplification of polymorphic DNA for a taxonomic reevaluation of Pfitzer Junipers. HortScience 34, 1123–1125.

    Leitch, I.J., Hanson, L., Winfield, M., Parker, J., Bennett, M.D., 2001. Nuclear DNA C-values complete familial representation in gymnosperms. Ann. Bot. 88, 843–849.

    Li, H.L., 1953. Present distribution and habitats of the