Phylogenomic Revisit for Green Contribution to Diatoms
Ahmed Moustafa1, Klaus Valentin2, Debashish Bhattacharya3
June 28, 2013
The Molecular Life of Diatoms
image credit: Atsuko Tanaka, Christian Sardet, Sebastien Colin, and Diana Sarno
1American University in Cairo, Egypt 2Alfred Wegener Institute, Germany 3Rutgers University, USA
Eukaryotic Tree of Life [eTOL] “Supergroups”
[e.g., diatoms and dinoflagellates]
Reyes-Prieto et al., Annu. Rev. Genet. 2007. 41:147–68
Prochlorococcus
Origin of Photosynthesis & Endosymbiosis
Arabidopsis Chlamydomonas
Cyanidioschyzon
Emiliania Karenia
Cyanophora
“Chromalveolate hypothesis” Cavalier-Smith 1999
images: micro*scope (http://microscope.mbl.edu)
Chimeric carotenoid pathway in diatoms: 70% red and 30% green
Frommolt et al. Mol Biol Evol. 2008 Dec;25(12):2653-67.
Why do we see green genes in diatoms?
Horizontal Gene Transfer
(HGT)
Endosymbiotic Gene Transfer
(EGT)
“Chromalveolate hypothesis” ???
Phaeodactylum http://genome.jgi-psf.org
Thalassiosira http://www.awi.de
Non-vertical gene transfer
Reyes-Prieto et al. Annu. Rev. Genet. 2007. 41:147–68
Detection of Non-vertical (H/EGT) Gene Transfer
H0: Gene tree = Species (host) tree
HA: Gene tree ≠ Species (host) tree
Moustafa, Bhattacharya, Allen. CIBEC 103,107, Dec. 2010.
                     Organisms  Genes
Nuclear + bacterial  3,744      5,544,637
Mitochondrial        1,611      23,228
Plastid              142        14,179
(Database: RefSeq + JGI + dbEST)
iTree – Phylogenomic Pipeline http://itree.sourceforge.net
• Search by topology and bootstrap
• Search for mandatory and optional clades, all possible scenarios:
Moustafa and Bhattacharya. BMC Evol Biol. 2008 Jan 15;8:6.
PhyloSort – Mining Phylogenetic Trees
\[ {}^{n}C_{n} + {}^{n}C_{n-1} + \dots + {}^{n}C_{1} = \sum_{r=1}^{n} {}^{n}C_{r} \]
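The sum above counts every non-empty combination of n clades. A minimal sketch of that enumeration follows, assuming n is the number of optional clades in a search; the lineage labels and the function name `clade_scenarios` are hypothetical and are not taken from PhyloSort itself.

```python
from itertools import combinations
from math import comb


def clade_scenarios(clades):
    """Return every non-empty combination of the given clade labels."""
    scenarios = []
    for r in range(1, len(clades) + 1):
        scenarios.extend(combinations(clades, r))
    return scenarios


# Hypothetical labels, used only for illustration.
lineages = ["S", "A", "H", "C"]
scenarios = clade_scenarios(lineages)

# The same count from the formula: sum of nCr for r = 1..n.
n = len(lineages)
total = sum(comb(n, r) for r in range(1, n + 1))

print(len(scenarios), total, 2 ** n - 1)
```

The count grows as 2^n - 1, so for four clades there are 15 scenarios to test.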
• Migration
• Bug fixes
• New features
Phaeodactylum nuclear-encoded
proteome (~ 10.5k)
Thalassiosira nuclear-encoded
proteome (~ 11.5k)
Step 1: phylogenomic screening
Topological (red + green + diatoms + chromalveolates)
BLAST (e-value < 1E-5) → MAFFT → RAxML → PhyloSort
3,468 candidates 3,696 candidates
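The first stage of Step 1 keeps only BLAST hits below the e-value cutoff. A minimal sketch of that filter is shown here, assuming the hits are in the standard 12-column tabular BLAST format, where the e-value is the 11th column. The function name `filter_hits` and the example rows are hypothetical; the pipeline's actual code is not shown on the slides.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold quoted on the slide


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Keep (query, subject, e-value) for hits with e-value below the cutoff.

    Assumes 12-column tab-separated BLAST output (e-value in column 11).
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        evalue = float(row[10])
        if evalue < cutoff:
            kept.append((row[0], row[1], evalue))
    return kept


# Made-up rows for illustration only.
example = (
    "diatom_1\tgreen_7\t54.2\t210\t90\t3\t1\t208\t5\t212\t3e-40\t160\n"
    "diatom_1\tred_2\t31.0\t80\t50\t2\t10\t88\t20\t98\t0.002\t35.1\n"
)
hits = filter_hits(example)
print(hits)
```

Only the first row passes, since 0.002 is above the 1E-5 cutoff.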
Step 2: phylogenomic screening
Topological (as in Step 1) + Statistical (score ≥ 75%)
Alignments (from Step 1) → PhyML → PhyloSort
2,423 genes of potential red or green algal origin
2,533 genes of potential red or green algal origin
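The combined topological and statistical test in Step 2 can be illustrated with a toy tree. This is a sketch under stated assumptions, not PhyloSort's algorithm: a tree is a nested `(support, children)` tuple with leaf names as strings, and a gene passes if some internal node with support of at least 75% groups the diatom sequence only with taxa from an allowed set. All names below are hypothetical.

```python
SUPPORT_CUTOFF = 75  # percent, as quoted on the slide


def leaves(node):
    """Return the set of leaf labels under a node."""
    if isinstance(node, str):
        return {node}
    _, children = node
    return set().union(*(leaves(child) for child in children))


def supported_clade(node, query, allowed, cutoff=SUPPORT_CUTOFF):
    """True if an internal node with support >= cutoff contains `query`,
    at least one other taxon, and nothing outside `allowed`."""
    if isinstance(node, str):
        return False
    support, children = node
    tips = leaves(node)
    if (support >= cutoff and query in tips and len(tips) > 1
            and tips <= allowed | {query}):
        return True
    return any(supported_clade(child, query, allowed, cutoff)
               for child in children)


# Toy gene tree: the root carries no support value, so it is set to 0.
tree = (0, [
    (82, ["diatom_gene", (90, ["Ostreococcus", "Micromonas"])]),
    "Escherichia",
    "Homo",
])
green = {"Ostreococcus", "Micromonas", "Chlamydomonas"}

print(supported_clade(tree, "diatom_gene", green))
print(supported_clade(tree, "diatom_gene", green, cutoff=90))
```

With the 75% cutoff the diatom gene groups with the green taxa at support 82 and passes; raising the cutoff to 90 rejects it.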
22% of the diatom nuclear genome is of red or green algal origin
[Figure: bar chart, axis 0 to 3,000; categories: Phaeodactylum, Thalassiosira, Gene families; series: Viridiplantae, Rhodophyta, Unresolved]
Contribution of Red and Green to Diatoms
Moustafa et al., Science. 2009
450 red : 1800 green → 17% of the diatom genome is green
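The two percentages quoted on these slides (22% and 17%) can be sanity-checked against the counts given earlier. The slides do not state the denominators, so the choices below are assumptions: the 22% figure is taken over the two diatom proteomes combined, and the 17% figure over the Phaeodactylum proteome of about 10.5k.

```python
# Approximate proteome sizes from the slides ("~10.5k" and "~11.5k").
phaeodactylum_proteome = 10_500
thalassiosira_proteome = 11_500

# Genes of potential red or green algal origin after Step 2.
algal_genes = 2_423 + 2_533

# Assumption: the 22% figure is taken over both proteomes combined.
algal_share = algal_genes / (phaeodactylum_proteome + thalassiosira_proteome)

# Assumption: the 17% figure is 1,800 green genes over the ~10.5k proteome.
green_genes = 1_800
green_share = green_genes / phaeodactylum_proteome

print(round(100 * algal_share, 1))  # 22.5
print(round(100 * green_share, 1))  # 17.1
```

Under these assumptions both values land close to the figures on the slides; a different denominator would shift them.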
Diatom Green Genes in the Green Lineages
Ostreococcus
Hervé Moreau
~7,000 genes
Chlamydomonas
Linda Amaral-Zettler
~15,000 genes
Prasinophytes Core Chlorophytes
• 75%: shared with prasinophytes
• 40%: prasinophytes are the closest green neighbor
• 25%: exclusively shared with prasinophytes
Plastids in Chromalveolates
Classic Hypothesis
Plastids in Chromalveolates
Proposed model
Lineage           Organism                    Proteins  Dataset type
Chlorophyceae     Chlamydomonas reinhardtii   14,332    Genomic
Chlorophyceae     Volvox carteri              14,328    Genomic
Embryophyta       Arabidopsis thaliana        32,779    Genomic
Embryophyta       Brachypodium distachyon     24,100    Genomic
Embryophyta       Medicago truncatula         44,823    Genomic
Embryophyta       Oryza sativa                28,418    Genomic
Embryophyta       Physcomitrella patens       35,544    Genomic
Embryophyta       Ricinus communis            31,214    Genomic
Embryophyta       Selaginella moellendorffii  33,146    Genomic
Embryophyta       Vitis vinifera              23,349    Genomic
Embryophyta       Zea mays                    22,230    Genomic
Mamiellophyceae   Micromonas pusilla          10,244    Genomic
Mamiellophyceae   Micromonas sp.              10,113    Genomic
Mamiellophyceae   Ostreococcus lucimarinus    7,403     Genomic
Mamiellophyceae   Ostreococcus sp.            7,492     Genomic
Mamiellophyceae   Ostreococcus tauri          7,965     Genomic
Trebouxiophyceae  Chlorella variabilis        9,829     Genomic
Trebouxiophyceae  Coccomyxa subellipsoidea    19,425    Genomic
Lineage        Organism                 Proteins  Dataset type
Bangiophyceae  Cyanidioschyzon merolae  5,085     Genomic
2009
Lineage              Organism                   Proteins  Dataset type
Bangiophyceae        Cyanidioschyzon merolae    5,085     Genomic
Bangiophyceae        Galdieria sulphuraria      6,796     Genomic
Bangiophyceae        Porphyra purpurea          223,550   Transcriptomic
Bangiophyceae        Porphyra umbilicalis       123,661   Transcriptomic
Bangiophyceae        Porphyridium purpureum     23,277    Transcriptomic
Compsopogonophyceae  Boldia erythrosiphon       80,535    Transcriptomic
Compsopogonophyceae  Rhodochaete parvula        58,506    Transcriptomic
Florideophyceae      Calliarthron tuberculosum  23,365    Transcriptomic
Florideophyceae      Chondrus crispus           9,671     Genomic
2013: +8 red data sets
Phylogenomics of diatoms versus ToL
Vertical transfer within the SAR clade?
Red and green affiliations in diatoms
Red and green affiliations in chromalveolates
• In the different chromalveolate lineages, the ratio ≈ 2 reds : 3 greens
• The major green neighbor lineage is the prasinophytes
• Distribution of red and green genes is similar across chromalveolates with red or green plastids.
> 10 genomes → ≈ 100k proteins → phylogenomics → ≈ 100k ML trees
[Figure: bar chart of the number of clades (axis 0 to 1000) per shared protein family set (S, A, H, C, SA, SH, SC, AH, AC, HC, SAH, SAC, SHC, AHC, SAHC), by phylogenetic affiliation (Rhodophyta, Viridiplantae); panel labels: -ve, +ve. Lineages: S: Stramenopiles, A: Alveolates, H: Haptophytes, C: Cryptophytes]
Shared and lineage-specific red and green genes
[Figure: two four-set Venn diagrams over S, A, H, C; region counts in extraction order]
Diagram 1: 297 70 249 434 218 232 24 50 954 289170 326 97 142 344
Diagram 2: 94 20 129 242 175 111 6 19 595 14765 145 27 74 222
S: Stramenopiles A: Alveolates H: Haptophytes C: Cryptophytes
p-value << 0.001
Phylogenetic placement on a reference tree: phytoene dehydrogenase
The four diatoms
Plastids in Chromalveolates
Proposed model
Loss of red plastid
• If the shared red genes were transferred through endosymbiosis, then why not the more abundant green genes?
• There is no compelling reason to reject the hypothesis of a cryptic green plastid in the ancestor of the chromalveolates.
• These two endosymbioses (red and green) supplied the chromalveolates with the genetic potential to become the most successful marine primary producers and protist supergroup on our planet.
• Next: are there outstanding metabolic trends in terms of the red and green composition? Exclusively red or green pathways? Chimeric pathways?
Summary
[Figure: the two four-set Venn diagrams of shared and lineage-specific red and green genes (S, A, H, C), shown again]
background image: http://deepbluehome.blogspot.com/2011/01/psychedelic-diatoms.html