Jonathan Eisen slides for #HMP2010
![Page 1: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/1.jpg)
A phylogeny driven genomic encyclopedia of bacteria and archaea
Jonathan A. Eisen
UC Davis
Talk for HMP2010
September 2, 2010
![Page 2: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/2.jpg)
![Page 3: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/3.jpg)
Social Networking in Science
![Page 4: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/4.jpg)
Bacteria evolve
![Page 5: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/5.jpg)
Progress in Genome Sequencing
From http://genomesonline.org
![Page 6: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/6.jpg)
Progress in Genome Sequencing
From http://genomesonline.org
![Page 7: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/7.jpg)
Progress in Genome Sequencing
From http://genomesonline.org
![Page 8: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/8.jpg)
Way Back Machine - 2002
![Page 9: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/9.jpg)
Way Back Machine - 2002
454
![Page 10: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/10.jpg)
Way Back Machine - 2002
454
![Page 11: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/11.jpg)
Way Back Machine - 2002
454
Illumina
![Page 12: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/12.jpg)
Way Back Machine - 2002
454
Illumina
![Page 13: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/13.jpg)
Way Back Machine - 2002
454
Illumina
SOLiD
![Page 14: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/14.jpg)
Way Back Machine - 2002
454
Illumina
SOLiD
![Page 15: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/15.jpg)
Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Proteobacteria
Chlorobi
Firmicutes
Fusobacteria
Actinobacteria
Cyanobacteria
Chlamydia
Spirochaetes
Deinococcus-Thermus
Aquificae
Thermotogae
TM6
OS-K
Termite Group
OP8
Marine Group A
WS3
OP9
NKB19
OP3
OP10
TM7
OP1
OP11
Nitrospira
Synergistes
Deferribacteres
Thermodesulfobacteria
Chrysiogenetes
Thermomicrobia
Dictyoglomus
Coprothermobacter
• At least 40 phyla of bacteria
2002
Based on Hugenholtz, 2002
![Page 16: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/16.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
Based on Hugenholtz, 2002
2002
![Page 17: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/17.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
Based on Hugenholtz, 2002
2002
![Page 18: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/18.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
Based on Hugenholtz, 2002
2002
![Page 19: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/19.jpg)
Why Increase Phylogenetic Coverage?
• Common approach within some eukaryotic groups (FGP, NHGRI, etc)
• Many successful small projects to fill in bacterial or archaeal gaps
• Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature
• Many potential benefits
![Page 20: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/20.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution I: sequence more phyla
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
![Page 21: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/21.jpg)
![Page 22: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/22.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Still highly biased in terms of the tree
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
![Page 23: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/23.jpg)
Major Lineages of Actinobacteria
2.5.1 Acidimicrobidae
2.5.1.1 Unclassified
2.5.1.2 "Microthrixineae"
2.5.1.3 Acidimicrobineae
2.5.1.4 BD2-10
2.5.1.5 EB1017
2.5.2 Actinobacteridae
2.5.2.1 Unclassified
2.5.2.10 Ellin306/WR160
2.5.2.11 Ellin5012
2.5.2.12 Ellin5034
2.5.2.13 Frankineae
2.5.2.14 Glycomyces
2.5.2.15 Intrasporangiaceae
2.5.2.16 Kineosporiaceae
2.5.2.17 Microbacteriaceae
2.5.2.18 Micrococcaceae
2.5.2.19 Micromonosporaceae
2.5.2.2 Actinomyces
2.5.2.20 Propionibacterineae
2.5.2.21 Pseudonocardiaceae
2.5.2.22 Streptomycineae
2.5.2.23 Streptosporangineae
2.5.2.3 Actinomycineae
2.5.2.4 Actinosynnemataceae
2.5.2.5 Bifidobacteriaceae
2.5.2.6 Brevibacteriaceae
2.5.2.7 Cellulomonadaceae
2.5.2.8 Corynebacterineae
2.5.2.9 Dermabacteraceae
2.5.3 Coriobacteridae
2.5.3.1 Unclassified
2.5.3.2 Atopobiales
2.5.3.3 Coriobacteriales
2.5.3.4 Eggerthellales
2.5.4 OPB41
2.5.5 PK1
2.5.6 Rubrobacteridae
2.5.6.1 Unclassified
2.5.6.2 "Thermoleiphilaceae"
2.5.6.3 MC47
2.5.6.4 Rubrobacteraceae
2.5 Actinobacteria
2.5.1 Acidimicrobidae
2.5.1.1 Unclassified
2.5.1.2 "Microthrixineae"
2.5.1.3 Acidimicrobineae
2.5.1.3.1 Unclassified
2.5.1.3.2 Acidimicrobiaceae
2.5.1.4 BD2-10
2.5.1.5 EB1017
2.5.2 Actinobacteridae
2.5.2.1 Unclassified
2.5.2.10 Ellin306/WR160
2.5.2.11 Ellin5012
2.5.2.12 Ellin5034
2.5.2.13 Frankineae
2.5.2.13.1 Unclassified
2.5.2.13.2 Acidothermaceae
2.5.2.13.3 Ellin6090
2.5.2.13.4 Frankiaceae
2.5.2.13.5 Geodermatophilaceae
2.5.2.13.6 Microsphaeraceae
2.5.2.13.7 Sporichthyaceae
2.5.2.14 Glycomyces
2.5.2.15 Intrasporangiaceae
2.5.2.15.1 Unclassified
2.5.2.15.2 Dermacoccus
2.5.2.15.3 Intrasporangiaceae
2.5.2.16 Kineosporiaceae
2.5.2.17 Microbacteriaceae
2.5.2.17.1 Unclassified
2.5.2.17.2 Agrococcus
2.5.2.17.3 Agromyces
2.5.2.18 Micrococcaceae
2.5.2.19 Micromonosporaceae
2.5.2.2 Actinomyces
2.5.2.20 Propionibacterineae
2.5.2.20.1 Unclassified
2.5.2.20.2 Kribbella
2.5.2.20.3 Nocardioidaceae
2.5.2.20.4 Propionibacteriaceae
2.5.2.21 Pseudonocardiaceae
2.5.2.22 Streptomycineae
2.5.2.22.1 Unclassified
2.5.2.22.2 Kitasatospora
2.5.2.22.3 Streptacidiphilus
2.5.2.23 Streptosporangineae
2.5.2.23.1 Unclassified
2.5.2.23.2 Ellin5129
2.5.2.23.3 Nocardiopsaceae
2.5.2.23.4 Streptosporangiaceae
2.5.2.23.5 Thermomonosporaceae
2.5.2.3 Actinomycineae
2.5.2.4 Actinosynnemataceae
2.5.2.5 Bifidobacteriaceae
2.5.2.6 Brevibacteriaceae
2.5.2.7 Cellulomonadaceae
2.5.2.8 Corynebacterineae
2.5.2.8.1 Unclassified
2.5.2.8.2 Corynebacteriaceae
2.5.2.8.3 Dietziaceae
2.5.2.8.4 Gordoniaceae
2.5.2.8.5 Mycobacteriaceae
2.5.2.8.6 Rhodococcus
2.5.2.8.7 Rhodococcus
2.5.2.8.8 Rhodococcus
2.5.2.9 Dermabacteraceae
2.5.2.9.1 Unclassified
2.5.2.9.2 Brachybacterium
2.5.2.9.3 Dermabacter
2.5.3 Coriobacteridae
2.5.3.1 Unclassified
2.5.3.2 Atopobiales
2.5.3.3 Coriobacteriales
2.5.3.4 Eggerthellales
2.5.4 OPB41
2.5.5 PK1
2.5.6 Rubrobacteridae
2.5.6.1 Unclassified
2.5.6.2 "Thermoleiphilaceae"
2.5.6.2.1 Unclassified
2.5.6.2.2 Conexibacter
2.5.6.2.3 XGE514
2.5.6.3 MC47
2.5.6.4 Rubrobacteraceae
![Page 24: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/24.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Archaea
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
![Page 25: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/25.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Eukaryotes
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
![Page 26: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/26.jpg)
[Figure: tree of bacterial phyla, same labels as on page 15]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Viruses
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
![Page 27: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/27.jpg)
Progress in Genome Sequencing
From http://genomesonline.org
![Page 28: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/28.jpg)
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution: Really Fill in the Tree
• GEBA
• A genomic encyclopedia of bacteria and archaea
Eisen & Ward, PIs
[Figure: tree of bacterial phyla, same labels as on page 15]
![Page 29: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/29.jpg)
GEBA Pilot Project Overview
• Identify major branches in rRNA tree for which no genomes are available
• Identify branches with a cultured representative in DSMZ
• DSMZ grew > 200 of these and prepped DNA
• Sequence and finish 100 (covering breadth of bacterial/archaeal diversity)
• Annotate, analyze, release data
• Assess benefits of tree guided sequencing
• 1st paper: Wu et al. in Nature, Dec 2009
![Page 30: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/30.jpg)
GEBA Pilot Project: Components
• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al)
• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
• Adopt a microbe education project (Cheryl Kerfeld)
• Outreach (David Gilbert)
• $$$ (DOE, DSMZ, GBMF)
![Page 31: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/31.jpg)
GEBA Lesson 1
rRNA Tree is Useful for Identifying Phylogenetically Novel Organisms
![Page 32: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/32.jpg)
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Archaea
Eukaryotes
Bacteria
![Page 33: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/33.jpg)
Network of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Archaea
Eukaryotes
Bacteria
![Page 34: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/34.jpg)
“Whole Genome” Tree w/ AMPHORA
http://bobcat.genomecenter.ucdavis.edu/AMPHORA/
See Wu and Eisen, Genome Biology 2008 9: R151
http://itol.embl.de/
Analogous to method of Ciccarelli et al.
![Page 35: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/35.jpg)
Compare PD in rRNA and WGT
![Page 36: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/36.jpg)
PD of rRNA, Genome Trees Similar
From Wu et al. 2009 Nature 462, 1056-1060
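Phylogenetic diversity (PD) of a set of taxa is commonly computed as the total branch length of the subtree connecting those taxa to the root. A minimal Python sketch of that calculation follows. The toy tree, node names and branch lengths are hypothetical and are not data from the talk or from Wu et al.; the same function could be applied to an rRNA tree and a whole-genome tree to compare the two.

```python
def phylogenetic_diversity(parent, taxa):
    """Rooted PD: sum of branch lengths on the union of root-to-tip paths.

    parent maps each non-root node to (parent_node, branch_length).
    """
    seen = set()
    total = 0.0
    for tip in taxa:
        node = tip
        # Walk toward the root, counting each branch only once.
        while node in parent and node not in seen:
            seen.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical toy tree: ((A:1, B:1)X:2, C:3)root
toy_tree = {
    "A": ("X", 1.0),
    "B": ("X", 1.0),
    "X": ("root", 2.0),
    "C": ("root", 3.0),
}

pd_ab = phylogenetic_diversity(toy_tree, ["A", "B"])
pd_all = phylogenetic_diversity(toy_tree, ["A", "B", "C"])
print(pd_ab, pd_all)
```

Adding taxon C raises PD by the full length of its branch because C shares no branches with A and B, which is the intuition behind picking genomes from unsampled parts of the tree.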
![Page 37: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/37.jpg)
GEBA Lesson 2
Phylogeny-driven genome selection helps discover new genetic diversity
![Page 38: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/38.jpg)
Network of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Archaea
Eukaryotes
Bacteria
![Page 39: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/39.jpg)
Protein Family Rarefaction Curves
• Take data set of multiple complete genomes
• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
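The three steps above can be sketched in Python. This is only an illustration under stated assumptions: the clustering step (MCL) is assumed to have been run already, so each genome is represented as a set of protein family identifiers, and the genome and family names below are made up. The curve is averaged over random orderings of the genomes, which is one common way to draw a rarefaction curve and not necessarily the procedure used for the talk.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families after adding 1..N genomes.

    genome_families maps a genome name to the set of family IDs it contains
    (for example, cluster IDs produced by a prior MCL run).
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0.0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)  # random order of genome addition
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i] += len(seen)
    return [t / n_permutations for t in totals]


# Hypothetical input: three genomes and their protein families.
toy = {
    "genome1": {"fam1", "fam2"},
    "genome2": {"fam2", "fam3"},
    "genome3": {"fam1", "fam2", "fam3", "fam4"},
}

curve = rarefaction_curve(toy)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, round(n_families, 2))
```

A curve that keeps climbing as genomes are added suggests that the sampled genomes are still contributing new protein families.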
![Page 40: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/40.jpg)
![Page 41: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/41.jpg)
![Page 42: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/42.jpg)
![Page 43: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/43.jpg)
![Page 44: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/44.jpg)
![Page 45: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/45.jpg)
Synapomorphies exist
![Page 46: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/46.jpg)
Phylogenetic Distribution Novelty: Bacterial Actin Related Protein
Haliangium ochraceum DSM 14365
Patrik D’haeseleer, Adam Zemla, Victor Kunin
See also Guljamow et al. 2007 Current Biology.
![Page 47: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/47.jpg)
GEBA Lesson 3
Phylogeny-driven genome selection improves genome annotation
![Page 48: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/48.jpg)
Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling
• Better definition of protein family sequence “patterns”
• Greatly improves “comparative” and “evolutionary” based predictions
• Conversion of hypothetical into conserved hypotheticals
• Linking distantly related members of protein families
• Improved non-homology prediction
Kostas Mavrommatis
Natalia Ivanova
Thanos Lykidis
Nikos Kyrpides
Iain Anderson
![Page 49: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/49.jpg)
GEBA Lesson 4
Metadata and individual genome papers important
![Page 50: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/50.jpg)
SIGS http://standardsingenomics.org/
![Page 51: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/51.jpg)
GEBA Lesson 5
Phylogeny-driven genome selection improves analysis of metagenome data
![Page 52: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/52.jpg)
Who is out there?
![Page 53: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/53.jpg)
rRNA phylotyping from metagenomics
Venter et al., 2004
![Page 54: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/54.jpg)
Shotgun Sequencing Allows Use of Alternative Anchors (e.g., RecA)
Venter et al., 2004
![Page 55: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/55.jpg)
[Figure: Sargasso Phylotypes. Bar chart of weighted % of clones (0 to 0.5) by major phylogenetic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota); one series per marker: EFG, EFTu, HSP70, RecA, RpoB, rRNA]
Shotgun Sequencing Allows Use of Other Markers
Venter et al., 2004
![Page 56: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/56.jpg)
ABCDEFG
TUVWXYZ
Binning challenge
![Page 57: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/57.jpg)
ABCDEFG
TUVWXYZ
Binning challenge
Best binning method: reference genomes
![Page 58: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/58.jpg)
Reference Genomes Coming from Select Environment
![Page 59: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/59.jpg)
ABCDEFG
TUVWXYZ
Binning challenge
No reference genome? What do you do?
![Page 60: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/60.jpg)
ABCDEFG
TUVWXYZ
Binning challenge
No reference genome? What do you do?
Phylogeny ....
![Page 61: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/61.jpg)
Phylogenetic Binning Using AMPHORA
[Figure: bar chart, y-axis 0 to 0.7, by phylogenetic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified Proteobacteria, Cyanobacteria, Chlamydiae, Acidobacteria, Bacteroidetes, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified Bacteria); one series per marker gene: dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf]
AMPHORA - each read on its own tree
![Page 62: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/62.jpg)
Phylogenetic Binning Using AMPHORA
[Figure: bar chart by phylogenetic group, one series per AMPHORA marker gene; same chart as on page 61]
AMPHORA - each read on its own tree
Limited in past by poor genomic sampling
![Page 63: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/63.jpg)
Metagenomic Analysis Improves w/ Phylogenetic Sampling
• Small but real improvements in
– Gene identification / confirmation
– Functional prediction
– Binning
– Phylogenetic classification
![Page 64: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/64.jpg)
Metagenomic Analysis Improves w/ Phylogenetic Sampling
• Small but real improvements in
– Gene identification / confirmation
– Functional prediction
– Binning
– Phylogenetic classification
• But not a lot ...
![Page 65: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/65.jpg)
GEBA Future 1
Need to adapt genomic and metagenomic methods to make use of GEBA data
![Page 66: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/66.jpg)
Phylogenetic Binning Using AMPHORA
[Figure: bar chart by phylogenetic group, one series per AMPHORA marker gene; same chart as on page 61]
AMPHORA - each read on its own tree
Improves with better phylogenetic methods
![Page 67: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/67.jpg)
Improving Phylogeny for Metagenomic Reads
• Examples using reference trees
– AMPHORA (Wu and Eisen)
– PPlacer (Erik Matsen)
– FastTree (Morgan Price)
• Variants
– Use concatenated alignment of markers not just individual genes (Steven Kembel)
– Apply to OTU identification not just classification (Thomas Sharpton)
– CoBinning: look for linkage among fragments/genes (Aaron Darling)
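The first variant, a concatenated alignment of markers, can be illustrated with a short Python sketch. It assumes each marker has already been aligned separately and is given as a mapping from taxon name to aligned sequence; taxa missing from a marker are padded with gaps. The marker names, taxon names and sequences are invented for illustration, and this is not the code used by any of the tools named above.

```python
def concatenate_alignments(alignments):
    """Join per-marker alignments into one alignment per taxon.

    alignments is a list of dicts mapping taxon -> aligned sequence.
    A taxon absent from a marker gets a run of gaps of that marker's width.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical aligned fragments for two markers.
rpoB = {"taxon1": "MKV-L", "taxon2": "MRVAL"}
rplA = {"taxon1": "AG-T", "taxon3": "AGCT"}

supermatrix = concatenate_alignments([rpoB, rplA])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

For metagenomic reads the gap padding matters, because a single read usually covers only one marker and contributes gaps for all the others.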
![Page 68: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/68.jpg)
Phylogenetic Binning Using AMPHORA
[Figure: bar chart by phylogenetic group, one series per AMPHORA marker gene; same chart as on page 61]
AMPHORA - each read on its own tree
Improves with more gene families
![Page 69: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/69.jpg)
Keep only the families with:
Universality * Evenness * monophyly >= 90*90*90

| Phylogenetic group | Genome Number | Gene Number | Marker Candidates |
| --- | ---: | ---: | ---: |
| Archaea | 62 | 145415 | 102 |
| Actinobacteria | 63 | 267783 | 136 |
| Alphaproteobacteria | 94 | 347287 | 142 |
| Betaproteobacteria | 56 | 266362 | 294 |
| Gammaproteobacteria | 126 | 483632 | 141 |
| Deltaproteobacteria | 25 | 102115 | 44 |
| Epsilonproteobacteria | 18 | 33416 | 446 |
| Bacteroides | 25 | 71531 | 179 |
| Chlamydiae | 13 | 13823 | 561 |
| Chloroflexi | 10 | 33577 | 140 |
| Cyanobacteria | 36 | 124080 | 532 |
| Firmicutes | 106 | 312309 | 80 |
| Spirochaetes | 18 | 38832 | 72 |
| Thermi | 5 | 14160 | 727 |
| Thermotogae | 9 | 17037 | 646 |
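The filter stated on this slide can be written as a small Python sketch. It reads the criterion literally, as a threshold on the product of the three scores, and assumes each score is a percentage between 0 and 100; the slide does not say how the scores are computed, so the `FamilyScores` record and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FamilyScores:
    name: str
    universality: float  # assumed to be a percentage, 0-100
    evenness: float      # assumed to be a percentage, 0-100
    monophyly: float     # assumed to be a percentage, 0-100


THRESHOLD = 90 * 90 * 90  # 729000, as written on the slide


def keep_family(family):
    """True if the product of the three scores meets the threshold."""
    product = family.universality * family.evenness * family.monophyly
    return product >= THRESHOLD


# Hypothetical candidate families.
families = [
    FamilyScores("famA", 100, 95, 92),
    FamilyScores("famB", 100, 100, 70),
    FamilyScores("famC", 90, 90, 90),
]

kept = [f.name for f in families if keep_family(f)]
print(kept)
```

Because the test is on the product, a family can score below 90 on one criterion and still pass if the other two are high enough; requiring each score to be at least 90 separately would be a stricter rule.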
![Page 70: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/70.jpg)
Phylogenetic Binning Using AMPHORA
[Figure: bar chart by phylogenetic group, one series per AMPHORA marker gene; same chart as on page 61]
AMPHORA - each read on its own tree
Improves with rebuilding gene family models
![Page 71: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/71.jpg)
Other Ways to Make Better Use of the Data
• Rebuild protein family models
• Experiments from across the tree needed
• Need better phylogenies, including HGT
• Improved tools for using distantly related genomes in metagenomic analysis
• Better recording and sharing of metadata about organisms
![Page 72: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/72.jpg)
GEBA Future 2
The dark matter of the biological universe
![Page 73: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/73.jpg)
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Archaea
Eukaryotes
Bacteria
![Page 74: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/74.jpg)
Phylogenetic Diversity: Sequenced Bacteria & Archaea
From Wu et al. 2009
![Page 75: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/75.jpg)
Phylogenetic Diversity with GEBA
From Wu et al. 2009
![Page 76: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/76.jpg)
Phylogenetic Diversity: Isolates
From Wu et al. 2009
![Page 77: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/77.jpg)
Phylogenetic Diversity: All
From Wu et al. 2009
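The phylogenetic diversity (PD) slides above compare how much of the tree different genome sets cover. As a rough illustration only, the sketch below computes PD as the total length of the branches connecting a set of tips to the root. The toy tree, the tip names, the branch lengths, and the helper names `phylogenetic_diversity` and `pd_gain` are all assumptions made up for this example; they are not data or code from Wu et al. 2009.

```python
# Toy rooted tree: child -> (parent, branch length).
# All node names and branch lengths are invented for illustration.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 0.5),
    "C": ("n2", 2.0),
    "D": ("n2", 0.5),
    "n2": ("root", 1.0),
}


def phylogenetic_diversity(tree, tips):
    """Sum the lengths of all branches on the paths from the given tips to the root."""
    seen = set()
    total = 0.0
    for tip in tips:
        node = tip
        # Walk toward the root, counting each branch only once.
        while node in tree and node not in seen:
            parent, length = tree[node]
            seen.add(node)
            total += length
            node = parent
    return total


def pd_gain(tree, sampled, candidate):
    """PD added by sequencing one more tip on top of an already sampled set."""
    before = phylogenetic_diversity(tree, sampled)
    after = phylogenetic_diversity(tree, set(sampled) | {candidate})
    return after - before


if __name__ == "__main__":
    sampled = {"A"}
    print("gain from close relative B:", pd_gain(TREE, sampled, "B"))
    print("gain from distant lineage C:", pd_gain(TREE, sampled, "C"))
```

In this toy tree a close relative of an already sampled tip adds less PD than a tip from an unsampled lineage, which is the intuition behind choosing genomes by phylogeny.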
![Page 78: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/78.jpg)
Tree labels: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Most phyla with cultured species are sparsely sampled
• Lineages with no cultured taxa even more poorly sampled
Well sampled phyla
Poorly sampled
No cultured taxa
![Page 79: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/79.jpg)
![Page 80: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/80.jpg)
Uncultured Lineages: Technical Approaches
• Get into culture
• Enrichment cultures
• If abundant in low diversity ecosystems
• Flow sorting
• Microbeads
• Microfluidic sorting
• Single cell amplification
![Page 81: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/81.jpg)
![Page 82: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/82.jpg)
MICROBES
![Page 83: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/83.jpg)
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution: Really Fill in the Tree
• GEBA
• A genomic encyclopedia of bacteria and archaea
Eisen & Ward, PIs
Tree labels: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter
![Page 84: Jonathan Eisen slides for #HMP2010](https://reader033.vdocuments.us/reader033/viewer/2022050613/556ec305d8b42adb678b4628/html5/thumbnails/84.jpg)
GEBA Pilot Project: Components
• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)
• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
• Adopt a microbe education project (Cheryl Kerfeld)
• Outreach (David Gilbert)
• $$$ (DOE, DSMZ, GBMF)