![Page 1: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/1.jpg)
Sequencing All of Microbial Life: Challenges and Opportunities
Rob Edwards
Argonne National Laboratory
San Diego State University
![Page 2: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/2.jpg)
How much has been sequenced
[Figure: number of known sequences by year. Annotations: first bacterial genome; 100 bacterial genomes; 1,000 bacterial genomes; environmental sequencing]
![Page 3: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/3.jpg)
How much will be sequenced
[Figure labels: 100 people; all cultured bacteria; everybody in Toronto; everybody in North America; one genome from every species; most major microbial environments]
![Page 4: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/4.jpg)
Rank Abundance Curves, Papers vs Genomes
• Microbial publications vs Genomes by Family
![Page 5: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/5.jpg)
16S Abundance -- Human Intestine
![Page 6: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/6.jpg)
16S Abundance -- Upland Pasture Soil
![Page 7: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/7.jpg)
Environmental Genomics -- Wisconsin Soil
![Page 8: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/8.jpg)
Line Island Metagenomics Transect
![Page 9: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/9.jpg)
Environmental Genomics -- Whale Fall
![Page 10: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/10.jpg)
There are big gaps in sequence space
• 6,400 total taxa
• About 380 are human, animal or plant pathogens
• 360 complete prokaryotic genomes published
• 56 archaeal and 940 bacterial genomes in progress
– ~400 are pathogens
• About 5,000 prokaryotes not yet in play
– We estimate about 4,800 non-pathogen taxa
![Page 11: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/11.jpg)
The Bergey’s Manual
David H. Bergey
![Page 12: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/12.jpg)
Strain Distribution in Collections

| Region | Collection | Strains |
|---|---|---|
| US Collections / BRCs | American Type Culture Collection (ATCC) | 4,027 |
| US Collections / BRCs | USDA ARS Collection (NRRL) | 223 |
| European Collections | Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ) | 1,302 |
| European Collections | Culture Collection University of Gothenburg (CCUG) | 183 |
| European Collections | Pasteur Institute (CIP) | 170 |
| European Collections | Laboratory for Microbiology, Gent (LMG) | 101 |
| European Collections | National Collection of Industrial and Marine Bacteria | 25 |
| European Collections | French Collection of Phytopathogens (CFBP) | 15 |
| European Collections | National Collection of Type Cultures (NCTC) | 12 |
| European Collections | National Collection of Phytopathogenic Bacteria | 11 |
| Asia | Japan Collection of Microorganisms (JCM) | 185 |
| Asia | Institute of Fermentation, Osaka (IFO) | 34 |
| Asia | Korean Collection of Type Cultures (KCTC) | 28 |
| Asia | Institute of Applied Microbiology, Tokyo (IAM) | 26 |
| Asia | National Institute of Technology and Evaluation (NBRC) | 24 |
| Asia | All-Russian Collection of Microorganisms (VKM) | 13 |
![Page 13: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/13.jpg)
Estimated Sequencing Rates

| Year | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | Notes |
|---|---|---|---|---|---|---|---|---|---|
| Base pairs per dollar | 200 | 300 | 450 | 675 | 1,013 | 1,519 | 2,278 | 3,417 | 50% improvement per year |
| Bacterial genome cost ($) | 20,000 | 13,333 | 8,889 | 5,926 | 3,951 | 2,634 | 1,756 | 1,171 | ~4M bp per genome |
| Number of genomes for $5M | 250 | 375 | 563 | 844 | 1,266 | 1,898 | 2,848 | 4,271 | |
| Cumulative genomes sequenced | 250 | 625 | 1,188 | 2,031 | 3,297 | 5,195 | 8,043 | 12,314 | |
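
The rows of the rate estimate appear to follow from a few inputs stated on the slide: 200 base pairs per dollar in 2007, a 50% improvement per year, and roughly 4M bp per genome. The sketch below recomputes the rows under those inputs, assuming (from the row label) a flat $5M spent each year. The function names and the round-half-up convention are illustrative choices, not something the slide specifies.

```python
def round_half_up(x: float) -> int:
    """Round a non-negative value to the nearest integer, with .5 rounding up."""
    return int(x + 0.5)


def project_rates(start_year=2007, end_year=2014, bp_per_dollar=200.0,
                  yearly_gain=0.5, genome_bp=4_000_000, budget=5_000_000):
    """Return (year, bp per dollar, genome cost in $, genomes per budget, cumulative genomes)."""
    rows = []
    cumulative = 0.0
    for year in range(start_year, end_year + 1):
        cost = genome_bp / bp_per_dollar             # dollars per ~4M bp genome
        genomes = budget * bp_per_dollar / genome_bp  # genomes the yearly budget buys
        cumulative += genomes                         # running total, kept unrounded
        rows.append((year,
                     round_half_up(bp_per_dollar),
                     round_half_up(cost),
                     round_half_up(genomes),
                     round_half_up(cumulative)))
        bp_per_dollar *= 1 + yearly_gain              # 50% more sequence per dollar next year
    return rows


for row in project_rates():
    print(*row, sep="\t")
```

The cumulative total is accumulated from unrounded yearly values and rounded only for display, which is why it can differ by one from a sum of the rounded yearly counts.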
Target Selection
Type Culture Material
Sequencing / Assembly
Rapid Annotation (24 Hours)
Metabolic Reconstruction
Phenotype Microarrays
![Page 14: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/14.jpg)
Target Selection
http://www.sequencingbergeys.org
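
The FAQ at the end of the deck says genomes are chosen at each cycle (e.g. via 16S) to minimize the diversity gaps. One possible reading of that idea is a greedy farthest-first selection: repeatedly pick the candidate whose closest already-sequenced relative is most distant. The sketch below is only an illustration of that reading, not the project's actual procedure. The function names, the mismatch-fraction distance on pre-aligned sequences, and the toy sequences are all assumptions.

```python
from typing import Callable, Dict, List


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of differing positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def select_targets(sequenced: Dict[str, str],
                   candidates: Dict[str, str],
                   k: int,
                   dist: Callable[[str, str], float] = mismatch_fraction) -> List[str]:
    """Greedily pick up to k candidates, each time taking the one farthest from everything covered."""
    covered = list(sequenced.values())
    remaining = dict(candidates)
    picks: List[str] = []
    while remaining and len(picks) < k:
        def gap(name: str) -> float:
            # Distance to the nearest sequence already covered (infinite if none yet).
            return min((dist(remaining[name], s) for s in covered), default=float("inf"))

        # Sorting the names first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gap)
        picks.append(best)
        covered.append(remaining.pop(best))
    return picks


# Toy stand-ins for aligned 16S fragments (hypothetical data).
sequenced = {"ref": "AAAAAAAA"}
candidates = {"near": "AAAAAAAT", "mid": "AAAATTTT", "far": "TTTTTTTT"}
print(select_targets(sequenced, candidates, k=2))
```

In practice a 16S-based distance would come from a proper alignment or phylogeny rather than raw position mismatches; the greedy loop is independent of which distance function is supplied.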
![Page 15: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/15.jpg)
Microbial Idol
![Page 16: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/16.jpg)
Culturing by ATCC
More than 2,000 different media
Physical Conditions:
• Temperature (4° - 120°C)
• pH (1.0 - 11.0)
• Salt (0 - 30%)
• Light (obligate phototrophs)
• Pressure (few obligate piezophiles)
• Redox:
– Strict anaerobes
– Facultative
– Microaerobes
– Aerobes
![Page 17: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/17.jpg)
Phenotyping by Biolog
Carbon Pathways
Nitrogen Pathways
Sensitivity to Chemicals
Osmotic & Ion Effects
pH Effects
Biosynthetic Pathways
P, S, N
![Page 18: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/18.jpg)
Sequencing by JGI
FY 06 instruments: Sanger: 107; 454: 1
FY 07 instruments: Sanger: 107; 454: 2
35.4 Gb
45 Gb goal
![Page 19: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/19.jpg)
Rapid Annotation Using Subsystems Technology
http://www.nmpdr.org/anno-server/index48.cgi
• Automated process consisting of:
– Gene calling
– Initial annotation of function
– Initial metabolic reconstruction
• Process takes 1-7 hours depending on size and complexity of the genome
• ~20 genomes per day
![Page 20: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/20.jpg)
Evaluation / Viewing
![Page 21: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/21.jpg)
Feedback
Target Selection
Sequencing
Annotation
Metabolic Reconstruction
Phenotyping
![Page 22: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/22.jpg)
Status
• 100 organism pilot - GEBA underway
• Requesting funding/approval for remainder
• Target selection about to go live
![Page 23: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/23.jpg)
People
JGI: Jim Bristow, Jonathan Eisen, Phil Hugenholtz, Nikos Kyrpides, Paul Richardson, David Bruce
MSU: Jim Cole, George Garrity
U GA: Barney Whitman
UIUC: Gary Olsen
ATCC: David Emmerson, Tim Lilburn
Biolog: Stacy Montgomery, John Groat
ANL: Rick Stevens, Folker Meyer, Ross Overbeek, Veronika Vonstein
Hope: Matt DeJongh
![Page 24: Sequencing All of Microbial Life: Challenges and Opportunities](https://reader034.vdocuments.us/reader034/viewer/2022051417/568148a8550346895db5baa4/html5/thumbnails/24.jpg)
Technical Feasibility FAQ
• How many genomes would the project propose to sequence?
– About 5,000
• Who would produce the biomass needed for DNA extraction?
– Type culture centers
• Will the biomass/DNA be available for distribution?
– Yes, both the DNA and the libraries could be stored for distribution
• What throughput is needed for DNA production?
– About 300 taxa per year at the start of the project, rising to 2,000 per year by the end
• What throughput is needed for sequencing?
– 1.2 Gb/yr to 8 Gb/yr of finished sequence
• What combinations of sequencing technologies need to be employed?
– Sanger and pyrosequencing initially
• What throughput is needed for annotation?
– 24-hour turnaround from assembled sequence to initial availability
• Is it possible to have a standard set of phenotype assays given the broad spectrum of organisms and conditions?
– We are considering Biolog as a model, but it is too limited
• How would the genomes be selected and prioritized?
– At each cycle we choose genomes (e.g. via 16S) to minimize the diversity gaps
• Is it necessary to “close” the genomes?
– We think not.
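
The sequencing throughput quoted in the FAQ (1.2 to 8 Gb/yr of finished sequence) is consistent with the DNA production figures (about 300 to 2,000 taxa per year) if each genome is taken to be roughly 4M bp, the size used in the rate estimates earlier in the deck. A small check, assuming that genome size and using a hypothetical helper name:

```python
GENOME_BP = 4_000_000  # assumption: ~4M bp per genome, as in the rate estimates


def finished_gb_per_year(taxa_per_year: int, genome_bp: int = GENOME_BP) -> float:
    """Finished sequence per year in gigabases, counting each genome once."""
    return taxa_per_year * genome_bp / 1e9


for taxa in (300, 2000):
    print(f"{taxa} taxa/yr -> {finished_gb_per_year(taxa):.1f} Gb/yr finished")
```

These figures count finished sequence only; the raw read volume needed to reach that depends on coverage, which the FAQ does not state.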