beiko anl soil metagenomics presentation
Soil, lateral gene transfer, and hybrid genomes
Robert Beiko
20 October 2015
A. microbe
Lateral gene transfer
http://genome.cbs.dtu.dk/staff/dave/MScourse/Lekt_11Feb2003c.html
In the gut
Butyrate synthesis in Lachnospiraceae and other organisms
Meehan and Beiko (2014) Genome Biol Evol
Lachnospiraceae
LGT across habitats
Smillie et al. (2011) Nature
Within-site transfer rates are highest in host-associated (i.e., human) habitats
AR genes are frequently transferred BETWEEN habitats
Villegas-Torres et al. (2011) International Biodeterioration & Biodegradation
Forsberg et al. (2012) Science
Sorangium cellulosum So157-2
14.8 Mbp; 11,599 coding sequences
>1200 putative LGT acquisitions
Han et al. (2013) Sci Rep
An ecological view of genomes
Genes as individuals, Genomes as communities
Key concept mappings:
• Diversity: counts of genes / distribution across functional categories
• Community: set of genes and their interactions
• Migration: lateral gene transfer
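The diversity mapping above can be made concrete with a small sketch. It assumes each gene carries a functional-category label (the single-letter labels below are hypothetical) and uses the Shannon index over categories as one possible diversity measure; the slides do not say which index, if any, was intended.

```python
import math
from collections import Counter

def category_diversity(gene_categories):
    """Return (gene count, category count, Shannon index) for one genome.

    gene_categories: iterable of functional-category labels, one per gene.
    The Shannon index (natural log) is an assumed choice of measure.
    """
    counts = Counter(gene_categories)
    total = sum(counts.values())
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    return total, len(counts), shannon

# Hypothetical genome: one category label per gene.
genome = ["C", "C", "E", "P", "P", "P", "L", "L"]
n_genes, n_categories, h = category_diversity(genome)
print(n_genes, n_categories, round(h, 3))
```

Treating the genome as the "community" and the genes as "individuals" is what lets an ordinary ecological index be reused unchanged here.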
Metacommunity Leibold et al., Ecol Lett, 2004
A set of local communities that are linked by dispersal
[Figure: Species A and Species B across three habitats; pattern and intensity of migration for species A]
Genome Metacommunity Hypothesis
• Since…
  • Genes are agents whose trajectories are not bound to their host organisms
  • Genes can evolve and take on new functional roles in concert with other genes
• A genome can be viewed as a community of genes
• Related sets of genomes comprise a metacommunity of genes
Genome Metacommunities Boon et al., FEMS Microbiol Rev, 2014
A set of genomes that are linked by LGT
[Figure: Gene A and Gene B across three genomes; pattern and intensity of LGT for gene A]
Genome Metacommunities Boon et al., FEMS Microbiol Rev, 2014
Related to the pan-genome, but not restricted to specific taxonomic groups
Why is a given gene present in a given genome at a given time?
How are functional roles partitioned across a community?
Soil thinking
How important is LGT in soil communities?
Does it make sense to think of gene metacommunities in the soil context?
Lots of LGT: YES
Minimal LGT: NO
The procedure
• In the absence of a coherent set of known genomes from a given habitat…
1. Identify an interesting sample
2. Select genomes with very high marker-gene (i.e., 16S) similarity to sequences in the sample (gOTUs)
3. Mine genomes for evidence of LGT, examine patterns of connectivity
Conclusions
• Positive relationship between abundance, diversity and pH
• Specific relationships between different bacterial (notably Acidobacteria) and fungal groups vs. pH
• Fungal OTUs appear to tolerate wider pH ranges
(1)
Chosen sample: http://metagenomics.anl.gov/?page=MetagenomeOverview&metagenome=4455674.3#org_ref (pH = 4.1)
Meet the Sample (MG-RAST)
1277 rRNA gene sequences
Meet the Sample (Matching genomes)
99% 16S identity (e-value < 1e-20):
1211 – No match
Bradyrhizobiaceae: 61
Pseudomonas: 2
Nocardioides: 1
Acidithiobacillus: 1
Cyanobium: 1
Total: 18 genomes covering 8 genera
97% 16S identity:
1100 – No match
Bradyrhizobiaceae: 77
Pseudoxanthomonas / Cycloclasticus: 25
Acidobacteria: 20
Other Proteobacteria: 48
Other: 10
Total: 114 genomes covering 74 genera
114 genomes covering 74 genera
1277 rRNA gene sequences
(1)
gOTUs
16S sequence from sample
Rhodopseudomonas palustris TIE 1
Rhodopseudomonas palustris DX 1
Rhodopseudomonas palustris CGA009
Bradyrhizobium japonicum USDA 6
99% 16S identity
97% 16S identity
141824_31298
Rhodopseudomonas palustris HaA2
Rhodopseudomonas palustris BisA53
Bradyrhizobium BTAi1
Nitrobacter winogradskyi
Oligotropha carboxidovorans
Agromonas oligotrophica
Bradyrhizobium ORS278
Weird gOTUs
141824_229613
Bordetella pertussis
Bordetella bronchiseptica
Bordetella parapertussis
Gross et al., 2008
Homology search
• Compare proxy genomes against nr database
• Identify interesting patterns:
  • Unusual best matches (e.g., best nonself match is to a completely different group)
  • Patchy distributions, phylogenetic trees
  • Linked sets of genes: co-transfer?
  • Implicated biological processes?
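The "best nonself match" idea above can be sketched as follows. This is a minimal illustration under stated assumptions: hits are taken to be already reduced to (query, taxonomic group of the subject, bit score) tuples, and the function name, group labels and scores are hypothetical rather than taken from the talk.

```python
from collections import Counter

def best_nonself_matches(hits, self_group):
    """For each query protein, keep the top-scoring hit outside self_group.

    hits: iterable of (query_id, subject_group, bit_score)
    Returns {query_id: (subject_group, bit_score)}; queries whose hits are
    all within self_group are absent from the result.
    """
    best = {}
    for query, group, score in hits:
        if group == self_group:
            continue  # skip matches to the query's own group
        if query not in best or score > best[query][1]:
            best[query] = (group, score)
    return best

# Hypothetical hits for three proteins from a genome in "self_genus".
hits = [
    ("p1", "self_genus", 900.0),
    ("p1", "genus_B", 410.0),
    ("p1", "genus_C", 455.0),
    ("p2", "self_genus", 300.0),
    ("p3", "genus_D", 120.0),
]
result = best_nonself_matches(hits, "self_genus")
partner_counts = Counter(group for group, _ in result.values())
print(result)
print(partner_counts)
```

Tallying the partner groups, as in the last lines, is one way to see whether unexpected groups show up repeatedly as the closest nonself relatives.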
Acidithiobacillus ferrooxidans
• A refugee from genus Thiobacillus (a group shattered by 16S rRNA gene sequencing)
• Loves long walks on the beach, pH < 2.0, oxidizes iron, sulphur, thiosulphate
• Also loves to share genes
https://microbewiki.kenyon.edu/index.php/Acidithiobacillus_ferrooxidans
Beiko (2011) Biol Direct
504 gene trees in which A. ferrooxidans has a unique genus as partner
Not shown: 795 genes w/multiple partners
Also not shown: 333 other trees with less frequent, unique partner genera
Split by 16S; reunited by genome sequencing?
Genome 1 – Acidithiobacillus ferrivorans (renaming of A. ferrooxidans)
3093 predicted proteins / 3035 with homology matches
Observed / Predicted capabilities:
• Facultatively anaerobic
• Psychrotolerant
• Optimal pH = 2.5
• Oxidation of iron and inorganic sulfur
• Carbon fixation, nitrate reduction
• Trehalose synthesis
• “Bioleaching”
Liljeqvist et al. (2011) J Bacteriol
Genome 1 – Acidithiobacillus ferrivorans
Best nonself match is to… (273 non-Acidithiobacillus)
Mobile element signatures dominate
• 14 x restriction system-associated
• 8 x transposase
• 8 x transcriptional regulators (incl. CopG, TetR)
• Other resistance (LacZ, bleomycin, …)
• Integrase, reverse transcriptase, toxin/antitoxin, bacteriocin, …
• Nitrate reductase & related
• >90 unknown
1877 found in other Acidithiobacillus + other genera
Best non-Acidithiobacillus match is to…
(only 11 Acidobacteria!)
https://www.jasondavies.com/wordcloud/#
Acidobacterial connections
• Short-chain dehydrogenase/reductase SDR
• HNH endonuclease
• Glycoside hydrolase family 8 (x3)
• RES domain protein
• Transposase (x5)
Phylogenetic profiles
# of similar genes (e-value < 1e-50)
Min 30 connections
[Figure: profile network; groups shown: Proteobacteria, Actinobacteria, Cyanobacteria, Planctomycetes, Acidobacteria, Bacteroidetes, Acidithiobacillus]
Key observations
• Connections to many other groups, mostly Proteobacteria (not surprising)
• No between-group connections outside Proteobacteria at this threshold
• Acidithiobacillus as hub rather than part of gene-exchange community?
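One reading of the thresholded profile network described above is sketched here: each gene has a profile (the set of taxonomic groups containing a similar gene), and two groups are connected when they co-occur in at least a minimum number of profiles. The function name and toy data are hypothetical, and whether the talk counted connections exactly this way is an assumption; only the default threshold of 30 comes from the slides.

```python
from collections import Counter
from itertools import combinations

def profile_edges(profiles, min_connections=30):
    """Count genes shared by each pair of groups; keep well-supported pairs.

    profiles: {gene_id: set of taxonomic groups with a similar gene}
    Returns {(group_a, group_b): shared_gene_count} for counts at or above
    min_connections, with each pair in alphabetical order.
    """
    pair_counts = Counter()
    for groups in profiles.values():
        for pair in combinations(sorted(groups), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_connections}

# Toy profiles (made up); the threshold is lowered to 2 to suit the tiny input.
profiles = {
    "g1": {"A", "B"},
    "g2": {"A", "B", "C"},
    "g3": {"A", "C"},
    "g4": {"B"},
}
print(profile_edges(profiles, min_connections=2))
```

Under this reading, a "hub" is a group that appears in many retained pairs while its partners are not connected to each other.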
Mutual information-based network (do groups co-occur > random?)
Acidithiobacillus
Gammaproteobacteria
Alpha/Betaproteobacteria
Key observations
• Connections mostly predictable by phylogeny
• Again, no interesting partners outside of Proteobacteria
• However, many connections between Alpha/Betaproteobacteria
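The question behind the mutual information-based network (do groups co-occur more than random?) can be sketched as below. This assumes each group is represented by a 0/1 presence vector across the query genes and uses a simple label-shuffling test for the "more than random" part; the function names, the permutation scheme and the toy vectors are illustrative assumptions, not the method used in the talk.

```python
import math
import random

def mutual_information(x, y):
    """Mutual information (bits) between two equal-length 0/1 vectors."""
    n = len(x)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for i, j in zip(x, y) if i == a and j == b) / n
            p_a = sum(1 for i in x if i == a) / n
            p_b = sum(1 for j in y if j == b) / n
            if p_ab > 0:
                mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Share of shuffles of y whose MI with x reaches the observed MI."""
    rng = random.Random(seed)
    observed = mutual_information(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy presence/absence vectors over 20 genes (made up).
group_1 = [1] * 10 + [0] * 10
group_2 = [1] * 10 + [0] * 10   # always co-occurs with group_1
group_3 = [1, 0] * 10           # independent of group_1
print(mutual_information(group_1, group_2), mutual_information(group_1, group_3))
print(permutation_p_value(group_1, group_2))
```

Pairs with high mutual information and a small permutation p-value would be drawn as edges; pairs explained by shared ancestry alone would still need to be separated out by other means.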
Phosphate ABC transporters
gi 343775109 periplasmic (e-value < 1e-100)
gi 343775110 inner membrane subunit PstC (e-value < 1e-50)
gi 343775111 inner membrane subunit PstA (e-value < 1e-50)
Distribution:
• Acidithiobacillus
• Acidobacterium
• Alpha/Beta/Gamma
• Actinobacteria
• Firmicutes
Recurrent grouping of
Acidithiobacillus (Gamma)
Acidobacterium
Thiobacillus (Beta)
Defluviimonas (Alpha)
Salinisphaera (Gamma)
Genome 2 – Terriglobus roseus
Eichorst et al. (2007) IJSEM
• “Group 1” acidobacterium
• Preferred pH: ~6
• Aerobic
• Catalase, carotenoids for defense against reactive oxygen
• Oligotrophic; can grow on a wide range of carbon sources
• 4245 protein-coding genes (2735 with nr matches, 558 species-specific)
Rousk et al. Eichorst et al.
Best nonself matches (183 non-Acidobacteria)
Multidrug resistance / cation efflux / prophage
Best matches outside Terriglobus
Phylogenetic profiles
Profiles are wider and more diverse for Terriglobus than for Acidithiobacillus
LPS O-antigen biosynthesis
gi 390412425 CDP-glucose 4,6-dehydratase
gi 390412426 glucose-1-phosphate cytidylyltransferase
gi 390412427 “LPS biosynthesis protein”
Distribution:
• Acidithiobacillus
• Acidobacteria
• Other proteobacteria
• Other
Flavobacteria
Spirochaetes
Spirochaetes
Cyanobacteria
Contrasting Acidithiobacillus vs Terriglobus relationships:
Same partners, different dance
Compare profiles vs Streptomycetaceae (five strains found in sample gOTU)
Acidithiobacillus Common Terriglobus
Polyphosphate kinase
glucose-6-phosphate 1-dehydrogenase
Carbon monoxide dehydrogenase
More glycolytic enzymes
Heavy-metal resistance / export
Multidrug resistance
Ammonium transporter
Catalase / peroxidase
Exopolysaccharide
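The three-way split above (Acidithiobacillus / Common / Terriglobus) amounts to a set comparison of the functions each genome shares with the same partner group. A minimal sketch, with a hypothetical function name and placeholder labels that deliberately do not reproduce the slide's column assignments:

```python
def partition_shared_functions(functions_a, functions_b):
    """Split functions shared with a partner group into A-only, common, B-only.

    functions_a, functions_b: iterables of function labels that genome A and
    genome B, respectively, share with the same partner group.
    """
    a, b = set(functions_a), set(functions_b)
    return sorted(a - b), sorted(a & b), sorted(b - a)

# Placeholder labels only; these are not the assignments from the talk.
genome_a_shared = ["function_1", "function_2", "function_3"]
genome_b_shared = ["function_2", "function_3", "function_4", "function_5"]
only_a, common, only_b = partition_shared_functions(genome_a_shared, genome_b_shared)
print(only_a, common, only_b)
```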
Conclusions
• Different layers of LGT:
  • Very recent: mostly mobile elements (proxies unsuitable)
  • Less recent (outside species / genus) (proxies potentially more justifiable)
• Taxonomy is a pain
• What’s the story with gene metacommunities?
  • Lots of LGT!
  • Recurrent patterns of sharing among groups not evident
  • Metacommunities at the pan-genome level?
  • Need many isolate genomes from single samples
Technical impacts of LGT and gene metacommunities
Metagenomic read assignment
• Recently acquired genes will still look like they belong in the donor
• These are some of the most interesting genes!!
Functional prediction (e.g., PICRUSt)
• Phylogeny will fail to accurately predict the distribution of these genes. Be very careful with extreme or poorly characterized samples!
Phylogenetic beta diversity may be misleading
Key questions in LGT and gene metacommunities
• Are gene-sharing networks:
  • Random?
  • Driven by shared location / habitat?
  • Constrained by phylogenetic relatedness?
• Are shared genes:
  • Neutral or adaptive?
  • Driven by specific types of mobile element?
Fin