![Page 1: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/1.jpg)
Robert Beiko
When trees can’t agree
![Page 2: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/2.jpg)
2
The human microbiome: an ecosystem unlike any other
![Page 3: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/3.jpg)
Human gut microbiome: 2-3 million genes
Typically > 160 “species” at any given time
Human: ~25,000 genes
Qin et al., Nature (2010)
![Page 4: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/4.jpg)
4
Microbial communities
http://upload.wikimedia.org/wikipedia/commons/2/2d/Bacteria_%28251_31%29_Airborne_microbes.jpg
![Page 5: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/5.jpg)
5
Photo courtesy of Emma Allen-Vercoe, University of Guelph
Lachnospiraceae bacterium 3-1-57 CT1
“Lachnozilla”
![Page 6: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/6.jpg)
6
Meehan and Beiko (2014) Genome Biol Evol
Lachno
Lachnospiraceae – commonly thought of as “Good bacteria”
![Page 7: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/7.jpg)
7
[Figure: Sizes of Assembly and Draft Genomes of Class Clostridia. Axis: Number of Protein-Coding Genes (0 to 8000). Label: “Zilla”]
![Page 8: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/8.jpg)
?
![Page 9: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/9.jpg)
9
50
33
4
?
![Page 10: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/10.jpg)
10
W. Ford Doolittle, Sci Am (1999)
![Page 11: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/11.jpg)
11
PNAS, 2012
“…pathogen-driven inflammatory responses in the gut can generate transient enterobacterial blooms in which conjugative transfer occurs at unprecedented rates.”
PLoS Biol, 2007
“…lateral gene transfer, mobile elements, and gene amplification have played important roles in affecting the ability of gut-dwelling Bacteroidetes to vary their cell surface, sense their environment, and harvest nutrient resources present in the distal intestine.”
Gene transfer matters
![Page 12: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/12.jpg)
12
The genomics toolkit: Gene profiles
Gene 1 Gene 2 Gene 3 Gene 4 Gene 5
…
![Page 13: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/13.jpg)
13
The genomics toolkit: “Species” trees
![Page 14: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/14.jpg)
14
The genomics toolkit: Gene trees
Do this for ALL genes
![Page 15: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/15.jpg)
15
Representing and understanding microbial relationships
1. Matrix-based approaches
2. Phylogenetic reconciliation
3. Gene distributions and “microbial identity”
![Page 16: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/16.jpg)
1. The tyranny of distance
![Page 17: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/17.jpg)
17
From profile to distance matrix
Gene 1 Gene 2 Gene 3 Gene 4 Gene n
A
B
C
D
E
F
Per-gene similarity scores for A vs. B: S1 = 0.91, 0.82, 0.72, 0.89
$$d_{A,B} = 1.0 - \frac{1}{n}\sum_{g=1}^{n} S_g$$
|   | A | B | C |
|---|---|---|---|
| A | 0 | 0.165 | 0.252 |
| B | 0.165 | 0 | 0.297 |
| C | 0.252 | 0.297 | 0 |
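The step from a similarity profile to a distance entry can be checked by hand. The sketch below assumes that the four scores on the slide (0.91, 0.82, 0.72, 0.89) are the per-gene similarities for the pair (A, B); the function name `profile_distance` is made up for illustration.

```python
def profile_distance(similarities):
    """Return 1.0 minus the mean of the per-gene similarity scores."""
    if not similarities:
        raise ValueError("need at least one per-gene similarity score")
    return 1.0 - sum(similarities) / len(similarities)


# Assumed to be the A-vs-B scores shown on the slide.
s_ab = [0.91, 0.82, 0.72, 0.89]
d_ab = profile_distance(s_ab)
print(round(d_ab, 3))  # matches the 0.165 entry for (A, B) in the matrix
```

Averaging the four scores gives 0.835, so the distance is 0.165, which agrees with the (A, B) entry of the distance matrix.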
![Page 18: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/18.jpg)
18
Neighbor-joining
Start with a ‘star’ tree
At each iteration, split off the pair of taxa that minimizes the total sum of branch lengths in the tree
Choose groups x and y to minimize the Q-criterion, which combines the distance matrix entry for (x, y) with the weighted distance from x and y to all leaves
![Page 19: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/19.jpg)
19
Continue until binary tree is obtained
Saitou and Nei (1987)
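A minimal sketch of the pair-selection step, assuming the commonly used form of the criterion, Q(x, y) = (n - 2) d(x, y) - Σ_k d(x, k) - Σ_k d(y, k). The slide's own formula is not recoverable here, so this form is an assumption, and the five-taxon matrix `toy` is illustrative data rather than anything from the talk.

```python
def q_matrix(d):
    """Compute Q(i, j) = (n - 2) * d[i][j] - rowsum(i) - rowsum(j) for i != j."""
    n = len(d)
    row_sums = [sum(row) for row in d]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
    return q


def best_pair(d):
    """Return the index pair (i, j), i < j, with the smallest Q value."""
    q = q_matrix(d)
    n = len(d)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda p: q[p[0]][p[1]])


# Illustrative symmetric distance matrix for five taxa (not from the slides).
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair = best_pair(toy)
print(pair, q_matrix(toy)[pair[0]][pair[1]])
```

Note that taxa 3 and 4 have the smallest raw distance (3), yet the pair (0, 1) has the lowest Q value. This is the point of the criterion: it corrects each pairwise distance for how far both taxa are from everything else.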
![Page 20: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/20.jpg)
20
Neighbor-net: Building a splits graph
Bryant and Moulton, Mol Biol Evol (2003)
![Page 21: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/21.jpg)
21
Neighbor-net is guaranteed to produce a circular set of splits
A circular set of splits can always be drawn as a planar graph
![Page 22: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/22.jpg)
22
Neighbor-net of 298 microbial genomes
Beiko, Biol Direct (2011)
![Page 23: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/23.jpg)
23
Limitations of neighbor-net
• Neighbor-net still imposes a constraint on the relationships among genomes: “long-distance” connections cannot be shown
?
![Page 24: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/24.jpg)
24
Explicit connections between genomes
• Make each genome a vertex in a graph G = (V, E)
V = {A, B, C, D, E, F, …}
E = {{A, B}, …}
For some threshold t: {A, B} ∈ E iff d_{A,B} ≤ t, or if some other condition is satisfied
[Diagram: vertices A and B joined by an edge with weight w_{A,B}]
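The threshold rule can be sketched as follows, reusing the three-genome distance matrix from the earlier slide. The edge weight w = 1 - d and the threshold value t = 0.26 are assumptions chosen for illustration; the slide does not define either.

```python
from itertools import combinations


def threshold_graph(names, dist, t):
    """Return {frozenset({u, v}): weight} for every pair with distance <= t."""
    edges = {}
    for i, j in combinations(range(len(names)), 2):
        if dist[i][j] <= t:
            # Assumed weighting: similarity, i.e. 1 minus the distance.
            edges[frozenset((names[i], names[j]))] = 1.0 - dist[i][j]
    return edges


names = ["A", "B", "C"]
dist = [
    [0.0, 0.165, 0.252],
    [0.165, 0.0, 0.297],
    [0.252, 0.297, 0.0],
]
edges = threshold_graph(names, dist, t=0.26)
for pair, weight in edges.items():
    print(sorted(pair), round(weight, 3))
```

With this threshold, A is linked to both B and C, while B and C (distance 0.297) stay unconnected. A tree would have to place all three in one hierarchy instead.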
![Page 25: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/25.jpg)
25
Linear programming
• Weighting networks based on straight genome-genome similarity highlights close relatives, redundancy
• LP introduces weighting scheme that constrains connections and promotes distinct relationships
![Page 26: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/26.jpg)
26
P. aeruginosa, P. fluorescens, P. lePewtida, P. syringae, P. entomophila, P. stutzeri, P. mendocina
Holloway and Beiko, BMC Evol Biol (2010)
“Plume”
![Page 27: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/27.jpg)
27
Some like it hot
Pyrococcus furiosus optimal growth temperature: 100°C
![Page 28: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/28.jpg)
28
Kunin et al. (2005) Genome Res
Networks
![Page 29: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/29.jpg)
29
Networks!!!!
Dagan et al. (2008) PNAS
![Page 30: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/30.jpg)
2. Inferring and comparing trees
![Page 31: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/31.jpg)
31
Phylogenetic tree reconciliation
Species tree S
Gene tree G
Lateral gene transfer
Subtree prune and regraft
Whidden et al., Syst Biol (2014)
![Page 32: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/32.jpg)
32
For two rooted trees, dSPR is equal to the number of components in a maximum agreement forest (MAF), minus 1
So building a MAF is equivalent to inferring the minimum number of SPR events needed to reconcile a species tree with a gene tree
Problem is NP-hard
dSPR = 1
MAF components = 2
Bordewich and Semple, Ann Combinatorics (2005)
![Page 33: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/33.jpg)
33
T1 T2
Case 1 (separate components)
Case 2 (one pendant node)
Case 3 (several pendant nodes)
Chris’s algorithm
![Page 34: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/34.jpg)
34
Fixed-parameter tractability
• Problem is dominated by Case 3 (3 alternatives)
• Cut all candidate edges at each step = linear 3-approximation
• Decision problem: to decide if SPR distance ≤ k
• Running time is exponential in the SPR distance, NOT the number of leaves, therefore the problem is fixed-parameter tractable (FPT)
Chris Whidden + Norbert Zeh
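The bullets above can be turned into a rough running-time bound. This is a simplified sketch, assuming that each of the three Case 3 alternatives cuts one edge and lowers the remaining budget k by one, and that the work per search node is linear in the number of leaves n. The published algorithm may achieve a tighter bound.

```latex
% Bounded search tree for the decision problem "is d_SPR <= k?"
% T(k): worst-case time with k cuts remaining, on trees with n leaves.
\begin{align*}
T(0) &= O(n), \\
T(k) &\le 3\,T(k-1) + O(n) \\
     &\le c\,n \sum_{i=0}^{k} 3^{i}
      = c\,n\,\frac{3^{k+1}-1}{2}
      = O\!\left(3^{k} n\right).
\end{align*}
% Approximation: cutting all three candidate edges at once uses at most
% 3 cuts where an optimal solution needs at least 1, so the number of
% cuts made is at most 3 * d_SPR.
```

The exponential factor depends only on k, so trees with many leaves stay tractable as long as the number of transfer events separating them is small.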
![Page 35: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/35.jpg)
35
In practice
![Page 36: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/36.jpg)
36
SPR Supertrees
Supertree: a tree that satisfies some optimality criterion with respect to a set of input trees
SPR supertree: given a set of gene trees, find a tree that minimizes the total number of SPR operations vs. all gene trees
Building an SPR supertree: assemble an initial tree, then repeatedly propose SPR operations and evaluate each candidate tree's total SPR distance from the input trees
Whidden et al., 2014
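The search loop described above has the shape of a greedy local search. The skeleton below is a generic sketch, not the published method: `neighbors` and `distance` are hypothetical callables that would be replaced by SPR-neighbor generation and SPR distance on real trees. They are demonstrated here with integer stand-ins only so that the loop can be run.

```python
def local_search(start, inputs, neighbors, distance):
    """Greedy descent: move to the best neighbor while the total cost drops."""

    def total(candidate):
        # Sum of distances from the candidate to every input object.
        return sum(distance(candidate, item) for item in inputs)

    current, cost = start, total(start)
    while True:
        best = min(neighbors(current), key=total, default=None)
        if best is None or total(best) >= cost:
            return current, cost  # local optimum reached
        current, cost = best, total(best)


# Stand-ins (NOT trees): integers play the role of candidate supertrees,
# +/-1 steps play the role of SPR moves, |a - b| the role of SPR distance.
result, score = local_search(
    start=0,
    inputs=[3, 5, 9],
    neighbors=lambda x: [x - 1, x + 1],
    distance=lambda a, b: abs(a - b),
)
print(result, score)
```

In the stand-in problem the loop stops at the median of the inputs. With trees, a greedy search of this kind can stop at a local optimum, so the choice of initial tree matters.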
![Page 37: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/37.jpg)
37
Why SPR supertrees?
1. Explicit representation of LGT events
2. Branches broken in MAF → implied LGT events. Can build graph of connections
![Page 38: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/38.jpg)
244 bacterial genomes
40,631 gene trees
= Bacterial SPR supertree
LGT patterns for Clostridium
Whidden et al., 2014
![Page 39: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/39.jpg)
(taming in progress) http://en.wikipedia.org/wiki/File:Godzilla_%2754_design.jpg
3. Taming Lachnozilla
![Page 40: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/40.jpg)
What makes LachnoZilla LachnoZilla?
![Page 41: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/41.jpg)
41
C. difficile….
“Virulence-associated protein”
Mobile DNA
Phylogenetic profile based on extremely good matches to other genomes (> 95% ID, > 95% coverage)
= “recent” LGT events
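The filter behind this profile can be sketched as below. The `Hit` record and its field names are hypothetical, as are the example hits; only the two cutoffs (more than 95% identity and more than 95% coverage) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str        # query gene in the genome of interest
    genome: str      # genome containing the match
    identity: float  # percent identity of the match
    coverage: float  # percent of the query covered by the match


def recent_lgt_profile(hits, min_identity=95.0, min_coverage=95.0):
    """Map each gene to the set of genomes with a near-identical match."""
    profile = {}
    for h in hits:
        if h.identity > min_identity and h.coverage > min_coverage:
            profile.setdefault(h.gene, set()).add(h.genome)
    return profile


# Made-up hits for illustration only.
hits = [
    Hit("gene_a", "Genome 1", 99.2, 100.0),
    Hit("gene_a", "Genome 2", 96.0, 97.5),
    Hit("gene_a", "Genome 3", 80.1, 99.0),  # fails the identity cutoff
    Hit("gene_b", "Genome 1", 98.0, 60.0),  # fails the coverage cutoff
]
profile = recent_lgt_profile(hits)
print(profile)
```

Genes with no hit passing both cutoffs are simply absent from the profile, which keeps the resulting gene-by-genome matrix sparse.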
![Page 42: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/42.jpg)
42
LZ & friends
279 genomes
Conserved marker-gene tree
Ben Wright
![Page 43: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/43.jpg)
43
LachnoZilla (and friends) genome graph
!
![Page 44: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/44.jpg)
44
Close relative (expected)
![Page 45: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/45.jpg)
45
Distant relative (not so expected) (big genome though!)
![Page 46: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/46.jpg)
46
Selective sharing
![Page 47: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/47.jpg)
Gene-centric graphs
[Table: presence (×) of Gene 1 to Gene 4 across LZ and Genome 1 to Genome 6]
Gene 1: × ×
Gene 2: ×
Gene 3: × ×
Gene 4: × × ×
Edge weights are proportional to similarity of distribution
Use graph clustering to divide up completely connected, weighted graph
[Graph nodes: Gene 1, Gene 2, Gene 3, Gene 4]
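A gene-centric graph of this kind can be sketched as follows. Several choices here are assumptions: Jaccard similarity stands in for "similarity of distribution", a threshold plus connected components stands in for whatever graph clustering method was actually used, and the genome assignments in `profiles` are invented because the slide's column positions are not recoverable.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared genomes divided by all genomes carrying either gene."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_graph(profiles):
    """Complete weighted graph: one edge per gene pair."""
    return {(g, h): jaccard(profiles[g], profiles[h])
            for g, h in combinations(sorted(profiles), 2)}


def threshold_clusters(profiles, weights, cutoff):
    """Stand-in clustering: components of the graph with edges >= cutoff."""
    parent = {g: g for g in profiles}

    def find(g):
        while parent[g] != g:
            g = parent[g]
        return g

    for (g, h), w in weights.items():
        if w >= cutoff:
            parent[find(g)] = find(h)
    clusters = {}
    for g in profiles:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(clusters.values(), key=sorted)


# Hypothetical distributions; only the number of marks per gene follows the slide.
profiles = {
    "Gene 1": {"Genome 1", "Genome 2"},
    "Gene 2": {"Genome 5"},
    "Gene 3": {"Genome 1", "Genome 3"},
    "Gene 4": {"Genome 1", "Genome 2", "Genome 3"},
}
weights = gene_graph(profiles)
clusters = threshold_clusters(profiles, weights, cutoff=0.5)
print(clusters)
```

Genes that travel together across the same genomes end up in one cluster, while a gene with a unique distribution (Gene 2 here) is left on its own.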
![Page 48: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/48.jpg)
Legionaminic acid
Acetylneuraminic acid (pathogen associated)
Bacteroides pectinophilus
Butyrivibrio proteoclasticus
Eubacterium plexicaudatum
Roseburia
Neighbors
Weirdly named isolates
Lachnozilla in graph form (it all makes sense now)
![Page 49: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/49.jpg)
Mystery isolate #1 (made-up example)
![Page 50: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/50.jpg)
Mystery isolate #2 (made-up example)
![Page 51: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/51.jpg)
Questions
Representations
Clear inference
From pattern to understanding
![Page 52: Beiko cms final](https://reader036.vdocuments.us/reader036/viewer/2022062514/55b3e027bb61ebf1218b45c4/html5/thumbnails/52.jpg)
52
FIN