week 9: coalescents, part 2 · week 9: coalescents, part 2 – p.22/74. the coalescent to simulate...
TRANSCRIPT
![Page 1: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/1.jpg)
Week 9: Coalescents, part 2
Genome 562
March, 2013
Week 9: Coalescents, part 2 – p.1/74
![Page 2: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/2.jpg)
Gene copies in a population of 10 individuals
Time
A random−mating population
Week 9: Coalescents, part 2 – p.2/74
![Page 3: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/3.jpg)
Going back one generation
Time
A random−mating population
Week 9: Coalescents, part 2 – p.3/74
![Page 4: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/4.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.4/74
![Page 5: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/5.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.5/74
![Page 6: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/6.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.6/74
![Page 7: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/7.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.7/74
![Page 8: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/8.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.8/74
![Page 9: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/9.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.9/74
![Page 10: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/10.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.10/74
![Page 11: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/11.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.11/74
![Page 12: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/12.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.12/74
![Page 13: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/13.jpg)
... and one more
Time
A random−mating population
Week 9: Coalescents, part 2 – p.13/74
![Page 14: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/14.jpg)
The genealogy of gene copies is a tree
Time
Genealogy of gene copies, after reordering the copies
Week 9: Coalescents, part 2 – p.14/74
![Page 15: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/15.jpg)
Ancestry of a sample of 3 copies
Time
Genealogy of a small sample of genes from the population
Week 9: Coalescents, part 2 – p.15/74
![Page 16: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/16.jpg)
Here is that tree of 3 copies in the pedigree
Time
Week 9: Coalescents, part 2 – p.16/74
![Page 17: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/17.jpg)
Kingman’s coalescent
Random collision of lineages as go back in time (sans recombination)
Collision is faster the smaller the effective population size
u9
u7
u5
u3
u8
u6
u4
u2
Average time for n
Average time for
copies to coalesce to
4N
k(k−1) k−1 =
In a diploid population of
effective population size N,
copies to coalesce
= 4N (1 − 1n ( generations
k
Average time for
two copies to coalesce
= 2N generations
What’s misleading about this diagram: the lineages that coalesce arerandom pairs, not necessarily ones that are next to each other in a linearorder.
Week 9: Coalescents, part 2 – p.17/74
![Page 18: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/18.jpg)
The Wright-Fisher model
This is the canonical model of genetic drift in populations. It was inventedin 1930 and 1932 by Sewall Wright and R. A. Fisher.
In this model the next generation is produced by doing this:
Choose two individuals with replacement (including the possibility thatthey are the same individual) to be parents,
Each produces one gamete, these become a diploid individual,
Repeat these steps until N diploid individuals have been produced.
The effect of this is to have each locus in an individual in the nextgeneration consist of two genes sampled from the parents’ generation atrandom, with replacement.
Week 9: Coalescents, part 2 – p.18/74
![Page 19: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/19.jpg)
Sir John Kingman
J. F. C. Kingman in about 1983
Currently Emeritus Professor of Mathematics at Cambridge University,U.K., and former head of the Isaac Newton Institute of MathematicalSciences.
Week 9: Coalescents, part 2 – p.19/74
![Page 20: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/20.jpg)
The coalescent – a derivation
The probability that k lineages becomes k − 1 one generation earlierturns out to be (as each lineage “chooses” its ancestor independently):
k(k − 1)/2 × Prob (First two have same parent, rest are different)
(since there are(k2
)= k(k − 1)/2 different pairs of copies)
We add up terms, all the same, for the k(k − 1)/2 pairs that couldcoalesce; the sum is:
k(k − 1)/2 × 1 × 12N
×(1 − 1
2N
)
×(1 − 2
2N
)× · · · ×
(1 − k−2
2N
)
so that the total probability that a pair coalesces is
= k(k − 1)/4N + O(1/N2)
Week 9: Coalescents, part 2 – p.20/74
![Page 21: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/21.jpg)
Can probabilities of two or more lineages coalescing
Note that the total probability that some combination of lineagescoalesces is
1 − Prob (Probability all genes have separate ancestors)
= 1 −
[
1 ×
(
1 −1
2N
) (
1 −2
2N
)
. . .
(
1 −k − 1
2N
)]
= 1 −
[
1 −1 + 2 + 3 + · · · + (k − 1)
2N+ O(1/N2)
]
and since1 + 2 + 3 + . . . + (n − 1) = n(n − 1)/2
the quantity
= 1 −[
1 − k(k − 1)/4N + O(1/N2)]≃ k(k − 1)/4N + O(1/N2)
Week 9: Coalescents, part 2 – p.21/74
![Page 22: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/22.jpg)
Can calculate how many coalescences are of pairs
This shows, since the terms of order 1/N are the same, that the eventsinvolving 3 or more lineages simultaneously coalescing are in the terms oforder 1/N2 and thus become unimportant if N is large.
Here are the probabilities of 0, 1, or more coalescences with 10 lineagesin populations of different sizes:
N 0 1 > 1
100 0.79560747 0.18744678 0.016945751000 0.97771632 0.02209806 0.00018562
10000 0.99775217 0.00224595 0.00000187
Note that increasing the population size by a factor of 10 reduces thecoalescent rate for pairs by about 10-fold, but reduces the rate for triples(or more) by about 100-fold.
Week 9: Coalescents, part 2 – p.22/74
![Page 23: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/23.jpg)
The coalescentTo simulate a random genealogy, do the following:
1. Start with k lineages
2. Draw an exponential time interval with mean 4N/(k(k − 1))generations.
3. Combine two randomly chosen lineages.
4. Decrease k by 1.
5. If k = 1, then stop
6. Otherwise go back to step 2.
Week 9: Coalescents, part 2 – p.23/74
![Page 24: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/24.jpg)
An accurate analogy: Bugs In A Box
There is a box ...
Week 9: Coalescents, part 2 – p.24/74
![Page 25: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/25.jpg)
An accurate analogy: Bugs In A Box
with bugs that are ...
Week 9: Coalescents, part 2 – p.25/74
![Page 26: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/26.jpg)
An accurate analogy: Bugs In A Box
hyperactive, ...
Week 9: Coalescents, part 2 – p.26/74
![Page 27: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/27.jpg)
An accurate analogy: Bugs In A Box
indiscriminate, ...
Week 9: Coalescents, part 2 – p.27/74
![Page 28: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/28.jpg)
An accurate analogy: Bugs In A Box
voracious ...
Week 9: Coalescents, part 2 – p.28/74
![Page 29: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/29.jpg)
An accurate analogy: Bugs In A Box
(eats other bug) ...
Gulp!
Week 9: Coalescents, part 2 – p.29/74
![Page 30: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/30.jpg)
An accurate analogy: Bugs In A Box
and insatiable.
Week 9: Coalescents, part 2 – p.30/74
![Page 31: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/31.jpg)
Random coalescent trees with 16 lineages
O C S M L P K E J I T R H Q F B N D G A M J B F G C E R A S Q K N L H T I P D O B G T M L Q D O F K P E A I J S C H R N F R N L M D H B T C Q S O G P I A K J E
I Q C A J L S G P F O D H B M E T R K N R C L D K H O Q F M B G S I T P A J E N N M P R H L E S O F B G J D C I T K Q A N H M C R P G L T E D S O I K J Q F A B
Week 9: Coalescents, part 2 – p.31/74
![Page 32: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/32.jpg)
Coalescence is faster in small populations
Change of population size and coalescents
Ne
time
the changes in population size will produce waves of coalescence
time
Coalescence events
time
the tree
The parameters of the growth curve for Ne can be inferred bylikelihood methods as they affect the prior probabilities of those treesthat fit the data.
Week 9: Coalescents, part 2 – p.32/74
![Page 33: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/33.jpg)
Migration can be taken into account
Time
population #1 population #2Week 9: Coalescents, part 2 – p.33/74
![Page 34: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/34.jpg)
Recombination creates loops
Recomb.
Different markers have slightly different coalescent trees
Week 9: Coalescents, part 2 – p.34/74
![Page 35: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/35.jpg)
If we have a sample of 50 copies
50−gene sample in a coalescent tree
Week 9: Coalescents, part 2 – p.35/74
![Page 36: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/36.jpg)
The first 10 account for most of the branch length
10 genes sampled randomly out of a
50−gene sample in a coalescent tree
Week 9: Coalescents, part 2 – p.36/74
![Page 37: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/37.jpg)
... and when we add the other 40 they add less length
10 genes sampled randomly out of a
50−gene sample in a coalescent tree
(purple lines are the 10−gene tree)Week 9: Coalescents, part 2 – p.37/74
![Page 38: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/38.jpg)
Cann, Stoneking, and Wilson
Becky Cann Mark Stoneking the late Allan Wilson
Cann, R. L., M. Stoneking, and A. C. Wilson. 1987. Mitochondrial DNAand human evolution. Nature 325:a 31-36.
Week 9: Coalescents, part 2 – p.38/74
![Page 39: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/39.jpg)
Mitochondrial Eve
Week 9: Coalescents, part 2 – p.39/74
![Page 40: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/40.jpg)
We want to be able to analyze human evolution
Africa
Europe Asia
"Out of Africa" hypothesis
(vertical scale is not time or evolutionary change)
Week 9: Coalescents, part 2 – p.40/74
![Page 41: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/41.jpg)
coalescent and “gene trees” versus species trees
Consistency of gene tree with species tree
Week 9: Coalescents, part 2 – p.41/74
![Page 42: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/42.jpg)
coalescent and “gene trees” versus species trees
Consistency of gene tree with species tree
Week 9: Coalescents, part 2 – p.42/74
![Page 43: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/43.jpg)
coalescent and “gene trees” versus species trees
Consistency of gene tree with species tree
Week 9: Coalescents, part 2 – p.43/74
![Page 44: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/44.jpg)
coalescent and “gene trees” versus species trees
Consistency of gene tree with species tree
Week 9: Coalescents, part 2 – p.44/74
![Page 45: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/45.jpg)
coalescent and “gene trees” versus species trees
Consistency of gene tree with species tree
Week 9: Coalescents, part 2 – p.45/74
![Page 46: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/46.jpg)
coalescent and “gene trees” versus species trees
Consistency of gene tree with species tree
coalescence time
Week 9: Coalescents, part 2 – p.46/74
![Page 47: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/47.jpg)
If the branch is more than Ne generations long ...
t1
t2
N1
N2
N4
N3
N5
Gene tree and Species tree
Week 9: Coalescents, part 2 – p.47/74
![Page 48: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/48.jpg)
If the branch is more than Ne generations long ...
t1
t2
N1
N2
N4
N3
N5
Gene tree and Species tree
Week 9: Coalescents, part 2 – p.48/74
![Page 49: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/49.jpg)
If the branch is more than Ne generations long ...
t1
t2
N1
N2
N4
N3
N5
Gene tree and Species tree
Week 9: Coalescents, part 2 – p.49/74
![Page 50: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/50.jpg)
How do we compute a likelihood for a population sample?
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCCCAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTTGGCGTCC
CAGTTTTGGCGTCCCAGTTTTGGCGTCC
CAGTTTTGGCGTCC
CAGTTTTGGCGTCC
CAGTTTCAGCGTAC
CAGTTTCAGCGTAC
CAGTTTCAGCGTAC
, CAGTTTCAGCGTCC CAGTTTCAGCGTCC ), ... L = Prob ( = ??
Week 9: Coalescents, part 2 – p.50/74
![Page 51: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/51.jpg)
If we have a tree for the sample sequences, we can
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTTAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTCAGCGTCC
CAGTTTTGGCGTCCCAGTTTTGGCGTCC
CAGTTTTGGCGTCC
CAGTTTTGGCGTCC
CAGTTTCAGCGTACCAGTTTCAGCGTAC
CAGTTTCAGCGTAC
CAGTTTCAGCGTCC
, CAGTTTCAGCGTCC CAGTTTCAGCGTCCProb( | Genealogy)
so we can compute
but how to computer the overall likelihood from this?
, ...
CAGTTTCAGCGTCC
CAGTTTTAGCGTCCCAGTTTTAGCGTCC
CAGTTTCAGCGTCCCAGTTTTGGCGTCC
CAGTTTCAGCGTCC
Week 9: Coalescents, part 2 – p.51/74
![Page 52: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/52.jpg)
The basic equation for coalescent likelihoods
In the case of a single population with parametersNe effective population sizeµ mutation rate per site
and assuming G′ stands for a coalescent genealogy and D for the
sequences,
L = Prob (D | Ne, µ)
=∑
G′
Prob (G′ | Ne) Prob (D | G′, µ)
︸ ︷︷ ︸ ︸ ︷︷ ︸
Kingman′s prior likelihood of tree
Week 9: Coalescents, part 2 – p.52/74
![Page 53: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/53.jpg)
Rescaling the branch lengths
Rescaling branch lengths of G′ so that branches are given in expectedmutations per site, G = µG′ , we get (if we let Θ = 4Neµ )
L =∑
G
Prob (G | Θ) Prob (D | G)
as the fundamental equation. For more complex population scenarios onesimply replaces Θ with a vector of parameters.
Week 9: Coalescents, part 2 – p.53/74
![Page 54: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/54.jpg)
The variability comes from two sources
Ne
Necan reduce variability by looking at
(i) more gene copies, or
(2) Randomness of coalescence of lineages
affected by the
can reduce variance of
branch by examining more sites
number of mutations per site per
mutation rate(1) Randomness of mutation
affected by effective population size
coalescence times allow estimation of
µ
(ii) more loci
Week 9: Coalescents, part 2 – p.54/74
![Page 55: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/55.jpg)
Computing the likelihood: averaging over coalescents
t
t
Like
lihoo
d of
t
Like
lihoo
d of
The product of the prior on t,
times the likelihood of that t from the data,
when integrated over all possible t’s, gives the
likelihood for the underlying parameter
The likelihood calculation in a sample of two gene copies
t
1Θ
Θ
Prio
r P
rob
of t
Θ1
Θ
Θ
Week 9: Coalescents, part 2 – p.55/74
![Page 56: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/56.jpg)
Computing the likelihood: averaging over coalescents
t
t
Like
lihoo
d of
t
Like
lihoo
d of
The product of the prior on t,
times the likelihood of that t from the data,
when integrated over all possible t’s, gives the
likelihood for the underlying parameter
The likelihood calculation in a sample of two gene copies
t
2ΘΘ
Prio
r P
rob
of t
2Θ
Θ
Θ
Week 9: Coalescents, part 2 – p.56/74
![Page 57: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/57.jpg)
Computing the likelihood: averaging over coalescents
t
t
Like
lihoo
d of
t
Like
lihoo
d of
The product of the prior on t,
times the likelihood of that t from the data,
when integrated over all possible t’s, gives the
likelihood for the underlying parameter
The likelihood calculation in a sample of two gene copies
t
3Θ
Θ
Prio
r P
rob
of t
3Θ
Θ
Θ
Week 9: Coalescents, part 2 – p.57/74
![Page 58: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/58.jpg)
Computing the likelihood: averaging over coalescents
t
t
Like
lihoo
d of
t
Like
lihoo
d of
The product of the prior on t,
times the likelihood of that t from the data,
when integrated over all possible t’s, gives the
likelihood for the underlying parameter
The likelihood calculation in a sample of two gene copies
t
1Θ2
Θ
3Θ
Θ
Prio
r P
rob
of t
2Θ
3Θ
Θ1
Θ
Θ
Week 9: Coalescents, part 2 – p.58/74
![Page 59: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/59.jpg)
Labelled historiesLabelled Histories (Edwards, 1970; Harding, 1971)
Trees that differ in the time−ordering of their nodes
A B C D
A B C D
These two are the same:
A B C D
A B C D
These two are different:
Week 9: Coalescents, part 2 – p.59/74
![Page 60: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/60.jpg)
Sampling approaches to coalescent likelihood
Bob Griffiths Simon Tavaré Mary Kuhner and Jon Yamato
Week 9: Coalescents, part 2 – p.60/74
![Page 61: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/61.jpg)
Monte Carlo integration
To get the area under a curve, we can either evaluate the function (f(x)) ata series of grid points and add up heights × widths:
or we can sample at random the same number of points, add up height ×width:
Week 9: Coalescents, part 2 – p.61/74
![Page 62: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/62.jpg)
Importance sampling
Week 9: Coalescents, part 2 – p.62/74
![Page 63: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/63.jpg)
Importance sampling
The function we integrate
We sample from this density
f(x)
g(x)
Week 9: Coalescents, part 2 – p.63/74
![Page 64: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/64.jpg)
The math of importance sampling
∫f(x) dx =
∫ f(x)g(x) g(x) dx
= Eg
[f(x)g(x)
]
which is the expectation for points sampled from g(x) of the ratio f(x)g(x) .
This is approximated by sampling a lot (n) of points from g(x) and thecomputing the average:
L =1
n
n∑
i=1
f(xi)
g(xi)
Week 9: Coalescents, part 2 – p.64/74
![Page 65: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/65.jpg)
The importance function used in LAMARC
In Mary Kuhner and Jon Yamato’s program LAMARC they use as theimportance function the probability density of the tree given the data at aset of “driving values” θ0 of the parameters:
f(G) =Prob (D |G) Prob (G | θ0)
Prob (D | θ0)
The denominator is impossible to evaluate but as we will see, isn’t reallyneeded.
The resulting likelihood ratio is
L(Θ)
L(Θ0)=
1
n
n∑
i=1
Prob (Gi|Θ)
Prob (Gi|Θ0)
Week 9: Coalescents, part 2 – p.65/74
![Page 66: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/66.jpg)
Markov Chain Monte Carlo (MCMC) methods
To do the importance sampling, MCMC methods are employed (in allprograms that do full likelihood or Bayesian analyses).
To sample from f(G), start with a tree Gold and
1. Have a “proposal distribution” from which you sample a new treeGnew
2. Compute the function f(Gnew) (we have that also for the old tree)
3. Draw a random fraction R between 0 and 1
4. If R < f(Gnew)f(Gold)
, accept the new tree. (Note that in that ratio any
horrible, but shared, denominators cancel out).
repeat this vast numbers of times (the correct number of times is infinity).
Week 9: Coalescents, part 2 – p.66/74
![Page 67: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/67.jpg)
Rearrangement to sample points in tree space
A conditional coalescent rearrangement strategy
Week 9: Coalescents, part 2 – p.67/74
![Page 68: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/68.jpg)
Dissolving a branch and regrowing it backwards
First pick a random node (interior or tip) and remove its subtree
Week 9: Coalescents, part 2 – p.68/74
![Page 69: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/69.jpg)
We allow it coalesce with the other branches
Then allow this node to re−coalesce with the tree
Week 9: Coalescents, part 2 – p.69/74
![Page 70: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/70.jpg)
and this gives another coalescent
The resulting tree proposed by this process
Week 9: Coalescents, part 2 – p.70/74
![Page 71: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/71.jpg)
An example of an MCMC likelihood curve
0
−10
−20
−30
−40
−50
−60
−70
−80
0.001 0.002 0.005 0.01 0.02 0.05 0.1
Θ
ln L
0.00650776
Results of analysing a data set with 50 sequences of 500 baseswhich was simulated with a true value of Θ = 0.01
Week 9: Coalescents, part 2 – p.71/74
![Page 72: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/72.jpg)
Major MCMC likelihood or Bayesian programs
LAMARC by Mary Kuhner and Jon Yamato and others. Likelihoodinference with multiple populations, recombination, migration,population growth. No historical branching events or serialsampling, yet.
BEAST by Andrew Rambaut, Alexei Drummond and others.Bayesian inference with multiple populations related by a tree.Support for serial sampling (no migration or recombination yet).
genetree by Bob Griffiths and Melanie Bahlo. Likelihood inference ofmigration rates and changes in population size. No recombination orhistorical branching events.
migrate by Peter Beerli. Likelihood inference with multiplepopulations and migration rates. No recombination or historicalbranching events yet.
IM and IMa by Rasmus Nielsen and Jody Hey. Two or morepopulations allowing both historical splitting and migration after that.No recombination yet.
Week 9: Coalescents, part 2 – p.72/74
![Page 73: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/73.jpg)
Approximately Bayesian Computation (ABC) methods
These involve approximating the sampling by computing some “summarystatistics” from the data, then finding parameter values that, in a simulationof a tree and data, result in summary statistic values close to these.
They are faster, and very popular now.
But ... they are very dependent on getting the right summary statistics soas not to lose too much power compared to fully-powerful likelihood orBayesian MCMC methods.
Week 9: Coalescents, part 2 – p.73/74
![Page 74: Week 9: Coalescents, part 2 · Week 9: Coalescents, part 2 – p.22/74. The coalescent To simulate a random genealogy, do the following: 1. Start with k lineages 2. Draw an exponential](https://reader035.vdocuments.us/reader035/viewer/2022081522/5fb5e7c17bef1a1be446a20a/html5/thumbnails/74.jpg)
Fixation probabilities from the diffusion approximation
100
10
10.1
−0.1−1
−10
−100
Probability of fixation of an allele with multiplicative fitnesses. Results fromthe diffusion approximation for various values of 4Ns and p are shown.The values of 4Ns are shown next to the nine curves, except for thediagonal, which has 4Ns = 0. Week 9: Coalescents, part 2 – p.74/74