

1

Choosing mutation models in population genetics

Joachim Mergeay

Research Institute for Nature and Forest, BELGIUM

POPGROUP 47, Bath, 10.01.2014

Old school POPGEN

2

Processes affecting genetic diversity

Mutation, µ

Migration, m

Selection, S

Drift, Ne

3

Processes affecting genetic diversity

When all parameters (µ, m, S, Ne) are stable → expected equilibrium genetic diversity, Heq

At this stage H ≈ 4Neµ (for the entire population)

75% equilibrium: how long does it take?

Crow & Aoki (1984, PNAS): time to halfway equilibrium is ln(2) / (2µ + 1/(2N)); reaching 75% takes two such half-times

µ = 10⁻⁹ (SNP): t ≈ 2.8 Ne generations

µ = 10⁻⁴ (µsat): t ≈ 2.0 Ne generations
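The two figures on this slide can be checked with a short calculation. The sketch below assumes that diversity approaches equilibrium exponentially at rate 2µ + 1/(2N), which is what the halfway time quoted from Crow & Aoki (1984) implies. The slide does not state the population size used; N = 1000 is an assumption that approximately reproduces both values. The function name is hypothetical.

```python
import math


def generations_to_fraction(mu, n, fraction=0.75):
    """Generations needed to cover `fraction` of the way to equilibrium.

    Assumes an exponential approach at rate 2*mu + 1/(2*n), so that
    fraction=0.5 gives the halfway time ln(2) / (2*mu + 1/(2*n)).
    """
    rate = 2.0 * mu + 1.0 / (2.0 * n)
    return -math.log(1.0 - fraction) / rate


if __name__ == "__main__":
    n = 1000  # assumed population size, not given on the slide
    for label, mu in (("SNP", 1e-9), ("microsatellite", 1e-4)):
        t = generations_to_fraction(mu, n)
        print(f"{label}: mu={mu:g}, t = {t / n:.2f} N generations")
```

With these assumptions the script gives about 2.77 N generations for the SNP rate and about 1.98 N for the microsatellite rate. The second value depends on N, because 2µ is no longer negligible next to 1/(2N).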

4

Processes affecting genetic diversity

When all parameters (µ, m, S, Ne) are stable → expected equilibrium genetic diversity, Heq

Equilibrium

Elusive

Illusion

5

Processes affecting genetic diversity

When all parameters (µ, m, S, Ne) are stable → expected equilibrium genetic diversity, Heq

Most populations

6

Processes affecting genetic diversity

When all parameters (µ, m, S, Ne) are stable → expected equilibrium genetic diversity, Heq

Deviation from expected pattern Heq? → inference on underlying processes causing the deviation

Different processes can lead to similar patterns

7

Mutation models

Genetic structure: F-statistics & co

FST, GST, RST, D, HST, DST, pST, FST, ST, NST, rST

Demographic changes (expansion – contraction = bottlenecks)

Selection

(Relatedness – phylogeography – Phylogenetics)

8

Types of mutation models

Explicit mutation models

Models that take evolutionary relations among alleles into account

SMM, TPM (only µsats)

DNA sequence evolution models (e.g., GTR+G+I)

e.g. allele AAA is more related to ATA than to GGC

Implicit mutation models

Models that don’t

IAM: oldest model

KAM: SNPs: K ≤ 4 (IAM: K = ∞)
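The difference between the two families shows up in what a single mutation event does to an allele. The sketch below is illustrative only: microsatellite alleles are coded as repeat counts, KAM and IAM alleles as integer labels, and the TPM parameters `p_single` and `geom_p` are hypothetical values, not taken from the slides.

```python
import random


def mutate_smm(repeats, rng):
    """Stepwise mutation model: gain or lose exactly one repeat unit."""
    return repeats + rng.choice((-1, 1))


def mutate_tpm(repeats, rng, p_single=0.9, geom_p=0.5):
    """Two-phase model: mostly single steps, occasionally a larger jump.

    p_single and geom_p are illustrative assumptions.
    """
    if rng.random() < p_single:
        return mutate_smm(repeats, rng)
    size = 2  # multi-step mutations change at least two repeats
    while rng.random() > geom_p:  # geometric tail for the jump size
        size += 1
    return repeats + rng.choice((-1, 1)) * size


def mutate_kam(allele, k, rng):
    """K-allele model: jump to any of the other K-1 states, all equally likely."""
    return rng.choice([a for a in range(k) if a != allele])


def mutate_iam(existing):
    """Infinite allele model: every mutation yields a label never seen before."""
    return max(existing, default=-1) + 1


if __name__ == "__main__":
    rng = random.Random(42)
    print("SMM:", mutate_smm(10, rng))
    print("TPM:", mutate_tpm(10, rng))
    print("KAM (K=4):", mutate_kam(2, 4, rng))
    print("IAM:", mutate_iam({0, 1, 5}))
```

Under SMM and TPM the new allele stays close to its parent, so allele size carries ancestry information. Under KAM and IAM the new state says nothing about where it came from.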

9

Explicit mutation models

Give extra weight to relation among alleles

SMM/TPM is considered “best for microsatellite studies” rather than IAM.

“The Two-Phase Model (TPM) was used since it’s more appropriate and realistic for microsatellites (Luikart & Cornuet 1998; Piry et al. 1999)”

Whenever we can use explicit models, we tend to think we should use them because we can.

“Parametric tests are better for experimental studies, non-parametric tests are better for field surveys”

10

Explicit mutation models

Is this extra information (evolutionary relations among alleles) informative for the biological question asked?

Time & space: recent small-scale versus old and large scale processes?

11

Explicit model or not?

Landscape with three pops

Phylogeography: explicit MM

Current landscape genetics & gene flow: implicit MM

12

T=0: 3 phylogroups

13

T=1: 4 populations founded from different sources (admixed populations)

In the admixed populations we integrate allele histories that originated in the source populations. Is this informative for the tested hypothesis?

14

Conflicts between population history and allele history

Assumption of EMM in population genetics

Mutation is the prime source of genetic diversity in a population

Population history // allele history

Spatial and temporal scale of the study: new allele may (not) have entered (meta)population through migration.

Allele ancestry confuses signal of migration

Allele ancestry represents more ancient process of mutation that happened elsewhere

Explicit MM wrongly assumes that immigrant allele has local common ancestor with other resident alleles

We cannot distinguish between mutation and migration as the causal process in the make-up of the genetic structure of populations, but wrongly assume that it was mutation

Equivalence to homoplasy: the allele is identical by state, but not identical by descent in the sense that it did not descend from another allele within the population, but from outside the population.

15

Implicit mutation models

No assumption related to ancestry of alleles

Do not make assumptions on origin of diversity: m or µ

Choice of mutation model essential in tests for deviations from equilibrium

16

Detecting selection based on deviation from mutation-selection equilibrium

Tajima’s D (1989a), Fu’s Fs (1997)

Watterson’s neutrality test (1975, 1977, 1978)

Assumes constant Ne, µ and m
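As a reminder of what such a test actually computes, here is a minimal sketch of Tajima's D from summary statistics, using the standard constants from Tajima (1989). The function and argument names are hypothetical; `n` is the number of sequences, `s` the number of segregating sites, and `pi` the mean number of pairwise differences.

```python
import math


def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s and mean pairwise differences pi."""
    if n < 2 or s < 1:
        raise ValueError("need at least 2 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1.0))


if __name__ == "__main__":
    print(round(tajimas_d(n=10, s=16, pi=3.888889), 3))
```

For n = 10, s = 16 and pi = 3.888889 the result is about -1.45. A negative value only says that the two diversity estimators disagree; whether that reflects selection, expansion or migration depends on which of Ne, µ and m were actually constant.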

17

Detection of Ne change (expansion/contraction)

Test for deviation from mutation-drift equilibrium

Same principle, different assumptions

Tajima’s D (1989b), Fu’s Fs (1997), Nei et al. (1975), Chakraborty & Nei (1977), Cornuet & Luikart (1996), etc.

...

Assumes constant µ, m=0, S=0

Who ever did a bottleneck test?

[Figure: counts shown on the slide; labels not recoverable from the transcript: 949, 1856, 272, 4928+6212160]

19

Ne change: Bottleneck

Tests for population contraction / expansion are tests for deviations from mutation-drift equilibrium

Assumption: m = 0, s = 0. Population was in MSMD (mutation-selection-migration-drift) equilibrium prior to disturbance

Disturbance → disequilibrium → march to new equilibrium

Time to new equilibrium is a function of µ and N, assuming m = 0 and s = 0

Crow & Aoki (1984)

Cornuet & Luikart (1996)

Piry, Luikart & Cornuet (1999)

20

“Bottleneck” tests with migration

If gene flow (“migration”) is clearly the primary source of genetic diversity: m >> µ

If markers are neutral (most anonymous markers are, on average)

mutation-selection-migration-drift reduces to migration-drift

→ Test for migration-drift equilibrium

→ change in Ne or m

Evolutionary relationships among alleles add NOISE to the data

→ explicit mutation models are useless

21

“Bottleneck” tests with migration

Broquet et al. (2010) simulated sudden drops in m in equilibrium populations

“signals akin to genetic bottlenecks”

“excess in gene diversity relative to mutation-drift equilibrium”

Actually a deviation from migration-drift equilibrium

Change in landscape genetic structure (fragmentation!)

22

Considering assumptions

Bottleneck tests & related tests can be used to test for deviation from equilibrium:

selection-mutation

mutation-drift

migration-drift

Caveat: underlying assumptions change

23

Considering assumptions

Are you sure you know which deviation you are testing for?

Decision tool for demography change test:

If µ << m → migration-drift → implicit model!

If µ >> m → mutation-drift

no recent or past admixture → explicit model

with recent or past admixture, or no information → implicit model

Test for admixture using phylogeographic data exploration: mtDNA & nDNA, individual-based clustering approaches, population homogeneity tests, …

Hypothesis should drive the sampling design, mutation model, and then marker design
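The decision tool on this slide can be written down as a small function. This is a sketch under stated assumptions: the slide only says "<<" and ">>", so the `ratio` threshold of 100 is a hypothetical cut-off, and the fallback for comparable µ and m (implicit model, equilibrium marked as undetermined) is an assumption not covered by the slide. `admixture` is `True`, `False`, or `None` when no information is available.

```python
from typing import Optional, Tuple


def choose_mutation_model(
    mu: float,
    m: float,
    admixture: Optional[bool],
    ratio: float = 100.0,  # hypothetical threshold for "<<" and ">>"
) -> Tuple[str, str]:
    """Return (equilibrium being tested, mutation model family)."""
    if m >= ratio * mu:
        # Migration dominates: allele ancestry is noise
        return ("migration-drift", "implicit")
    if mu >= ratio * m:
        if admixture is False:
            # Mutation dominates and allele history matches population history
            return ("mutation-drift", "explicit")
        # Admixture present, or simply unknown
        return ("mutation-drift", "implicit")
    # Assumption: comparable rates are not covered by the slide
    return ("undetermined", "implicit")


if __name__ == "__main__":
    print(choose_mutation_model(mu=1e-4, m=0.05, admixture=None))
    print(choose_mutation_model(mu=1e-4, m=0.0, admixture=False))
    print(choose_mutation_model(mu=1e-4, m=0.0, admixture=None))
```

Note that the explicit model is returned in only one branch, and that branch requires positive evidence against admixture rather than a lack of information.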

24

Fishing for patterns…

“To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of”

R.A. Fisher

25

Who considers assumptions?

Who chooses mutation model based on question?

1/20 mentions assumption of no migration

0/20 attempted to check assumption of no migration

17/20 have probable migration among tested pops

16/20 chose mutation model based on marker type

3/20 test all models and cherry-pick afterwards

1/17 using SMM or TPM has evidence that µ >> m in the genetic make-up before and after the putative bottleneck

1/20 did not violate the migration assumption, and chose the MM that least likely violated other assumptions

Screening of 20 most recent studies citing Piry et al. (1999) and looking for population bottlenecks

26

Who considers assumptions?

Who chooses mutation model based on question?

Referee = author of other paper

Statistical machismo: complex models = better, long computation times = better

but by all means, don’t bother about the assumptions

Screening of 20 most recent studies citing Piry et al. (1999) and looking for population bottlenecks

27

Phylogeography and recent genetic structure in an expanding damselfly

Swaegers et al. JEB, nearly there

mtDNA whole range: EMM

Expansion (Tajima’s D) after LGM

µsats: per phylocluster test for expansion/contraction, TPM

W-EU: Expansion

E-EU: Expansion

N-Afr: inconclusive (no effect)

In recent range: IAM

No evidence of disequilibrium

28

“Bottleneck” tests with migration

Lower Nem does not directly affect rate of local drift, but impacts how much drift is compensated by gene flow

Connected pops, same local size

Experienced Ne is larger due to compensation of drift through gene flow

Isolated pops

Square size ~ Ne

Total metapop Ne size = sum

Total metapop size still the same

Average proportion of Ne not shared among pops = 1-FST

“Bottleneck” tests with migration

Relation between Fst and gene flow is hyperbolic

Change in Fst per extra unit of gene flow largest around Nm=1

Nm=1 is considered minimal required gene flow for functional connectivity

A small change in Nm around the minimal threshold yields a comparably large deviation from migration-drift equilibrium

Useful in conservation genetics to detect genetic extinction debt / early warning for change in demography (N and m)

Only with IAM
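The hyperbolic relation can be tabulated quickly. The sketch below assumes Wright's island-model approximation, FST ≈ 1 / (4Nm + 1), which the slides do not state explicitly; the function name is hypothetical.

```python
def fst_island(nm):
    """Equilibrium FST under Wright's island-model approximation (an assumption here)."""
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)


if __name__ == "__main__":
    for nm in (0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"Nm = {nm:>5}: FST = {fst_island(nm):.3f}")
```

Under this approximation FST drops steeply while Nm is at or below a few migrants per generation and flattens out above that, so the same absolute change in Nm moves FST far more near the connectivity threshold than in well-connected populations.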

30

Thanks for the attention, & thanks to

Organizers, for letting me fill an orphaned slot

Janne Swaegers, Joost Vanoverbeke, Marc Ventura, Joost Raeymaekers

Luc De Meester (KULeuven), Maurice Hoffmann (INBO)

Water fleas, for being awesome model organisms