Genomic Conflict and DNA Sequence Variation
Marcy K. Uyenoyama
Department of Biology
Duke University
Overview
• Population genetics
– Historically model-rich
– Present need: model-based interpretation of observed patterns of genomic variation
– What are hallmarks of each model?
• Self-incompatibility systems in plants
– Recognizing genomic conflict due to sexual antagonism
Canonical models
• Neutral evolution
– Pure neutrality: distribution of offspring number is independent of any trait in parent
– Demographic history: deme founding, gene flow
– Purifying selection: maintain functioning state against random deleterious mutations
• Selection
– Balancing selection: maintenance of different forms
– Selective sweeps: substitution of most fit for less fit
Hallmarks of evolution
• How do we know it when we see it?
– Patterns evident in genome variation
• Model selection
– Choosing among a small number of canonical models for any particular system
A random sample of genes
Ancestral sequence
Sample
Observed
Site frequency spectrum
Allele and mutation spectra
[Figure: histogram of number of mutations (y-axis, 0 to 7) against multiplicity (x-axis, 1 to 6 and 17)]
a = {a_1 = 6, a_3 = 1, a_5 = 1, a_6 = 1}, for a_i the number of alleles with multiplicity i
The neutral coalescent
• Sample root from stationary distribution of P, the mutation transition matrix, and bifurcate
• After an interval $t \sim \mathrm{Exp}(\lambda_1 + \lambda_2)$, choose a lineage at random
– Replace it by two identical copies with probability $\lambda_1 / (\lambda_1 + \lambda_2)$
– Mutate it according to P with probability $\lambda_2 / (\lambda_1 + \lambda_2)$
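Evolutionary rates
• Events on level k
– Bifurcation at rate $\lambda_1 = \binom{k}{2} \frac{1}{N}$
– Mutation at rate $\lambda_2 = ku$
• Population parameters: ratios of rates
– Next event is a bifurcation/coalescence with probability
$\frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{\binom{k}{2}/N}{\binom{k}{2}/N + ku} = \frac{k-1}{k-1+2Nu} \to \frac{k-1}{k-1+\theta}$, for $\theta = \lim 2Nu$ as $1/N, u \to 0$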
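As a small numerical check of the ratio of rates on level k, the sketch below computes the probability that the next event is a bifurcation/coalescence, once from the two rates and once from θ = 2Nu. It assumes N counts genes, so that the bifurcation rate is binom(k, 2)/N; the function names and the parameter values are illustrative only.

```python
def coalescence_probability_from_rates(k, N, u):
    """P(next event on level k is a bifurcation), from the two event rates."""
    bifurcation_rate = k * (k - 1) / 2 / N   # binom(k, 2) / N (assumed form)
    mutation_rate = k * u
    return bifurcation_rate / (bifurcation_rate + mutation_rate)


def coalescence_probability(k, theta):
    """Same probability written with the population parameter theta = 2Nu."""
    return (k - 1) / (k - 1 + theta)


if __name__ == "__main__":
    N, u = 10_000, 1e-4          # illustrative values only
    for k in range(2, 7):
        print(k,
              coalescence_probability_from_rates(k, N, u),
              coalescence_probability(k, 2 * N * u))
```

Only the product Nu enters the result, which is the sense in which population parameters are ratios of rates.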
Infinite-alleles model
• Mutation
– Novel allelic types formed at rate u per gene per generation
• Reproduction
– Frequency of allele i in the parental population: p_i
– Multinomial sampling of N genes to form the offspring
• To find: probability of the sample of n genes, (n_1, n_2, …, n_k) or (a_1, a_2, …, a_n), for
– k the number of distinct haplotypes (alleles)
– n_i the number of replicates of allele i
– a_i the number of alleles with i replicates
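The two descriptions of a sample, replicate counts (n_1, …, n_k) and multiplicities (a_1, …, a_n), carry the same information. A minimal sketch of the conversion follows; the function name is hypothetical, and the example counts are chosen to reproduce a = {a_1 = 6, a_3 = 1, a_5 = 1, a_6 = 1}.

```python
from collections import Counter


def multiplicity_spectrum(allele_counts):
    """Convert replicate counts (n_1, ..., n_k) into (a_1, ..., a_n).

    a_i is the number of alleles represented by exactly i replicates,
    and n = n_1 + ... + n_k is the sample size.
    """
    n = sum(allele_counts)
    tally = Counter(allele_counts)
    return [tally.get(i, 0) for i in range(1, n + 1)]


if __name__ == "__main__":
    counts = [1, 1, 1, 1, 1, 1, 3, 5, 6]   # six singletons plus three larger classes
    a = multiplicity_spectrum(counts)
    print({i: a_i for i, a_i in enumerate(a, start=1) if a_i})
```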
Ewens sampling formula
$p(\mathbf{a}) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{i=1}^{n} \left(\frac{\theta}{i}\right)^{a_i} \frac{1}{a_i!}$
a = (a_1, a_2, …, a_n), for a_i the number of alleles represented by i replicates in a sample of size n
θ = 2Nu, for N the effective number of genes and u the per-locus, per-generation rate of mutation
Ewens (1972, Theoretical Population Biology)
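A direct evaluation of the Ewens sampling formula, p(a) = n! / [θ(θ+1)⋯(θ+n−1)] ∏ (θ/i)^{a_i} / a_i!, can be sketched as below. The function names are illustrative; the example checks that the probabilities of the three possible configurations of a sample of size 3 sum to one.

```python
from math import factorial, prod


def rising_factorial(theta, n):
    """theta (theta + 1) ... (theta + n - 1)."""
    return prod(theta + j for j in range(n))


def esf_probability(a, theta):
    """Ewens sampling formula for a = (a_1, ..., a_n)."""
    n = sum(i * a_i for i, a_i in enumerate(a, start=1))
    weight = prod((theta / i) ** a_i / factorial(a_i)
                  for i, a_i in enumerate(a, start=1))
    return factorial(n) / rising_factorial(theta, n) * weight


if __name__ == "__main__":
    theta = 1.5   # illustrative value
    # Sample of size 3: three singletons, one singleton plus one pair, one triple.
    configs = [(3, 0, 0), (1, 1, 0), (0, 0, 1)]
    print([esf_probability(a, theta) for a in configs])
    print(sum(esf_probability(a, theta) for a in configs))
```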
Population genomics
http://www.arabidopsis.org
• About 750 accessions isolated from natural populations worldwide
• Summary statistics for sample of 19 entire genomes
Arabidopsis SNP spectra
Kim et al. (2008 Nature Genetics. 39: 1151)
Site frequency spectra differ among functional classes
[Figure: site frequency spectra by functional class; x-axis: minor allele counts, 2 to 8]
ESF conditioned on two alleles
• Biallelic sample of size m
• Multiplicities i and (m – i)
$P(K = 2 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=2}^{m} \frac{\theta}{j-1}$
$P(a_i = 1, a_{m-i} = 1 \mid K = 2, m) = \frac{1/i + 1/(m-i)}{\sum_{j=1}^{m-1} 1/j}$ for $i < m/2$
$P(a_{m/2} = 2 \mid K = 2, m) = \frac{2/m}{\sum_{j=1}^{m-1} 1/j}$
independent of θ!
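The conditional spectrum for a biallelic sample can be tabulated without any knowledge of θ. The sketch below uses the expressions [1/i + 1/(m − i)] / Σ 1/j for i < m/2 and (2/m) / Σ 1/j for i = m/2, and confirms that the probabilities sum to one; the function name is hypothetical.

```python
def two_allele_spectrum(m):
    """P(minor allele count = i | K = 2, m) for i = 1, ..., floor(m / 2)."""
    harmonic = sum(1 / j for j in range(1, m))   # sum_{j=1}^{m-1} 1/j
    spectrum = {}
    for i in range(1, m // 2 + 1):
        if 2 * i == m:
            spectrum[i] = (2 / m) / harmonic
        else:
            spectrum[i] = (1 / i + 1 / (m - i)) / harmonic
    return spectrum


if __name__ == "__main__":
    for m in (19, 20):
        spectrum = two_allele_spectrum(m)
        print(m, round(sum(spectrum.values()), 12), spectrum[1])
```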
Actual site frequency spectra
• Excess of rare and common types, deficiency of intermediate types
• Data from NIEHS Environmental Genome Project
– Direct resequencing of loci considered environmentally-sensitive
– Global representation of ethnicities
Hernandez, Williamson, and Bustamante (2007)
Spectrum shape
[Figure legend: Black: constant population size; Grey: recent expansion from small population size]
Braverman et al. (1995)
• Signature of expansion?
– Expansions maintain more rare mutations
• Signature of selective sweep?
– Neutral variants experience selection as a population bottleneck
Modelling a SNP data set
• Single segregating mutation in the sample genealogy
– Conditional on exactly one segregating site, determine the distribution of the size (number of descendants) of the branch on which the mutation occurs
• Exactly two alleles in the sample
– Conditional on two haplotypes, bearing any number of segregating sites, determine the distribution of numbers of the two alleles
Nordborg (2001 Handbook of Statistical Genetics)
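Conditioning
• Two alleles
$P(K = 2 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=2}^{m} \frac{\theta}{j-1}$
• One segregating site
$P(S = 1 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=2}^{m} \frac{\theta}{j-1+\theta}$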
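The two conditioning events differ only in the summand: θ/(j − 1) for exactly two alleles against θ/(j − 1 + θ) for exactly one segregating site. A minimal sketch comparing them, with illustrative function names and parameter values:

```python
from math import prod


def common_product(m, theta):
    """prod_{l=1}^{m-1} l / (l + theta)."""
    return prod(l / (l + theta) for l in range(1, m))


def prob_two_alleles(m, theta):
    """P(K = 2 | m): exactly two alleles in a sample of size m."""
    return common_product(m, theta) * sum(theta / (j - 1) for j in range(2, m + 1))


def prob_one_segregating_site(m, theta):
    """P(S = 1 | m): exactly one segregating site in a sample of size m."""
    return common_product(m, theta) * sum(
        theta / (j - 1 + theta) for j in range(2, m + 1))


if __name__ == "__main__":
    m, theta = 19, 0.5   # illustrative values only
    print(prob_two_alleles(m, theta), prob_one_segregating_site(m, theta))
```

For any θ > 0 the second probability is the smaller one, since two alleles can differ at more than one site.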
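Multiplicity conditioned on a SNP
• Single segregating site in a sample of size m
$P(S = 1 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=2}^{m} \frac{\theta}{j-1+\theta}$
• Multiplicity i
$f(i \mid m, \theta) = \frac{\sum_{l=2}^{m-i+1} \frac{l-1}{i} \binom{m-l}{i-1} \binom{m-1}{i}^{-1} \frac{1}{l-1+\theta}}{\sum_{j=2}^{m} \frac{1}{j-1+\theta}}$
dependent on θ!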
Ganapathy and Uyenoyama (2009 Theoretical Population Biology)
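To see the dependence on θ numerically, the sketch below evaluates one standard way of writing the multiplicity distribution of a single mutation: the mutation falls on level l with weight 1/(l − 1 + θ), and a lineage on level l has i descendants in the sample with probability [(l − 1)/i] binom(m − l, i − 1) / binom(m − 1, i). This form is an assumption made for illustration, not a transcription of the cited paper; in the limit θ → 0 it reduces to (1/i) / Σ 1/j.

```python
from math import comb


def snp_multiplicity(i, m, theta):
    """f(i | m, theta): P(mutant multiplicity = i | one segregating site)."""
    numerator = sum(
        (l - 1) / i * comb(m - l, i - 1) / comb(m - 1, i) / (l - 1 + theta)
        for l in range(2, m - i + 2)
    )
    denominator = sum(1 / (j - 1 + theta) for j in range(2, m + 1))
    return numerator / denominator


if __name__ == "__main__":
    m = 10
    for theta in (0.0, 1.0, 10.0):   # illustrative values only
        spectrum = [snp_multiplicity(i, m, theta) for i in range(1, m)]
        print(theta, round(sum(spectrum), 12), round(spectrum[0], 4))
```

Larger θ shifts weight toward levels with many lineages, and hence toward rare mutations.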
Overview
• Population genetics
– Historically model-rich
– Present need: model-based interpretation of observed patterns of genomic variation
– What are hallmarks of each model?
• Self-incompatibility systems in plants
– Recognizing genomic conflict due to sexual antagonism
Genomic conflict
• Phenotypes
– Multiple genes generally influence a given phenotype
• Conflict
– Target trait value differs among genes that control phenotype
– Sexual antagonism
  – Male and female function collaborate in reproduction
  – Genes influencing each function may come into conflict
Conflict and genomic variation
• Mating type regions as a battleground
– S-locus controls self-incompatibility in flowering plants
– How does sexual antagonism affect the pattern of molecular-level variation within the S-locus?
– What are hallmarks of conflict?
• Develop a basis for inference
– Model-based approach to the analysis of genetic variation
Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg
• Flower development
– Basic perfect flower includes both male and female components
• Fertilization
– Pollen grains deposited on stigma germinate and pollen tubes grow down style to the ovary
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png
Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg
• Gametophytic SI (GSI)
– Specificity expressed by individual pollen grain or tube determined by own S-allele
• Pollen rejection
– Growth of pollen tube arrested in style
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png
Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg
• Sporophytic SI (SSI)
– Specificity expressed by individual pollen grain or tube determined by the S-locus genotype of its parent
• Pollen rejection
– Germination of pollen grain may be arrested at stigma surface
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png
Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg
[Figure: S-haplotype S_n, comprising components A_n and B_n]
• Pistil (A) component: rejection of recognized specificities
• Pollen (B) component: declaration of specificity
Mating type regions
Uyenoyama (2005)
Human Y chromosome
Skaletsky et al. (2003 Nature 423: 825)
• Non-recombining male-specific Y (MSY)
– Euchromatic region ~23 Mb
– Differences between two random Ys every 3 – 4 kb
• Mammalian sex determinant SRY
– Y-linked regulator of transcription of many male-specific Y-linked genes
Mating type regions
Uyenoyama (2005)
Linkage between pistil (A) and pollen (B) components is essential to SI function
• Pollen: declaration of specificity
• Pistil: rejection of recognized specificities
Brassica S-locus
[Figure labels: Pollen component; Pistil component]
Nasrallah (2000 Curr. Opin. Plant Biol.)
Natural populations often contain 30 – 50 S-alleles
Vierstra (2009, Nature Reviews Molecular Cell Biology)
Ubiquitin tags proteins for degradation
• Style: S-RNase disrupts pollen tube growth
– Upon entering a pollen tube, S-RNases initially sequestered in a vacuole
– In incompatible crosses, vacuole breaks down, releasing S-RNases into cytoplasm of pollen tube
• Pollen: SLF (S-locus F-box)
– Mediator of ubiquitinylation (attachment of ubiquitin)
– Disables all S-RNases except those of the same specificity
Sexual antagonism
• Pistil: why reject fertilization?
– Screening of potential mates may improve offspring quality
– Cost under incomplete reproductive compensation: ovules may go unfertilized
• Pollen: why provoke rejection?
– Self-rejection may improve quality of own ovules
– Rejection by other plants reduces siring success
  – Hide behind another S-specificity in sporophytic SI?
  – Decline to declare S-specificity altogether?
GSI model
• Basic discrete time recursion
$P'_{ij} = \frac{1}{2} \left[ q_i \sum_{k \ne i,j} \frac{P_{jk}}{1 - q_j - q_k} + q_j \sum_{k \ne i,j} \frac{P_{ik}}{1 - q_i - q_k} \right]$
Wright (1937, Genetics)
• Symmetries in genotype and allele frequencies
– Model change in frequency of focal allele i, assuming all other alleles in equal frequency
$P_{ij} = P$ for $j \ne i$, $\quad P_{jk} = [1 - (n-1)P] / \binom{n-1}{2}$ for $j, k \ne i$
$q_i = q = (n-1)P/2$, $\quad q_j = (1-q)/(n-1)$ for $j \ne i$
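One generation of the recursion can be sketched as follows, with genotype frequencies stored for unordered pairs of alleles and pollen i accepted only by styles that lack allele i. Under the symmetric starting point (focal allele at frequency q, all others equally frequent) the change in the focal allele's frequency should be q(1 − nq)/(n − 3 + 2q), which is the quantity printed for comparison. All function names are hypothetical and the parameter values are illustrative.

```python
from itertools import combinations


def allele_frequencies(P, n):
    """Allele frequencies q_i from genotype frequencies P[(i, j)], i < j."""
    q = [0.0] * n
    for (i, j), freq in P.items():
        q[i] += freq / 2
        q[j] += freq / 2
    return q


def gsi_step(P, n):
    """One generation of the GSI recursion (all plants heterozygous, n >= 3)."""
    q = allele_frequencies(P, n)

    def style(a, b):
        return P[(min(a, b), max(a, b))]

    new_P = {}
    for i, j in combinations(range(n), 2):
        total = 0.0
        for k in range(n):
            if k == i or k == j:
                continue
            # pollen i accepted by style jk; pollen j accepted by style ik
            total += q[i] * style(j, k) / (1 - q[j] - q[k])
            total += q[j] * style(i, k) / (1 - q[i] - q[k])
        new_P[(i, j)] = total / 2
    return new_P


def symmetric_start(n, q):
    """Focal allele 0 at frequency q; all other alleles equally frequent."""
    P_focal = 2 * q / (n - 1)
    P_other = (1 - (n - 1) * P_focal) / ((n - 1) * (n - 2) / 2)
    return {(i, j): (P_focal if i == 0 else P_other)
            for i, j in combinations(range(n), 2)}


if __name__ == "__main__":
    n, q = 5, 0.1
    P_next = gsi_step(symmetric_start(n, q), n)
    print(allele_frequencies(P_next, n)[0] - q, q * (1 - n * q) / (n - 3 + 2 * q))
```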
Diffusion approximation
• Change in allele frequency
$\Delta q = \frac{q(1-nq)}{n-3+2q} \approx \frac{nq(1-nq)}{(n-1)(n-2)}$ for $q \approx 1/n$, for n the number of common S-alleles
Wright (1937, Genetics)
• Diffusion equation coefficients
$\mu(x) = \frac{nx(1-nx)}{(n-1)(n-2)} - ux$
$\sigma^2(x) = \frac{x(1-2x)}{2N}$
holds for large population size (N) and u (rate of mutation to new S-alleles) of order 1/N
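A short numerical comparison of the exact change in allele frequency, q(1 − nq)/(n − 3 + 2q), with its approximation near q = 1/n, together with the two diffusion coefficients. The symbol names mu and sigma2 are a notational assumption here, and the parameter values are illustrative.

```python
def delta_q_exact(q, n):
    """Change in frequency of the focal allele under the symmetric GSI model."""
    return q * (1 - n * q) / (n - 3 + 2 * q)


def delta_q_approx(q, n):
    """Approximation valid for q close to 1/n."""
    return n * q * (1 - n * q) / ((n - 1) * (n - 2))


def mu(x, n, u):
    """Mean change per generation, including loss by mutation at rate u."""
    return n * x * (1 - n * x) / ((n - 1) * (n - 2)) - u * x


def sigma2(x, N):
    """Variance in change per generation for population size N."""
    return x * (1 - 2 * x) / (2 * N)


if __name__ == "__main__":
    n, N, u = 10, 1000, 1e-5   # illustrative values only
    for q in (0.01, 0.09, 0.10, 0.11):
        print(q, delta_q_exact(q, n), delta_q_approx(q, n), mu(q, n, u), sigma2(q, N))
```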
Wright’s diffusion model
[Figure: number of S-alleles (y-axis) against frequency in population (x-axis)]
• Diffusion with jumps
$\mu(x) = \frac{nx(1-nx)}{(n-1)(n-2)} - ux$
$\sigma^2(x) = \frac{x(1-2x)}{2N}$
• Turnover rate
$\lambda = \frac{4Nun}{(n-1)(n-2)}$
Takahata (1993, Mechanisms of Molecular Evolution)
Expansion of time scale under balancing selection
• High rate of invasion of rare alleles
– Promotes invasion of new and retention of rare types
– Maintains high numbers of alleles
• Genealogical relationships
– Tree shape similar under symmetric balancing selection and neutrality
– Greatly expanded time scale
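S-allele turnover
• Quasi-equilibrium of S-alleles
– Invasion of new, rare S-alleles balanced by extinction of common S-alleles
• Expansion of time scale
– Rate of divergence among S-allele classes similar to rate among neutral lineages, but in a population of size fN:
$\binom{j}{2} \frac{1}{2Nf} = \lambda \left(1 - \frac{1}{n}\right) \binom{j}{2} \Big/ \binom{n}{2}$, so that $f = \frac{n^2}{4N\lambda} = \frac{n(n-1)(n-2)}{16N^2u}$, for $\lambda = \frac{4Nun}{(n-1)(n-2)}$ the turnover rate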
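The expansion of the time scale can be illustrated numerically. The sketch below assumes a turnover rate λ = 4Nun/((n − 1)(n − 2)) and equates the neutral coalescence rate in a population of size fN, binom(j, 2)/(2Nf), with λ(1 − 1/n) binom(j, 2)/binom(n, 2); both the form of λ and this matching argument are assumptions of the example, and n is treated as a given input rather than derived from N and u.

```python
def turnover_rate(N, u, n):
    """Assumed rate at which new S-alleles replace old ones, per generation."""
    return 4 * N * u * n / ((n - 1) * (n - 2))


def scaling_factor(N, u, n):
    """f such that S-allele classes coalesce like neutral lineages in size fN."""
    return n ** 2 / (4 * N * turnover_rate(N, u, n))


if __name__ == "__main__":
    N, u, n = 1000, 1e-5, 20   # illustrative values only
    print(turnover_rate(N, u, n), scaling_factor(N, u, n))
```

With these illustrative values f is well above one, which corresponds to the greatly expanded time scale of S-allele genealogies relative to neutral ones.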
Gametophytic SI models
• Basic discrete time recursion
$P'_{ij} = \frac{1}{2} \left[ q_i \sum_{k \ne i,j} \frac{P_{jk}}{1 - q_j - q_k} + q_j \sum_{k \ne i,j} \frac{P_{ik}}{1 - q_i - q_k} \right]$
• Diffusion approximation
$\mu(x) = \frac{nx(1-nx)}{(n-1)(n-2)} - ux$
$\sigma^2(x) = \frac{x(1-2x)}{2N}$
Parameters:
– Effective population size (N)
– Rate of mutation to new S-specificities (u)
Simulation results
• Stationary distribution of allele frequency
– Most time spent close to deterministic equilibrium (1/n) or in boundary layer close to extinction
• Number of S-alleles
– Analytical expectation for number of common S-alleles
Vallejo-Marín and Uyenoyama (2008)
Pollen specificity in GSI
• Each pollen expresses its own specificity
– Rarer specificities are incompatible with fewer plants
• Incompatible matings
– For n S-alleles in equal frequencies, a pollen type is incompatible with a proportion 2/n of all plants
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png
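The proportion 2/n can be checked by enumeration. The sketch assumes that every plant is heterozygous and that, with n alleles in equal frequencies, all binom(n, 2) genotypes are equally common; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def incompatible_fraction_gsi(n):
    """Fraction of plants that reject pollen carrying allele 0 under GSI."""
    genotypes = list(combinations(range(n), 2))      # equally frequent heterozygotes
    rejecting = sum(1 for genotype in genotypes if 0 in genotype)
    return Fraction(rejecting, len(genotypes))


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, incompatible_fraction_gsi(n), Fraction(2, n))
```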
Fate of style-part mutant
S_a: A_{n+1} B_n
[Figure: regions of full SC, polymorphism, and full SI; x-axis: self-pollen fraction (s), 0.0 to 1.0; y-axis: relative viability of inbred offspring, 0.0 to 1.0]
Fate of pollen-part mutant
S_b: A_n B_{n+1}
[Figure: regions of full SC, polymorphism, full SI, and disruption, for n = 10; x-axis: self-pollen fraction (s), 0.0 to 1.0; y-axis: relative viability of inbred offspring, 0.0 to 1.0]
Uyenoyama, Zhang, and Newbigin (2001)
Direction of pollen flow
[Figure: haplotypes S_n (A_n B_n), S_a (A_{n+1} B_n), S_b (A_n B_{n+1}), and S_{n+1} (A_{n+1} B_{n+1})]
Uyenoyama, Zhang, and Newbigin (2001)
[Figure: haplotypes S_n (A_n B_n), S_a (A_{n+1} B_n), S_b (A_n B_{n+1}), and S_{n+1} (A_{n+1} B_{n+1}), with annotated pathways]
• TURN OFF: partial breakdown of SI by pollen disablement
• TURN ON: restoration of SI by stylar recognition
• Two pathways labelled "Evolutionarily unlikely"
Uyenoyama, Zhang, and Newbigin (2001)
Joint genealogies
Newbigin, Paape, and Kohn (2008)
Unlike S-RNase genes, SLF genes show
– Low divergence between allelic types
– No trans-specific sharing of lineages
[Figure panels: Solanaceae and Plantaginaceae; Rosaceae]
Cycles of loss/restoration of SI?
• Family-specific genealogies
– Rosaceae: do highly-diverged, ancient SFB lineages reflect continuous operation or restoration of same F-box genes?
– Solanaceae, Plantaginaceae: Recruitment of new F-box genes?
• Turnover of pollen-specificity loci
– Expression and recognition of a paralogue of the former pollen specificity gene?
– Can homologues be distinguished from paralogues with new function?
An appeal for inference methods
• Sexual antagonism in mating type regions
– Neutral variation in linked regions
– Rates of substitution at determinants of mating type
• Inference
– Goal: use the pattern of variation in population samples of genomic regions as a basis for inference about the evolutionary process
– Detection
  • genomic conflict and other forms of selection
  • mating systems and population structure
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png
Pollen specificity in SSI
• Codominance
– Both specificities expressed
– Almost twice as many incompatible styles under SSI as under GSI for same number of S-alleles
• Complete dominance
– One specificity expressed
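The "almost twice" comparison under codominance can be made concrete by enumeration. The sketch assumes n alleles in equal frequencies, all plants heterozygous with equally frequent genotypes, and a style that rejects pollen whenever either specificity expressed by the pollen matches either style allele; under GSI the corresponding proportion is taken to be 2/n. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def incompatible_fraction_ssi(n):
    """Fraction of styles rejecting pollen from a parent of genotype (0, 1)."""
    styles = list(combinations(range(n), 2))         # equally frequent heterozygotes
    rejecting = sum(1 for style in styles if 0 in style or 1 in style)
    return Fraction(rejecting, len(styles))


def ssi_to_gsi_ratio(n):
    """Incompatible proportion under codominant SSI relative to GSI (2/n)."""
    return incompatible_fraction_ssi(n) / Fraction(2, n)


if __name__ == "__main__":
    for n in (5, 10, 50):
        print(n, incompatible_fraction_ssi(n), float(ssi_to_gsi_ratio(n)))
```

Under these assumptions the ratio is (2n − 3)/(n − 1), which approaches but never reaches 2.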
SRK genealogies
Edh, Widén and Ceplitis (2009)
• Sporophytic SI
– Diploid genotype of pollen parent determines S-specificity of each pollen grain
– Class I is dominant over Class II, with codominance within class
• Class II: pollen-recessive
– Lower number of segregating alleles, each with relatively higher frequency in population
– Greater genealogical relationship within class?
Is class II younger than class I?
Uyenoyama (1995)
• MRCA ages
– Class I: 25.5 ± 8.1 MY
– Class II: 3.1 ± 0.9 MY
– I/II: 41.4 ± 12.7 MY
• Origin of SLG/SRK system
– 42.1 ± 9.0 MY