population structure, heritability, and polygenic risk · 10/17/2016  · europeans east asians...

Population structure, heritability, and polygenic risk Alicia Martin Daly Lab October 18, 2016 @genetisaur [email protected]


Page 1: Population structure, heritability, and polygenic risk · 10/17/2016  · Europeans East Asians Africans South Asians Admixed Americas. Varying admixture proportions across populations

Population structure, heritability, and polygenic risk

Alicia Martin

Daly Lab

October 18, 2016

@genetisaur [email protected]

Page 2

Project goals

✓ Call local ancestry in large case/control PTSD cohort of African Americans

• Estimate heritability using local ancestry tracts. Compare this estimate with SNP-based heritability in this cohort and in a European cohort (in progress)

• Perform admixture mapping

• Considerations: transferability of polygenic risk scores, cross-population heritability

(Work with Karestan Koenen, Mark Daly, Laramie Duncan, Caroline Nievergelt)

Page 3

Data overview

Study | PI | Analyst | N Total | N AA | Data label

1 GTP (Grady Trauma Project) | Kerry Ressler | Lynn Almli | 4752 | 3492 | gt2y

2 Detroit (DNHS) | Monica Uddin | Guia Guffanti | 812 | 650 | dnhy

3 Genetics of Substance Dependence | Joel Gelernter | Pingxing Xie | 5451 | 3100 | gsdy

4 Marine Resilience Study | Caroline Nievergelt / Dewleen Baker | Adam Maihofer | 4036 | 226 | mrsy

5 Family Study of Cocaine Dependence | Laura Bierut | Louis Fox | 1271 | 653 | fscy

6 COGEND | Laura Bierut | Louis Fox | 2768 | 711 | cogy

7 Nurses Health Study | Karestan Koenen | Andrew Ratanatharathorn | 1378 | |

8 Stein South Africa | Dan Stein / Kerry Ressler | Lynn Almli | 434 | |

9 Ohio National Guard | Israel Liberzon | Tony King | 239 | |

Summary statistics from imputed data

10 Duke | J. Beckham / M. Hauser / A. Ashley-Koch | Melanie Garrett | 1963 | |

11 National Center for PTSD (Boston) | Mark Miller / Mark Logue | Mark Logue | 652 | |

Total | | | 23,756 | 8,832 |

Page 4

Local ancestry calling strategy

1. Merge intersecting genotyped SNPs (N=421,607 with MAF > 0.05)

2. Phase the aggregated dataset with HAPI-UR three times and take the best combined phase

3. Split jointly phased haplotypes into reference + 50 sets of admixed samples for computational feasibility

4. Aggregate local ancestry calls across all runs

5. Collapse local ancestry output

[Figure: workflow diagram with numbered steps. (1) gt2y + dnhy + gsdy + mrsy + fscy + cogy + YRI + CEU are merged into AA + reference genotypes. (2) AA + reference jointly phased haplotypes. (3) Local ancestry runs 1, 2, ..., 49, 50. (4) Combined local ancestry calls. (5) Collapsed bed files, ancestry karyograms, and plink files.]
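Steps 3 and 4 of the strategy above (split into 50 sets that each include the reference samples, then aggregate the calls) can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: the function names, the round-robin assignment, the sample IDs, and the placeholder "calls" are all assumptions, and the sample count simply reuses the N AA total from the data overview.

```python
def split_into_batches(admixed_ids, reference_ids, n_batches=50):
    """Split admixed samples into n_batches sets; each set carries the full reference panel."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    batches = [[] for _ in range(n_batches)]
    for index, sample_id in enumerate(admixed_ids):
        # Round-robin assignment is an illustrative choice, not a rule stated on the slide.
        batches[index % n_batches].append(sample_id)
    return [{"reference": list(reference_ids), "admixed": batch} for batch in batches]


def aggregate_calls(per_run_calls):
    """Merge {sample_id: calls} dicts from each run, refusing duplicated samples."""
    combined = {}
    for run_calls in per_run_calls:
        duplicated = combined.keys() & run_calls.keys()
        if duplicated:
            raise ValueError(f"samples called in more than one run: {sorted(duplicated)}")
        combined.update(run_calls)
    return combined


# Hypothetical IDs: 8,832 admixed samples plus placeholder reference IDs.
admixed = [f"AA_{i:04d}" for i in range(8832)]
reference = ["YRI_ref", "CEU_ref"]
runs = split_into_batches(admixed, reference)

# Stand-in for a local ancestry caller: one placeholder string per admixed sample.
fake_calls = [{s: "tracts-for-" + s for s in run["admixed"]} for run in runs]
combined = aggregate_calls(fake_calls)
print(len(runs), len(combined))
```

Carrying the same reference panel in every run keeps the ancestry labels comparable across runs, which is what makes the final merge a plain union.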

Page 5

Heritability estimates

Estimate | Kinship matrix | h2 | SE | N

$h^2_g$ | REAP | 0.018 | 0.046 | 7548

$h^2_g$ | GCTA GRM | 0.02 | 0.048 | 7248

$h^2_\gamma$ | local ancestry GRM | ? | ? |

Zaitlen, N., et al. (2014). Nat. Genet. 46, 1356–1362.

$h^2_\gamma$ = proportion of phenotypic variation described by variation in local ancestry

$\sigma^2_\gamma$ = phenotypic variation explained by variation in local ancestry

$\sigma^2_e$ = residual phenotypic variance

$h^2_\gamma = \dfrac{\sigma^2_\gamma}{\sigma^2_\gamma + \sigma^2_e}$

$F_{STC}$ = weighted allele frequency differences between ancestral populations at causal loci

$\theta$ = genome-wide ancestry proportions

$h^2_\gamma = 2 F_{STC}\, \theta (1 - \theta)\, h^2$
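The last relation on this slide can be evaluated in either direction, as in the short sketch below. The function names and the input values (F_STC = 0.1, θ = 0.8, h2 = 0.5) are hypothetical and chosen only to show the arithmetic; they are not estimates from this cohort.

```python
def h2_gamma_from_h2(h2, fst_c, theta):
    """h2_gamma = 2 * F_STC * theta * (1 - theta) * h2."""
    return 2.0 * fst_c * theta * (1.0 - theta) * h2


def h2_from_h2_gamma(h2_gamma, fst_c, theta):
    """Invert the relation to recover h2 from a local-ancestry estimate."""
    scale = 2.0 * fst_c * theta * (1.0 - theta)
    if scale <= 0.0:
        raise ValueError("need F_STC > 0 and 0 < theta < 1 to rescale")
    return h2_gamma / scale


# Hypothetical inputs, for illustration only.
gamma = h2_gamma_from_h2(h2=0.5, fst_c=0.1, theta=0.8)
recovered = h2_from_h2_gamma(gamma, fst_c=0.1, theta=0.8)
print(gamma, recovered)
```

Because the scale factor 2 F_STC θ(1 − θ) is small, a small standard error on the local-ancestry estimate becomes a much larger one on h2 after rescaling.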

Page 6

1000 Genomes phase 3 populations

Auton, A., et al. (2015). Nature 526, 68–74.

Page 7

Substantial global genetic diversity in 1000 Genomes

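[Figure: ancestry clustering at K=5, K=6, and K=7 across populations CEU, GBR, IBS, TSI, FIN, KHV, CHS, CHB, JPT, CDX, MSL, YRI, ESN, LWK, GWD, GIH, PJL, ITU, BEB, STU, ACB, PUR, CLM, MXL, PEL, ASW. Group labels: Europeans, East Asians, Africans, South Asians, Admixed Americas.]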

Page 8

Varying admixture proportions across populations in the Americas

[Figure: three panels of admixture proportions on a 0.0 to 1.0 axis. Reference panel: NAT, CEU, YRI. African American: ACB, ASW. Hispanic/Latino: PUR, CLM, MXL, PEL.]

African Americans: ACB = African Caribbean in Barbados; ASW = African Ancestry in SW US

Hispanic/Latinos: CLM = Colombians; MXL = Mexicans; PUR = Puerto Ricans; PEL = Peruvians

NAT = Mao et al. (2007). AJHG. 80, 1171–1178.

Page 9

Admixed samples in the Americas

Page 10

Admixture tracts inform subcontinental-level ancestral populations

RFMix: Maples, B.K., et al (2013). AJHG. 93, 278–288.

HG01893 (Peruvian)

Page 11

Ancestry-specific PCA provides insight into subcontinental admixture origins

[Figure: ancestry-specific PCA, PC1 vs PC2. Reference: AFR, EUR, NAT. Admixed: ACB, ASW, CLM, MXL, PEL, PUR.]

ASPCA: Moreno-Estrada, A., et al. (2013). PLoS Genetics. 9, e1003925.

Page 12

African Americans have northern European tracts, Hispanics have southern European tracts

ASPCA: Moreno-Estrada, A., et al. (2013). PLoS Genetics. 9, e1003925.

[Figure: ancestry-specific PCA, PC1 vs −PC2. Reference: FIN, CEU, GBR, IBS, TSI. Admixed: ACB, ASW, CLM, MXL, PEL, PUR.]

Page 13

African Americans have African tracts closest to Nigerian reference panel

[Figure: ancestry-specific PCA, −PC1 vs PC2. Reference: ESN, GWD, LWK, MSL, YRI. Admixed: ACB, ASW. Cluster labels on the plot: GWD, LWK, MSL, YRI, ESN.]

ASPCA: Moreno-Estrada, A., et al. (2013). PLoS Genetics. 9, e1003925.

Page 14

Africans have more genetic variation than out-of-Africa populations

1000 Genomes Project Consortium. (2015). A global reference for human genetic variation. Nature 526, 68–74.

[Figure: panels by superpopulation: AFR, AMR, EAS, EUR, SAS.]

Page 15

Biased genetic discoveries

[Figure: ancestry composition of two groups. Global population: African, Latino, East Asian, Middle Eastern, European, Oceanic, South Asian. PGC GWAS (SCZ, BIP, MDD, ADHD): East Asian, European.]

Page 16

Europeans (and Hispanic/Latinos) are overrepresented in disease databases

1000 Genomes Project Consortium. (2015). A global reference for human genetic variation. Nature 526, 68–74.

Page 17

Computing polygenic risk scores from summary statistics

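• LD clumping for all variants with MAF ≥ 0.01 (P+T in LDpred paper):

  • Apply p-value threshold (p = 0.01)

  • Thin for LD within window (R2 = 0.5, window = 250 kb)

• Polygenic risk score, where $g_i$ is the genotype and $\beta_i$ the effect size at variant $i$:

$X = \sum_{i=1}^{m} g_i \beta_i$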
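The thresholding, thinning, and scoring steps on this slide can be sketched in plain Python as below. This is a greedy illustration under stated assumptions, not the implementation of any particular tool: it assumes a single chromosome, uses the squared Pearson correlation of dosages as R2, and all function names and toy data are hypothetical.

```python
from statistics import mean


def r_squared(a, b):
    """Squared Pearson correlation between two dosage vectors."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov * cov / (var_a * var_b)


def minor_allele_freq(dosages):
    freq = sum(dosages) / (2 * len(dosages))
    return min(freq, 1 - freq)


def prune_and_threshold(variants, dosages, p_max=0.01, r2_max=0.5,
                        window_bp=250_000, maf_min=0.01):
    """Keep the most significant variants that are not in LD with one already kept."""
    candidates = [i for i, v in enumerate(variants)
                  if v["p"] <= p_max and minor_allele_freq(dosages[i]) >= maf_min]
    candidates.sort(key=lambda i: variants[i]["p"])
    kept = []
    for i in candidates:
        in_ld = any(abs(variants[i]["pos"] - variants[j]["pos"]) <= window_bp
                    and r_squared(dosages[i], dosages[j]) >= r2_max
                    for j in kept)
        if not in_ld:
            kept.append(i)
    return sorted(kept)


def polygenic_score(person_dosages, variants, kept):
    """X = sum over kept variants of g_i * beta_i."""
    return sum(person_dosages[i] * variants[i]["beta"] for i in kept)


# Toy data: four variants (rows of dosages) by six individuals (columns).
variants = [
    {"pos": 1_000, "p": 1e-5, "beta": 0.2},
    {"pos": 2_000, "p": 1e-3, "beta": 0.1},    # perfect LD with the first variant
    {"pos": 900_000, "p": 5e-3, "beta": -0.3},
    {"pos": 950_000, "p": 0.2, "beta": 0.4},   # fails the p-value threshold
]
dosages = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 0, 2],
    [1, 1, 0, 2, 0, 1],
]
kept = prune_and_threshold(variants, dosages)
first_person = [row[0] for row in dosages]
score = polygenic_score(first_person, variants, kept)
print(kept, score)
```

Sorting candidates by p-value before thinning means that, within any LD block, the most significant variant is the one that survives.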

Page 18

Polygenic risk score for height reflects adaptive event in Europeans… and bias

Wood, A.R., et al. (2014). Nature Genetics 46, 1173–1186.

[Figure: density of polygenic risk score (x-axis, 0.0e+00 to 1.0e−03) by region (N. Europe, S. Europe). Panel title: European height score.]

Page 19

Polygenic risk score for height reflects adaptive event in Europeans… and bias

Wood, A.R., et al. (2014). Nature Genetics 46, 1173–1186.

[Figure: density of polygenic risk score (x-axis, 0.0e+00 to 1.0e−03) by region (N. Europe, S. Europe). Panel title: European height score.]

[Figure: density of polygenic score (x-axis, 0.0e+00 to 1.0e−03) by superpopulation (AFR, AMR, EAS, EUR, SAS). Panel title: Global height score.]

Page 20

Polygenic risk score for Type II diabetes highlights role of demography

European: Gaulton, K.J., et al. (2015). Nat. Genet. 47, 1415–1425.

Multi-ethnic: Mahajan, A., et al. (2014). Nat. Genet. 46, 234–244.

[Figure: density of polygenic score (x-axis, 0.54 to 0.60) by superpopulation (AFR, AMR, EAS, EUR, SAS). Panel title: Global T2D (Multi-ethnic) score.]

[Figure: density of polygenic score (x-axis, 0.50 to 0.65) by superpopulation (AFR, AMR, EAS, EUR, SAS). Panel title: Global T2D (EUR) score.]

Page 21

Coalescent model for simulation framework

Demographic model: Gravel, S., et al. (2011). Proc. Natl. Acad. Sci. U. S. A. 108, 11983–11988.

msprime: Kelleher, J., Etheridge, A.M., and McVean, G. (2015). PLoS Comput Biol 1–22.

Page 22

Simulation steps

• Simulate chr20 genotypes (μ = 2e-8 mutations/(bp·generation)) with the HapMap recombination map for 200k each of Africans, East Asians, and Europeans

• Assign “true” causal effect sizes to m evenly spaced variants as: $\beta \sim N\!\left(0, \dfrac{h^2}{m}\right)$

• As before, define X as: $X = \sum_{i=1}^{m} g_i \beta_i$

• Normalize: $Z_X = \dfrac{X - \mu_X}{\sigma_X}$

• Compute true PRS as (such that total variance is $h^2$): $G = \sqrt{h^2} \cdot Z_X$
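The effect-size, normalization, and true-PRS steps above can be sketched with the standard library alone. The genotypes here are independent random dosages, which is a stand-in assumption for the coalescent simulation described in the talk; the function name, sample sizes, and allele frequency are likewise hypothetical.

```python
import math
import random
from statistics import mean, pstdev, pvariance


def simulate_true_prs(genotypes, h2, rng):
    """Return (betas, G) for a list of per-individual genotype rows."""
    m = len(genotypes[0])
    # beta ~ N(0, h2/m): random.gauss takes a standard deviation, hence the sqrt.
    betas = [rng.gauss(0.0, math.sqrt(h2 / m)) for _ in range(m)]
    # X = sum_i g_i * beta_i for each individual.
    x = [sum(g * b for g, b in zip(row, betas)) for row in genotypes]
    # Z_X = (X - mu_X) / sigma_X
    mu_x, sigma_x = mean(x), pstdev(x)
    z_x = [(value - mu_x) / sigma_x for value in x]
    # G = sqrt(h2) * Z_X, so the variance of G equals h2.
    return betas, [math.sqrt(h2) * z for z in z_x]


rng = random.Random(1)
# Toy genotypes: 500 individuals x 50 variants, each dosage the sum of two
# Bernoulli(0.3) alleles (not a coalescent simulation).
toy_genotypes = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(50)]
                 for _ in range(500)]
betas, true_prs = simulate_true_prs(toy_genotypes, h2=0.67, rng=rng)
print(round(pvariance(true_prs), 6))
```

Standardizing X before scaling fixes the variance of G at h2 regardless of allele frequencies or the particular draw of effect sizes.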

Page 23

Simulation steps

• Compute the total liability for each individual (epsilon is standard normal noise): $T = \sqrt{h^2} \cdot Z_X + \sqrt{1 - h^2} \cdot Z_\epsilon$, such that $h^2 = \dfrac{\sigma^2_g}{\sigma^2_g + \sigma^2_\epsilon}$

• Assuming a 5% prevalence, assign 10,000 European individuals at the most extreme end of the liability threshold “case” status. Randomly assign different 10,000 European individuals “control” status.

• Run a simulated GWAS, computing Fisher’s exact test for all sites with MAF ≥ 0.01.

• Clump SNPs into LD blocks for all sites with p ≤ 1e-2, R2 ≥ 0.5 in Europeans, and window size of 250kb.

• Compute inferred PRS from summary stats and ρ with true PRS

• Evaluate over 50 simulations for m = 200, 500, 1000 and h2 = 0.33, 0.50, 0.67
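The liability and case/control steps, plus a Fisher's exact test on one 2x2 table, can be sketched as follows. This is a toy illustration: the standardized scores are random draws rather than simulated PRS values, the cohort is 2,000 individuals instead of 200k, the allele counts are made up, and the function names are hypothetical. The two-sided p-value sums the probabilities of all tables no more likely than the observed one, which is one common convention.

```python
import math
import random
from statistics import mean, pstdev


def assign_case_control(z_x, h2, prevalence, n_controls, rng):
    """T = sqrt(h2)*Z_X + sqrt(1-h2)*Z_eps; the top `prevalence` fraction become cases."""
    eps = [rng.gauss(0.0, 1.0) for _ in z_x]
    mu_e, sigma_e = mean(eps), pstdev(eps)
    z_eps = [(e - mu_e) / sigma_e for e in eps]
    liability = [math.sqrt(h2) * zx + math.sqrt(1.0 - h2) * ze
                 for zx, ze in zip(z_x, z_eps)]
    ranked = sorted(range(len(liability)), key=liability.__getitem__, reverse=True)
    n_cases = round(prevalence * len(liability))
    cases = ranked[:n_cases]
    # Controls are drawn at random from individuals who are not cases.
    controls = rng.sample(ranked[n_cases:], n_controls)
    return liability, cases, controls


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    p_value = sum(prob(x) for x in range(low, high + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)


rng = random.Random(7)
toy_z_x = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # stand-in for standardized scores
liability, cases, controls = assign_case_control(
    toy_z_x, h2=0.67, prevalence=0.05, n_controls=100, rng=rng)

# Hypothetical allele counts (alt, ref) in cases and controls at one site.
p_value = fisher_exact_two_sided(60, 140, 40, 160)
print(len(cases), len(controls), p_value)
```

With 200k simulated individuals, a 5% prevalence gives the 10,000 cases quoted on the slide.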

Page 24

True vs inferred PRS with same causal variants, different effect sizes are inconsistent

h2=0.67, m=1000

[Figure: panels G, H, and I.]

Page 25

Best performance in European study population

h2=0.67, m=1000, 50 replicates

[Figure: Pearson's correlation (y-axis, 0.00 to 1.00) by number of causal variants (200, 500, 1000) for superpopulations AFR, EAS, EUR, and ALL. Panel title: h2 = 0.67.]

Page 26

http://biorxiv.org/content/early/2016/08/23/070797