
Page 1: Ben Domingue Institute of Behavioral Science …gero.usc.edu/CBPH/files/4_29_15_PAA/Genome-wide...Ben Domingue Institute of Behavioral Science University of Colorado Boulder ben.domingue@gmail.com

Genome-wide estimates of heritability

Ben Domingue
Institute of Behavioral Science
University of Colorado Boulder
ben.domingue@gmail.com


- Genes → behaviors & outcomes of interest.
- Genome-wide data: FHS, HRS, AddHealth, etc.
- Hard to get a handle on genotype/phenotype connection.
- GWAS results help, but have limited availability.
- Even when available, polygenic scores have limited predictive value.

What else can we do?


GCTA

Genome-wide Complex Trait Analysis (GCTA) tells us about heritability.

- GCTA estimates heritability without knowledge of causal variants.
- Instead uses “genetic similarity” (similar to logic of twin studies).


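Method

1. Estimate genome-wide similarity:

   $$A_{jk} = \frac{1}{N} \sum_i \frac{(x_{ij} - 2p_i)(x_{ik} - 2p_i)}{2p_i(1 - p_i)}$$

2. Then estimate mixed model:

   $$y = X\beta + g + \varepsilon$$

   where $g \sim \mathrm{MVN}[0, \sigma_g^2 A]$.

3. Heritability:

   $$\frac{\hat{\sigma}_g^2}{\hat{\sigma}_g^2 + \hat{\sigma}_\varepsilon^2}$$

Complicated model & not the DGP.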
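
The three steps can be sketched on simulated data. This is a minimal illustration, not GCTA itself: GCTA fits the mixed model by REML, whereas `estimate_h2` below is a cruder profile-likelihood grid search with no covariates. The function names, the simulated genotypes, and the true heritability of 0.5 are all assumptions made for the example. Genotypes are stored with one row per person and one column per SNP, so the slide's $x_{ij}$ (SNP $i$, person $j$) is `genotypes[j, i]`.

```python
import numpy as np

def standardize(genotypes):
    """Center and scale 0/1/2 allele counts SNP by SNP (rows: people, columns: SNPs)."""
    x = np.asarray(genotypes, dtype=float)
    p = x.mean(axis=0) / 2.0                      # sample allele frequency p_i
    keep = (p > 0.0) & (p < 1.0)                  # drop monomorphic SNPs (zero variance)
    x, p = x[:, keep], p[keep]
    return (x - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

def estimate_h2(y, A, grid=np.linspace(0.0, 0.99, 100)):
    """Grid search over h2 in y ~ N(mu, s2 * (h2 * A + (1 - h2) * I))."""
    evals, evecs = np.linalg.eigh(A)
    evals = np.clip(evals, 0.0, None)             # guard against tiny negative eigenvalues
    y_rot = evecs.T @ (y - y.mean())              # rotate so the covariance is diagonal
    best_h2, best_ll = grid[0], -np.inf
    for h2 in grid:
        d = h2 * evals + (1.0 - h2)               # per-direction variance, up to the scale s2
        s2 = np.mean(y_rot ** 2 / d)              # profile out the overall scale
        ll = -0.5 * (len(y) * np.log(s2) + np.sum(np.log(d)))
        if ll > best_ll:
            best_h2, best_ll = h2, ll
    return best_h2

rng = np.random.default_rng(0)
n_people, n_snps = 500, 500                       # hypothetical sizes, chosen to run quickly
freqs = rng.uniform(0.1, 0.9, size=n_snps)
X = rng.binomial(2, freqs, size=(n_people, n_snps))

# Step 1: A_jk = (1/N) * sum_i z_ij * z_ik
Z = standardize(X)
A = Z @ Z.T / Z.shape[1]

# Step 2: simulate from the mixed model with sigma_g^2 = sigma_eps^2 = 0.5.
# g = Z u / sqrt(N) has covariance sigma_g^2 * A when u ~ N(0, sigma_g^2 I).
u = rng.normal(0.0, np.sqrt(0.5), size=Z.shape[1])
g = Z @ u / np.sqrt(Z.shape[1])
y = g + rng.normal(0.0, np.sqrt(0.5), size=n_people)

# Step 3: heritability estimate
h2_hat = estimate_h2(y, A)
print(f"estimated heritability: {h2_hat:.2f}")
```

Here every simulated SNP is causal and the fitted model matches how the data were generated, which is the favorable case; the questions that follow are about what happens when it does not.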


Sensitivity to genetic architecture?

- Robust to # of causal variants.

- Sensitive to LD.


Sensitivity to environment?

Could genetic similarity just be a proxy for environmental similarity?


My goal: Offer intuition and basic guidance on when GCTA estimates may be reliable.


Data

HRS: 4950 non-Hispanic whites, ≈ 1.5M autosomal SNPs.

- Height: 0.40


Q1: Gen sim as function of SNPs

SNP subset      Correlation
50% sample      0.98
30% sample      0.95
10% sample      0.83
r² = 0.01       0.57
r² = 0.2        0.75
r² = 0.5        0.88
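
One way to read the random-sample rows is as the correlation between pairwise similarities computed from a subset of SNPs and those computed from all SNPs. The sketch below runs that comparison on simulated genotypes, under that reading. The function name and sample sizes are assumptions, and because the simulated SNPs are independent and the simulated people unrelated, the printed values should not be expected to match the HRS figures in the table. The r² rows would need real LD structure and a pruning step, which is not attempted here.

```python
import numpy as np

def offdiag_similarity(genotypes):
    """Off-diagonal entries of the genetic similarity matrix, as a flat vector."""
    x = np.asarray(genotypes, dtype=float)
    p = x.mean(axis=0) / 2.0                      # sample allele frequencies
    z = (x - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    A = z @ z.T / z.shape[1]
    return A[np.triu_indices_from(A, k=1)]        # one value per pair of people

rng = np.random.default_rng(1)
n_people, n_snps = 200, 5000                      # hypothetical sizes
freqs = rng.uniform(0.1, 0.9, size=n_snps)
X = rng.binomial(2, freqs, size=(n_people, n_snps))

full = offdiag_similarity(X)
correlations = {}
for fraction in (0.5, 0.3, 0.1):
    # Random subset of SNP columns, drawn without replacement
    cols = rng.choice(n_snps, size=int(fraction * n_snps), replace=False)
    subset = offdiag_similarity(X[:, cols])
    correlations[fraction] = np.corrcoef(full, subset)[0, 1]
    print(f"{int(fraction * 100)}% sample: r = {correlations[fraction]:.2f}")
```

With independent SNPs the correlation falls off roughly as the square root of the sampled fraction, so it drops much faster than in the table above.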


Q2: GWAS (height) variants


Q3: Heteroskedasticity

Heteroskedasticity is a common problem.

- weight on height.
- own education on paternal education.

Of concern here since we’re estimating variance components.

- Simulate outcome based on GCTA model.
- $y = 0.5 \cdot \text{height} + g + \varepsilon$.
- $\varepsilon_i$ has variance $\exp(\alpha \cdot \text{height} \cdot \sigma_\varepsilon^2)$, where $\alpha$ controls level of heteroskedasticity and $\sigma_\varepsilon^2$ controls heritability.

Examine recovery of heritability, but def’n no longer simple.
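
The simulation design can be sketched as follows. The residual variance is coded exactly as the formula appears in this transcript. The standardized height variable, the simulated genotypes, the sample sizes, and the helper name `simulate_outcome` are assumptions for illustration; the slides used HRS data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, n_snps = 1000, 300                      # hypothetical sizes

# Hypothetical inputs: standardized height and standardized genotypes.
height = rng.standard_normal(n_people)
freqs = rng.uniform(0.1, 0.9, size=n_snps)
X = rng.binomial(2, freqs, size=(n_people, n_snps)).astype(float)
p = X.mean(axis=0) / 2.0
Z = (X - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

def simulate_outcome(alpha, sigma2_eps, sigma2_g=1.0):
    """Draw y = 0.5 * height + g + eps, with Var(eps_i) = exp(alpha * height_i * sigma2_eps)."""
    u = rng.normal(0.0, np.sqrt(sigma2_g), size=n_snps)
    g = Z @ u / np.sqrt(n_snps)                   # Cov(g) = sigma2_g * A
    eps_var = np.exp(alpha * height * sigma2_eps) # one residual variance per person
    eps = rng.normal(0.0, np.sqrt(eps_var))
    return 0.5 * height + g + eps, eps_var

y_flat, var_flat = simulate_outcome(alpha=0.0, sigma2_eps=1.0)
y_steep, var_steep = simulate_outcome(alpha=1.0, sigma2_eps=1.0)
print(f"alpha = 0: residual variance from {var_flat.min():.2f} to {var_flat.max():.2f}")
print(f"alpha = 1: residual variance from {var_steep.min():.2f} to {var_steep.max():.2f}")
```

With the formula taken literally, `alpha = 0` gives every person a residual variance of 1 whatever the value of `sigma2_eps`. Once `alpha` is nonzero the residual variance differs across people, so there is no single ratio of genetic to total variance to recover.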


Q3: Heteroskedasticity


Q4: Environmental Moderation

Heritability not constant: What are implications for GCTA?

- Standard GCTA: $g \sim \mathrm{MVN}[0, \sigma_g^2 A]$.

- We simulate data using $g \sim \mathrm{MVN}[0, A']$ where $(i, j)$-th entry of $A'$ is $h_i h_j A_{ij}$.
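
A short sketch of building $A'$ and drawing $g$ from it. The slides do not say how $h_i$ was chosen, so the two-group environment and the values 0.4 and 0.9 below are purely hypothetical, as are the simulated genotypes.

```python
import numpy as np

rng = np.random.default_rng(3)
n_people, n_snps = 300, 1000                      # hypothetical sizes

freqs = rng.uniform(0.1, 0.9, size=n_snps)
X = rng.binomial(2, freqs, size=(n_people, n_snps)).astype(float)
p = X.mean(axis=0) / 2.0
Z = (X - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
A = Z @ Z.T / n_snps                              # standard genetic similarity matrix

# Hypothetical environment: two groups with different genetic scaling h_i.
environment = rng.integers(0, 2, size=n_people)
h = np.where(environment == 1, 0.9, 0.4)

A_prime = np.outer(h, h) * A                      # (i, j) entry is h_i * h_j * A_ij

# Draw g ~ MVN[0, A'] through an eigendecomposition, which tolerates a
# singular A' (a Cholesky factorization may fail on one).
evals, evecs = np.linalg.eigh(A_prime)
evals = np.clip(evals, 0.0, None)                 # remove tiny negative eigenvalues
g = evecs @ (np.sqrt(evals) * rng.standard_normal(n_people))

for label in (0, 1):
    print(f"environment {label}: variance of g = {g[environment == label].var():.2f}")
```

Since $\mathrm{Var}(g_i) = h_i^2 A_{ii}$, the group with $h_i = 0.9$ carries roughly five times the genetic variance of the group with $h_i = 0.4$, which a single $\sigma_g^2$ cannot represent.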


Q4: Environmental Moderation

What if we ignore environment?


Q4: Environmental Moderation

What if we allow for environmental variation?


- LD is important consideration (aside: I’m skeptical about using KING or REAP estimates).

- Heteroskedasticity leads to inflation of $h^2$ estimates.

- Environmental differences are likely to be problematic (and yet may be rampant?).

In closing: GCTA is like a table saw.
