MTAG: MULTI-TRAIT ANALYSIS OF GWAS

Aysu Okbay

VU Amsterdam


MOTIVATION

• For polygenic traits, GWAS requires large N

• Improving prediction requires even more

• In many cases, GWAS of other (genetically) correlated traits are available

• GOAL: Boost power by pooling GWAS results from multiple related traits


MULTI-TRAIT ANALYSIS OF GWAS

• Joint analyses can boost statistical power, but often impractical in GWAS:

– Some require individual-level data.

– Some can be applied to summary statistics, but only if there is zero sample overlap.

– Computationally burdensome.

• MTAG effectively addresses these challenges.


MTAG THEORETICAL FRAMEWORK

There are $T$ traits. Let $\beta_j$ be the vector of marginal effects for SNP $j$. From GWAS, we estimate

$$\hat{\beta}_j = \beta_j + e_j, \qquad e_j \sim N(0, \Sigma_j)$$

where $\Sigma_j$ is the variance-covariance matrix of estimation error.

Assume $\beta_j$ are random effects with some correlation between traits, identically distributed across $j$:

$$E[\beta_j] = 0, \qquad \mathrm{Var}[\beta_j] = \Omega$$


NON-GENETIC VARIATION IN $\hat{\beta}_j$

• Non-genetic variation includes sampling variation and bias

• When are the off-diagonal elements of $\Sigma_j$ non-zero?

• $\Sigma_j$ can be estimated with the intercept of LD score regression
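As a reminder of where those intercepts come from, the standard LD score regression equations (univariate and cross-trait) can be written as below. This is a sketch from the LD score regression literature rather than from these slides: $\ell_j$ is the LD score of SNP $j$, $M$ the number of SNPs, $N_1, N_2$ the sample sizes, $N_s$ the number of overlapping individuals, $\rho$ the phenotypic correlation among them, $\rho_g$ the genetic covariance, and $a$ a term capturing confounding bias.

```latex
\begin{align}
E[\chi^2_{j}] &= \frac{N h^2}{M}\,\ell_j + N a + 1 \\
E[z_{1j}\, z_{2j}] &= \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j + \frac{\rho\, N_s}{\sqrt{N_1 N_2}}
\end{align}
```

The cross-trait intercept is the second term of the second equation, which suggests that the off-diagonal elements of $\Sigma_j$ are non-zero when the GWAS samples overlap (or, more generally, when their biases are correlated).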


GENETIC VARIATION IN $\hat{\beta}_j$

• Related to heritability and genetic correlation

• Heritability: diagonal

• Genetic correlation: off-diagonal

• Key assumption of MTAG: $\Omega$ is homogeneous across SNPs!

• Using method of moments: $\hat{\Omega} = \mathrm{Var}(\hat{\beta}_j) - \hat{\Sigma}$
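The method-of-moments step can be illustrated with a small simulation. This is a minimal sketch, not the mtag implementation: it assumes two traits, a single $\Sigma$ shared by all SNPs (in real data $\Sigma_j$ varies with sample size), and made-up values for $\Omega$ and $\Sigma$. The helper names are hypothetical.

```python
import math
import random

def draw_mvn2(cov, rng):
    """Draw one mean-zero bivariate normal vector using a 2x2 Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    return [l11 * z1, l21 * z1 + l22 * z2]

def sample_cov(rows):
    """Sample covariance matrix of a list of equal-length vectors."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(k)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]

def omega_hat(beta_hats, sigma_hat):
    """Method of moments: Var(beta_hat) minus the estimation-error covariance."""
    v = sample_cov(beta_hats)
    k = len(v)
    return [[v[a][b] - sigma_hat[a][b] for b in range(k)] for a in range(k)]

# Made-up "true" values, chosen only for illustration.
rng = random.Random(1)
omega = [[1.0, 0.6], [0.6, 1.0]]   # genetic covariance of true effects
sigma = [[0.5, 0.1], [0.1, 0.5]]   # covariance of estimation error

beta_hats = []
for _ in range(20000):
    b = draw_mvn2(omega, rng)      # true effects
    e = draw_mvn2(sigma, rng)      # estimation error
    beta_hats.append([b[0] + e[0], b[1] + e[1]])

estimate = omega_hat(beta_hats, sigma)
print(estimate)
```

With enough SNPs the printed matrix should be close to the `omega` used to simulate the data, because the variance of the GWAS estimates is the sum of the genetic and non-genetic parts.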


MTAG ESTIMATING EQUATIONS

• MTAG is a generalized method of moments (GMM) estimator

• Imagine we regressed the GWAS estimates for trait 𝑠𝑠 onto the true marginal effect size for trait 𝑡𝑡 (and a constant)

• The first-order condition of the OLS minimization is

$$E\left[\hat{\beta}_{j,s} - \frac{\omega_{t,s}}{\omega_{t,t}}\,\beta_{j,t}\right] = 0$$

where $\omega_{t,s}$ is the $(t,s)$-th element of $\Omega$.

• $T$ such moment conditions, 1 parameter → GMM
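One way to see what solving these $T$ moment conditions for the single parameter $\beta_{j,t}$ looks like is a generalized least squares (GLS) sketch. It rests on an assumption not stated on the slide: that the efficient weight matrix is the inverse of $\Omega - \omega_t\omega_t'/\omega_{t,t} + \Sigma_j$, the covariance of the residuals in the regressions above. The function names are hypothetical, and this is not the mtag code itself.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def mtag_gls(beta_hat, omega, sigma, t):
    """GLS estimate of the effect on trait t from all T GWAS estimates of one SNP.

    Returns (estimate, standard error).
    """
    T = len(beta_hat)
    # Slopes omega_{t,s} / omega_{t,t} from the moment conditions.
    w = [omega[t][s] / omega[t][t] for s in range(T)]
    # Residual covariance: genetic part not explained by trait t, plus estimation error.
    v = [[omega[a][b] - omega[t][a] * omega[t][b] / omega[t][t] + sigma[a][b]
          for b in range(T)] for a in range(T)]
    vinv_w = solve(v, w)
    denom = sum(w[s] * vinv_w[s] for s in range(T))
    est = sum(vinv_w[s] * beta_hat[s] for s in range(T)) / denom
    return est, (1.0 / denom) ** 0.5

# Perfect genetic correlation, equal heritability, no overlap (made-up numbers):
est, se = mtag_gls([0.2, 0.1], [[1.0, 1.0], [1.0, 1.0]],
                   [[0.04, 0.0], [0.0, 0.01]], t=0)
print(est, se)
```

In the usage line, the weights collapse to inverse-variance weights, so the estimate is the ordinary meta-analysis average of the two inputs. If the traits are genetically uncorrelated and the errors are uncorrelated, the second trait gets zero weight and the estimate equals the original GWAS estimate.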


MTAG OVERVIEW

• Builds on LD score regression framework.

• Assigns each coefficient estimate a weight that depends (in intuitive ways) on two sources of correlation between GWAS estimates: genetic correlation and correlated estimation error.

• Allows for all sources of estimation error (not just sampling variation)

• Outputs a set of association test statistics that are trait-specific.

• Estimation is not computationally burdensome.


SPECIAL CASES

• No sample overlap
– Off-diagonal elements of $\Sigma_j$ are zero
– Assumes uncorrelated biases

• Perfect genetic correlation
– $\omega_{t,s} = \sqrt{\omega_{t,t}\,\omega_{s,s}}$ for all $t, s$

• Equal heritabilities
– $\omega_{t,t} = \omega_{s,s}$ for all $t, s$

All of these special cases together: standard meta-analysis with LD score intercept correction
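To make the last statement concrete: when all three conditions hold, a weighted combination of the trait-specific estimates reduces to the familiar inverse-variance-weighted average. The expression below is a sketch under that assumption, where $\Sigma_{j,ss}$ denotes the $s$-th diagonal element of $\Sigma_j$ (the squared standard error after the LD score intercept correction).

```latex
\begin{align}
\hat{\beta}_{j}^{\text{meta}}
  &= \frac{\sum_{s=1}^{T} \hat{\beta}_{j,s} / \Sigma_{j,ss}}{\sum_{s=1}^{T} 1 / \Sigma_{j,ss}},
&
\mathrm{Var}\!\left(\hat{\beta}_{j}^{\text{meta}}\right)
  &= \left(\sum_{s=1}^{T} \frac{1}{\Sigma_{j,ss}}\right)^{-1}
\end{align}
```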


MTAG APPLICATION

Note. Okbay et al. (2016), Nat Genet, 48, 624-633.


COHORTS IN MTAG APPLICATION


SNP-based comparisons

                              DEP               NEUR              SWB
                          GWAS    MTAG      GWAS    MTAG      GWAS    MTAG
Lead SNPs (P < 5×10^-8)     32      74         9      66        13      60
Mean χ2                   1.44    1.60      1.28    1.56      1.31    1.57
Neff                        354,862           168,105           388,538


HOW MUCH WOULD WE HAVE HAD TO BOOST N IN EACH UNIVARIATE GWAS TO MATCH OBSERVED MTAG GAINS?


GWAS-EQUIVALENT SAMPLE SIZE FOR MTAG

• DEP: 37% increase (N = 354K to 479K)

• NEUR: 96% increase (N = 168K to 330K)

• SWB: 85% increase (N = 388K to 718K)
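A common way to express such a gain is to scale the original sample size by the ratio of the excess mean χ2 statistics, (mean χ2 MTAG − 1) / (mean χ2 GWAS − 1). The sketch below assumes that formula and plugs in the rounded values from the SNP-based comparison table, so its results only approximate the figures quoted above, which were presumably computed from unrounded statistics. The function name is hypothetical.

```python
def gwas_equivalent_n(n_gwas, mean_chi2_gwas, mean_chi2_mtag):
    """Sample size a univariate GWAS would need to reach the MTAG mean chi-square.

    Assumes the excess mean chi-square (mean chi2 - 1) grows linearly with N.
    """
    return n_gwas * (mean_chi2_mtag - 1.0) / (mean_chi2_gwas - 1.0)

# Rounded inputs taken from the comparison table: (N, mean chi2 GWAS, mean chi2 MTAG).
traits = {
    "DEP": (354862, 1.44, 1.60),
    "NEUR": (168105, 1.28, 1.56),
    "SWB": (388538, 1.31, 1.57),
}

for name, (n, chi2_gwas, chi2_mtag) in traits.items():
    n_equiv = gwas_equivalent_n(n, chi2_gwas, chi2_mtag)
    gain = 100.0 * (n_equiv / n - 1.0)
    print(f"{name}: N = {n:,} -> about {n_equiv:,.0f} ({gain:.0f}% increase)")
```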


ARE THESE GAINS “REAL”?


PREDICTION ACCURACY IN HRS


WHAT’S THE BAD NEWS?

• Model misspecification → potentially substantial bias

• Most problematic are “bad SNPs”
– SNPs that are null for the primary trait, but non-null for some secondary trait
– Inflated type-I error rate, false discovery rate

• Prediction of other traits and biological annotation may be biased.


MAXIMUM FALSE DISCOVERY RATE

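• Based on multivariate spike-and-slab distribution

• Each SNP may be associated with all, some, or no traits

• Potentially leads to “bad SNPs,” which are associated with secondary traits but not the primary trait

$$
\beta_j \sim
\begin{cases}
N\!\left(0, \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\right) & \text{with probability } \pi_{FF} \\[6pt]
N\!\left(0, \begin{pmatrix} \omega_{11} & 0 \\ 0 & 0 \end{pmatrix}\right) & \text{with probability } \pi_{TF} \\[6pt]
N\!\left(0, \begin{pmatrix} 0 & 0 \\ 0 & \omega_{22} \end{pmatrix}\right) & \text{with probability } \pi_{FT} \\[6pt]
N\!\left(0, \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{12} & \omega_{22} \end{pmatrix}\right) & \text{with probability } \pi_{TT}
\end{cases}
$$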
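To get a feel for this mixture, the sketch below draws SNP effects for two traits from the four components. The mixing probabilities and the values in `omega` are made up for illustration, and the function name is hypothetical. With trait 1 as the primary trait, the "FT" draws are the "bad SNPs": null for trait 1 but non-null for trait 2.

```python
import math
import random

def draw_effect(omega, probs, rng):
    """Draw (component label, (beta_1, beta_2)) from the two-trait spike-and-slab mixture."""
    labels = ["FF", "TF", "FT", "TT"]
    comp = rng.choices(labels, weights=[probs[k] for k in labels])[0]
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    if comp == "FF":   # null for both traits
        return comp, (0.0, 0.0)
    if comp == "TF":   # causal for trait 1 only
        return comp, (math.sqrt(omega[0][0]) * z1, 0.0)
    if comp == "FT":   # causal for trait 2 only
        return comp, (0.0, math.sqrt(omega[1][1]) * z2)
    # "TT": causal for both traits, correlated effects via a 2x2 Cholesky factor
    l11 = math.sqrt(omega[0][0])
    l21 = omega[0][1] / l11
    l22 = math.sqrt(omega[1][1] - l21 ** 2)
    return comp, (l11 * z1, l21 * z1 + l22 * z2)

# Hypothetical parameter values, for illustration only.
rng = random.Random(7)
omega = [[1.0, 0.7], [0.7, 1.0]]
probs = {"FF": 0.5, "TF": 0.1, "FT": 0.2, "TT": 0.2}

draws = [draw_effect(omega, probs, rng) for _ in range(5000)]
share_bad = sum(1 for comp, _ in draws if comp == "FT") / len(draws)
print(f"share of SNPs null for trait 1 but non-null for trait 2: {share_bad:.3f}")
```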


MAXIMUM FALSE DISCOVERY RATE

Maximize FDR over all feasible spike-and-slab distributions


RECOMMENDATIONS

• Replication!

• Choose settings with a low risk of a high false discovery rate (FDR)
– Genetic correlation between traits is high (>0.7) AND
– Mean χ2-statistic of primary trait is high (>1.7) OR higher than that of secondary traits

• Possible to run into problems even when above is satisfied
– Perform maxFDR calculations


SOFTWARE

Code is publicly available at:

https://github.com/omeed-maghzian/mtag


PRACTICAL


INTRO

• Begin by looking at the MTAG options by typing

mtag -h

• Copy the files into your working directory

mkdir MTAG_practical
cd MTAG_practical

cp /faculty/aysu/MTAG/SWB_Full.txt .
cp /faculty/aysu/MTAG/Neuroticism_Full.txt .
cp -r /faculty/aysu/MTAG/eur_w_ld_chr/ .


INPUT FILE FORMAT

• These are GWAS results on neuroticism and subjective well-being from Okbay et al. (2016), restricted to HapMap3 SNPs.

• Have a look at the data

head SWB_Full.txt


INPUT FILE FORMAT

• The following columns are necessary for MTAG to run (order not important):

– snpid (--snp_name)
– a1/a2 (--a1_name / --a2_name)
– freq (--eaf_name)
– z (--z_name)
– n (--n_name)

• The other columns are not directly used by mtag.py but are part of the munging procedure implemented via ldsc.
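Before running mtag it can help to confirm that each input file has the columns you intend to pass to these flags. The check below is a minimal sketch with a hypothetical helper name; it assumes a whitespace-delimited file with a single header line. The example header is inferred from the column-name flags used in this practical, not copied from the actual files.

```python
# Default column names expected when no --*_name flags are given (from the list above).
DEFAULT_REQUIRED = ["snpid", "a1", "a2", "freq", "z", "n"]

def missing_columns(header_line, required=DEFAULT_REQUIRED):
    """Return the required column names that do not appear in the header line."""
    present = set(header_line.split())
    return [name for name in required if name not in present]

# Hypothetical header resembling the practical's files.
header = "MarkerName CHR POS A1 A2 EAF Beta SE Pval N"
needed = ["MarkerName", "A1", "A2", "EAF", "Beta", "SE", "N"]
print(missing_columns(header, needed))   # columns named via the flags
print(missing_columns(header))           # default names would not match this header
```

To check a real file, pass its first line, for example `open("SWB_Full.txt").readline()`.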


RUNNING MTAG WITH THE DEFAULTS

Using mtag with the default options implements the following steps:

1. Read in the input GWAS summary statistics and filter the SNPs by MAF ≥ 0.01 and sample size N ≥ (2/3) * 90th percentile.

2. Merge the filtered GWAS summary statistics results together, taking the intersection of available SNPs.

3. Estimate the residual covariance matrix (Σ) via LD Score regression.

4. Estimate the genetic covariance matrix (Ω).

5. Perform MTAG and output results.
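Steps 1 and 2 can be sketched in a few lines. This is an illustration of the filtering logic only, not the mtag code: the helper names and the toy rows are hypothetical, and the percentile here uses linear interpolation, which may differ slightly from what mtag computes.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 100])."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def filter_sumstats(rows, maf_min=0.01, n_frac=2.0 / 3.0, n_pct=90):
    """Step 1: keep SNPs with MAF >= maf_min and N >= n_frac * (n_pct-th percentile of N)."""
    n_cut = n_frac * percentile([r["n"] for r in rows], n_pct)
    kept = {}
    for r in rows:
        maf = min(r["freq"], 1.0 - r["freq"])   # minor allele frequency
        if maf >= maf_min and r["n"] >= n_cut:
            kept[r["snpid"]] = r
    return kept

def shared_snps(*filtered):
    """Step 2: SNP ids present in every filtered set."""
    common = set(filtered[0])
    for f in filtered[1:]:
        common &= set(f)
    return sorted(common)

# Toy summary statistics (made-up values).
trait_a = [
    {"snpid": "rs1", "freq": 0.50, "n": 1000},
    {"snpid": "rs2", "freq": 0.005, "n": 1000},  # fails the MAF filter
    {"snpid": "rs3", "freq": 0.30, "n": 500},    # fails the N filter
    {"snpid": "rs4", "freq": 0.995, "n": 1000},  # minor allele is rare
]
trait_b = [
    {"snpid": "rs1", "freq": 0.40, "n": 2000},
    {"snpid": "rs3", "freq": 0.30, "n": 2000},
]

merged = shared_snps(filter_sumstats(trait_a), filter_sumstats(trait_b))
print(merged)
```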


RUNNING MTAG WITH THE DEFAULTS

mtag --sumstats Neuroticism_Full.txt,SWB_Full.txt \
    --snp_name MarkerName \
    --chr_name CHR \
    --bpos_name POS \
    --a1_name A1 \
    --a2_name A2 \
    --eaf_name EAF \
    --use_beta_se \
    --beta_name Beta \
    --se_name SE \
    --p_name Pval \
    --n_name N \
    --ld_ref_panel ./eur_w_ld_chr/ \
    --out ./NEUR_SWB \
    --stream_stdout


MTAG OUTPUT

Running mtag should have produced five files in your current directory:

• “NEUR_SWB.log” timestamps the different steps taken by mtag.py.

• “NEUR_SWB_sigma_hat.txt” stores the estimated residual covariance matrix.

• “NEUR_SWB_omega_hat.txt” stores the estimated genetic covariance matrix.

• “NEUR_SWB_trait_1.txt” and “NEUR_SWB_trait_2.txt” are tab-delimited results files corresponding to the MTAG-adjusted effect sizes and standard errors for the neuroticism and subjective well-being summary statistics, respectively.

Note that the files are numbered in the order they were presented in the list provided to the --sumstats flag.
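The estimated genetic covariance matrix can be turned into a genetic correlation by dividing the off-diagonal element by the square root of the product of the two diagonal elements. The sketch below assumes the omega_hat file is a plain whitespace-delimited numeric matrix with one row per line; that layout, the helper names, and the numbers in the example are assumptions, so check your own file before relying on it.

```python
import math

def parse_matrix(text):
    """Parse a whitespace-delimited numeric matrix, one row per line."""
    return [[float(x) for x in line.split()] for line in text.strip().splitlines()]

def genetic_correlation(omega, t, s):
    """Genetic correlation between traits t and s implied by a covariance matrix."""
    return omega[t][s] / math.sqrt(omega[t][t] * omega[s][s])

# Made-up matrix standing in for the contents of an omega_hat file.
example = """
1.0e-05 -6.0e-06
-6.0e-06 9.0e-06
"""
omega_example = parse_matrix(example)
rg = genetic_correlation(omega_example, 0, 1)
print(round(rg, 3))
```

For a real run, replace `example` with the text read from “NEUR_SWB_omega_hat.txt”.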


LOG FILE

Apart from providing timestamps for estimating the different matrices needed for MTAG, the log file also displays the calculated values of Omega and Sigma along with a summary output of the results.


RESULTS FILES

NEUR_SWB_trait_1.txt provides the MTAG-adjusted effect sizes for the neuroticism GWAS:

• The first eight columns are copied from the corresponding input file.

• mtag_beta and mtag_se provide the unstandardized weights and standard errors calculated by mtag, yielding the corresponding z-scores mtag_z and p-values mtag_pval.
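The relationship between these four columns can be reproduced by hand: the z-score is the effect divided by its standard error, and the p-value is the two-sided tail probability of a standard normal. A minimal sketch with a hypothetical function name and made-up input values:

```python
import math

def z_and_p(mtag_beta, mtag_se):
    """Return (z-score, two-sided p-value) for an effect estimate and its standard error."""
    z = mtag_beta / mtag_se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Made-up effect size and standard error.
z, p = z_and_p(0.0196, 0.01)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Comparing the result with the mtag_z and mtag_pval columns for a few rows is a quick sanity check that a results file was read correctly.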


SPECIAL CASES

• --no_overlap : Assumes no overlap between any of the cohorts in any pair of GWAS studies fed into mtag

• --perfect_gencov : Assumes the T summary statistics used in MTAG are GWAS estimates for traits that are perfectly correlated with one another

• --equal_h2 : Requires --perfect_gencov. Assumes all summary statistics files used in MTAG have the same heritability