lecture 15 regulatory variation and eqtls

59
Lecture 15 Regulatory variation and eQTLs Chris Cotsapas [email protected] rg 6.047/6.878/HST.507 Computational Biology: Genomes, Networks, Evolution

Upload: evangelina-zaro

Post on 01-Jan-2016

25 views

Category:

Documents


0 download

DESCRIPTION

6.047/6.878/HST.507 Computational Biology: Genomes, Networks, Evolution. Lecture 15 Regulatory variation and eQTLs. Chris Cotsapas [email protected]. Module 4: Population / Evolution / Phylogeny. L15/16: Association mapping for disease and molecular traits - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lecture 15 Regulatory  variation and  eQTLs

Lecture 15Regulatory variation and eQTLs

Chris [email protected]

6.047/6.878/HST.507Computational Biology: Genomes, Networks, Evolution

Page 2: Lecture 15 Regulatory  variation and  eQTLs
Page 3: Lecture 15 Regulatory  variation and  eQTLs

Module 4: Population / Evolution / Phylogeny

• L15/16: Association mapping for disease and molecular traits– Statistical genetics: disease mapping in populations (Mark Daly)– Quantitative traits and molecular variation: eQTLs, cQTLs

• L17/18: Phylogenetics / Phylogenomics– Phylogenetics: Evolutionary models, Tree building, Phylo inference– Phylogenomics: gene/species trees, coalescent models, populations

• L19/20: Human history, Missing heritability– Measuring natural selection in human populations– The missing heritability in genome-wide associations

• And done! Last pset Nov 11 (no lab), In-class quiz on Nov 20– No lab 4! Then entire focus shifts to projects, Thanksgiving, Frontiers

Page 4: Lecture 15 Regulatory  variation and  eQTLs

Today: Regulatory variation and eQTLs1. Quantitative Trait Loci (QTLs), Regulatory Variation

– Molecular phenotypes as QTs: expression, chromatin…– Discretization: a GWAS for each gene. Cis-/Trans-eQTLs– Underlying regulatory variation: eQTLs, GWAS, cis-eQTL

2. Finding trans-eQTLs (distal from gene that varies)– Challenges: Power, structure, sample size– Cross-phenotype analysis: trans QTLs affect many genes

3. Identifying underlying regulatory mechanisms– Cis-eQTLs: TSS-distance, cell type specificity– eQTLs vs. GWAS: Expression as intermediate trait

4. Population differences, emerging efforts– Shared associations, SNP-gene pairs, allelic direction– Confound: environment, preparation, batch, ancestry

Page 5: Lecture 15 Regulatory  variation and  eQTLs

Quantitative traits- weight, height- anything measurable- today: gene expression

QTLs (QT Loci)- The loci that control

quantitative traits

Page 6: Lecture 15 Regulatory  variation and  eQTLs

Regulatory variation

• What do trait-associated variants do?• Genetic changes to:

– Coding sequence **– Gene expression levels– Splice isomer levels– Methylation patterns– Chromatin accessibility– Transcription factor binding kinetics– Cell signaling– Protein-protein interactions

Regulatory

Page 7: Lecture 15 Regulatory  variation and  eQTLs

BASIC CONCEPTSHistory, eQTL, mQTL, others

Page 8: Lecture 15 Regulatory  variation and  eQTLs
Page 9: Lecture 15 Regulatory  variation and  eQTLs

Within a population

• Damerval et al 1994• 42/72 protein levels differ in maize• 2D electrophoresis, eyeball spot quantitation• Problems:

– genome coverage– quantitation– post-translational modifications

• Solution: use expression levels instead!

Page 10: Lecture 15 Regulatory  variation and  eQTLs

Usual mapping tools available

• Discretization approach

Page 11: Lecture 15 Regulatory  variation and  eQTLs

gene 3

Whole-genome eQTL analysis is an independent GWAS for expression of each

gene

gene 2

gene N

gene 5

gene 4

gene 1

Page 12: Lecture 15 Regulatory  variation and  eQTLs

• cis-eQTL– The position of the eQTL maps near

the physical position of the gene.– Promoter polymorphism?– Insertion/Deletion?– Methylation, chromatin

conformation?

• trans-eQTL– The position of the eQTL does not

map near the physical position of the gene.

– Regulator?– Direct or indirect?

Modified from Cheung and Spielman 2009 Nat Gen

Genetics of gene expression (eQTL)

Page 13: Lecture 15 Regulatory  variation and  eQTLs
Page 14: Lecture 15 Regulatory  variation and  eQTLs
Page 15: Lecture 15 Regulatory  variation and  eQTLs
Page 16: Lecture 15 Regulatory  variation and  eQTLs

eQTL – THE ARRAY ERAyeast, mouse, maize, human

Page 17: Lecture 15 Regulatory  variation and  eQTLs

Yeast

• Brem et al Science 2002• Linkage in 40 offspring of lab x wild strain

cross • 1528/6215 DE between parents• 570 map in cross

– multiple QTLs– 32% of 570 have cis linkage

• 262 not DE in parents also map

Page 18: Lecture 15 Regulatory  variation and  eQTLs

trans hotspots

Brem et al Science 2002

Page 19: Lecture 15 Regulatory  variation and  eQTLs

Yvert et al Nat Genet 2003

Page 20: Lecture 15 Regulatory  variation and  eQTLs

Mammals I

• F2 mice on atherogenic diet• Expression arrays; WG linkage

Schadt et alNature 2003

Page 21: Lecture 15 Regulatory  variation and  eQTLs

Mammals II

Chesler et al Nat Genet 2005

10% !!

Page 22: Lecture 15 Regulatory  variation and  eQTLs

Mammals III

• No major trans loci in humans– Cheung et al Nature 2003– Monks et al AJHG 2004– Stranger et al PLoS Genet 2005, Science 2007

Page 23: Lecture 15 Regulatory  variation and  eQTLs

Today: Regulatory variation and eQTLs1. Quantitative Trait Loci (QTLs), Regulatory Variation

– Molecular phenotypes as QTs: expression, chromatin…– Discretization: a GWAS for each gene. Cis-/Trans-eQTLs– Underlying regulatory variation: eQTLs, GWAS, cis-eQTL

2. Finding trans-eQTLs (distal from gene that varies)– Challenges: Power, structure, sample size– Cross-phenotype analysis: trans QTLs affect many genes

3. Identifying underlying regulatory mechanisms– Cis-eQTLs: TSS-distance, cell type specificity– eQTLs vs. GWAS: Expression as intermediate trait

4. Population differences, emerging efforts– Shared associations, SNP-gene pairs, allelic direction– Confound: environment, preparation, batch, ancestry

Page 24: Lecture 15 Regulatory  variation and  eQTLs

WHERE ARE THE TRANS eQTLS?Open question

Page 25: Lecture 15 Regulatory  variation and  eQTLs

gene 3

Whole-genome eQTL analysis is an independent GWAS for expression of each

gene

gene 2

gene N

gene 5

gene 4

gene 1

Page 26: Lecture 15 Regulatory  variation and  eQTLs

Issues with trans mapping

• Power– Genome-wide significance is 5e-8

– Multiple testing on ~20K genes– Sample sizes clearly inadequate

• Data structure– Bias corrections deflate variance– Non-normal distributions

• Sample sizes– Far too small

Page 27: Lecture 15 Regulatory  variation and  eQTLs

But…

• Assume that trans eQTLs affect many genes…

• …and you can use cross-trait methods!

Page 28: Lecture 15 Regulatory  variation and  eQTLs

Association data

Z1,1 Z1,2 … … Z1,p

Z2,1

::

Zs,1 Zs,p

Page 29: Lecture 15 Regulatory  variation and  eQTLs

Cross-phenotype meta-analysis

SCPMA ~L(data | λ≠1)

L(data | λ=1)

Cotsapas et al, PLoS Genetics

Page 30: Lecture 15 Regulatory  variation and  eQTLs

CPMA detects trans mixtures

Page 31: Lecture 15 Regulatory  variation and  eQTLs

Open research questions

• Do trans effects exist?– Yes – heritability estimates suggest so.– Can we detect them?

• Larger cohorts?– Most eQTL studies ~50-500 individuals– See later, GTEx Project

• Better methods?– Collapsing data?– PCA, summary statistics, modeling?

Page 32: Lecture 15 Regulatory  variation and  eQTLs

Today: Regulatory variation and eQTLs1. Quantitative Trait Loci (QTLs), Regulatory Variation

– Molecular phenotypes as QTs: expression, chromatin…– Discretization: a GWAS for each gene. Cis-/Trans-eQTLs– Underlying regulatory variation: eQTLs, GWAS, cis-eQTL

2. Finding trans-eQTLs (distal from gene that varies)– Challenges: Power, structure, sample size– Cross-phenotype analysis: trans QTLs affect many genes

3. Identifying underlying regulatory mechanisms– Cis-eQTLs: TSS-distance, cell type specificity– eQTLs vs. GWAS: Expression as intermediate trait

4. Population differences, emerging efforts– Shared associations, SNP-gene pairs, allelic direction– Confound: environment, preparation, batch, ancestry

Page 33: Lecture 15 Regulatory  variation and  eQTLs

CAN WE LEARN REGULATORY VARIATION FROM eQTL?

Page 34: Lecture 15 Regulatory  variation and  eQTLs

First, let’s define the question

• Can we use genetic perturbations as a way to understand how genes are regulated?

• In what groups, in which tissues? • To what stimuli/signaling events? • Do cis eQTLs perturb promoter elements?• Do trans perturb TFs? Signaling cascades?

Page 35: Lecture 15 Regulatory  variation and  eQTLs

Most significant SNP per gene 0.001 permutation threshold

Significant associations are symmetrically distributed around TSS

Stranger et al., PLoS Gen 2012

Page 36: Lecture 15 Regulatory  variation and  eQTLs

268 271 262

73 85 82

86 86 86

Cell type-specific and cell type-shared gene associations(0.001 permutation threshold)

cell type

No.

of c

ell t

ypes

with

gen

e as

soci

ation

69-80% of cis associations are cell type-specific

• cis association sharing increases slightly when significance thresholds are relaxed• Cell type specificity verified experimentally for subset of eQTLs

Dimas et al Science 2009

Dimas et al Science 2009Slide courtesy Antigone Dimas

Page 37: Lecture 15 Regulatory  variation and  eQTLs

Open research questions

• Do cis eQTLs perturb functional elements?– Given each is independent, how can we know?

• Do tissue-specific effects correlate with the expression of a gene across tissues? Or a regulator?– Perhaps a gene is expressed, but in response to different

regulators across tissues?• If we ever find trans eQTLs…

– Common regulators of coregulated genes?– Tissue specificity?– Mechanisms?

Page 38: Lecture 15 Regulatory  variation and  eQTLs

APPLICATION TO GWASCandidate genes, perturbations underlying organismal phenotypes

Page 39: Lecture 15 Regulatory  variation and  eQTLs

eQTLs as intermediate traits

Schadt et al Nat Genet 2005

Page 40: Lecture 15 Regulatory  variation and  eQTLs

Modified from Nica and Dermitzakis Hum Mol Genet 2008

Exploring eQTLs in the relevant cell type is important for disease association studies

cell type not relevant for diseaserelevant cell type for disease

Importance of cataloguing regulatory variation in multiple cell types

Slide courtesy Antigone Dimas

Page 41: Lecture 15 Regulatory  variation and  eQTLs

Barrett et al 2008de Jager et al 2007

Page 42: Lecture 15 Regulatory  variation and  eQTLs
Page 43: Lecture 15 Regulatory  variation and  eQTLs
Page 44: Lecture 15 Regulatory  variation and  eQTLs

Franke et al 2010Anderson et al 2011

Page 45: Lecture 15 Regulatory  variation and  eQTLs

Today: Regulatory variation and eQTLs1. Quantitative Trait Loci (QTLs), Regulatory Variation

– Molecular phenotypes as QTs: expression, chromatin…– Discretization: a GWAS for each gene. Cis-/Trans-eQTLs– Underlying regulatory variation: eQTLs, GWAS, cis-eQTL

2. Finding trans-eQTLs (distal from gene that varies)– Challenges: Power, structure, sample size– Cross-phenotype analysis: trans QTLs affect many genes

3. Identifying underlying regulatory mechanisms– Cis-eQTLs: TSS-distance, cell type specificity– eQTLs vs. GWAS: Expression as intermediate trait

4. Population differences, emerging efforts– Shared associations, SNP-gene pairs, allelic direction– Confound: environment, preparation, batch, ancestry

Page 46: Lecture 15 Regulatory  variation and  eQTLs

POPULATION DIFFERENCES

Page 47: Lecture 15 Regulatory  variation and  eQTLs

Shared association in 8 HapMap populations

APOH: apolipoprotein H Stranger et al., PLoS Gen 2012

Page 48: Lecture 15 Regulatory  variation and  eQTLs

Number of genes with cis-eQTL associations8 extended HapMap populations

Spearman Rank CorrelationEnsGene

0.01 0.001 0.0001# genes FDR # genes FDR # genes FDR

CEU 2869 0.06 657 0.03 313 0.01CHB 2832 0.06 774 0.02 378 0.00GIH 2959 0.06 698 0.03 300 0.01JPT 2900 0.06 795 0.02 386 0.00

LWK 3818 0.05 773 0.02 311 0.01MEX 2609 0.07 472 0.04 165 0.01MKK 4222 0.04 947 0.02 411 0.00YRI 3961 0.05 799 0.02 328 0.01

non-redundant 12494 3130 1132>2 pops 6889 0.55 1074 0.34 547 0.488 pops 151 0.01 63 0.02 28 0.02

4 PCA + 4 non-PCAGenes are Ensembl genes.Generated on July 22, 2009, modified July 28th because of MKK error.These numbers differ from v2 because all assocs where dist to true TSS > 1Mb were removed.This removed some genes. I redid the MostSigSnpEnsGene after removing those SNPs. This is in column MostSigSnpEnsGene_1

SRC: permutation threshold

Stranger et al., PLoS Gen 2012

Page 49: Lecture 15 Regulatory  variation and  eQTLs

Direction of allelic effectsame SNP-gene combination across populations

AGREEMENT

OPPOSITE

Population 1

rs40915

THAP5

TTCTCC

8.00

7.75

7.50

7.25

7.00

6.75

6.50

Population 2

rs40915

THAP5

TTCTCC

8.00

7.75

7.50

7.25

7.00

6.75

6.50

log 2 e

xpre

ssio

n

log 2 e

xpre

ssio

n

rs40915

THAP5

TTCTCC

8.00

7.75

7.50

7.25

7.00

6.75

6.50

log 2 e

xpre

ssio

n

rs40915

THAP5

TTCTCC

8.00

7.75

7.50

7.25

7.00

6.75

6.50

log 2 e

xpre

ssio

n

Stranger et al., PLoS Gen 2012

Page 50: Lecture 15 Regulatory  variation and  eQTLs

Slide courtesy Alkes Price

Page 51: Lecture 15 Regulatory  variation and  eQTLs

Population differences could have non-genetic basis

• Differences due to environment? (Idaghdour et al. 2008)

• Differences in cell line preparation? (Stranger et al. 2007)

• Differences due to batch effects? (Akey et al. 2007)

(Reviewed in Gilad et al. 2008)

Slide courtesy Alkes Price

Page 52: Lecture 15 Regulatory  variation and  eQTLs

Gene expression experiment

Does gene expression in 60 CEU + 60 YRI vary with ancestry?

Does gene expression in 89 AA vary with % Eur ancestry?

60 CEU + 60 YRI from HapMap, 89 AA from Coriell HD100AAGene expression measurements at 4,197 genes obtained using Affymetrix Focus array

c

Slide courtesy Alkes Price

Page 53: Lecture 15 Regulatory  variation and  eQTLs

Gene expression differences in African Americans validate CEU-YRI differences

c = 0.43 (± 0.02)(P-value < 10-25)

12% ± 3%

in cis

Slide courtesy Alkes Price

Page 54: Lecture 15 Regulatory  variation and  eQTLs

EMERGING EFFORTSRNAseq, GTEx

Page 55: Lecture 15 Regulatory  variation and  eQTLs

RNAseq questions

• Standard eQTLs – Montgomery et al, Pickrell et al Nature 2010

• Isoform eQTLs– Depth of sequence!

• Long genes are preferentially sequenced• Abundant genes/isoforms ditto• Power!?• Mapping biases due to SNPs

Page 56: Lecture 15 Regulatory  variation and  eQTLs

Strategies for transcript assembly

Garber et al. Nat Methods 8:469 (2011)

Page 57: Lecture 15 Regulatory  variation and  eQTLs

GTEx – Genotype-Tissue EXpressionAn NIH common fund project

Current: 35 tissues from 50 donors

Scale up: 20K tissues from 900 donors.

Novel methods groups: 5 current + RFA

Page 58: Lecture 15 Regulatory  variation and  eQTLs

RNAseq combined with other techs

• Regulons: TF gene sets via CHiP/seq– Look for trans effects

• Open chromatin states (Dnase I; methylation)– Find active genes– Changes in epigenetic marks correlated to RNA– Genetic effects

• RNA/DNA comparisons – Simultaneous SNP detection/genotyping– RNA editing ???

Page 59: Lecture 15 Regulatory  variation and  eQTLs

Today: Regulatory variation and eQTLs1. Quantitative Trait Loci (QTLs), Regulatory Variation

– Molecular phenotypes as QTs: expression, chromatin…– Discretization: a GWAS for each gene. Cis-/Trans-eQTLs– Underlying regulatory variation: eQTLs, GWAS, cis-eQTL

2. Finding trans-eQTLs (distal from gene that varies)– Challenges: Power, structure, sample size– Cross-phenotype analysis: trans QTLs affect many genes

3. Identifying underlying regulatory mechanisms– Cis-eQTLs: TSS-distance, cell type specificity– eQTLs vs. GWAS: Expression as intermediate trait

4. Population differences, emerging efforts– Shared associations, SNP-gene pairs, allelic direction– Confound: environment, preparation, batch, ancestry