contemporary research in human genomics - human genome variation

Slide 1 Joe Mychaleckyj CONTEMPORARY RESEARCH IN HUMAN GENOMICS Genetics, Ethics and the Law May 29-31, 2009 Josyf Mychaleckyj, D.Phil. Center for Public Health Genomics University of Virginia

Page 1: Contemporary Research in Human Genomics - Human Genome Variation

CONTEMPORARY RESEARCH IN HUMAN GENOMICS

Genetics, Ethics and the Law
May 29-31, 2009

Josyf Mychaleckyj, D.Phil.
Center for Public Health Genomics
University of Virginia

Page 2: Contemporary Research in Human Genomics - Human Genome Variation

Slide 2

Joe Mychaleckyj

Today we’ll review…

• Genome Wide Association Studies (GWAS)
• Copy Number Variants (CNVs)
• Medical Resequencing
• Direct-to-Consumer Services (DTC)

Page 3: Contemporary Research in Human Genomics - Human Genome Variation

Joe Mychaleckyj

Slide 3

Genome Wide Association Studies (GWAS)

Page 4: Contemporary Research in Human Genomics - Human Genome Variation

Slide 4

Joe Mychaleckyj

Single Nucleotide Polymorphisms: SNPs (‘SNiPs’)

Chromosome #1: A C C G C G T G T C
Chromosome #2: A C C G T G T G T C

C, T are the 2 different alleles for this SNP

Mutation = Rare variant
Polymorphism = Frequent (> 1% prevalence)

Page 5: Contemporary Research in Human Genomics - Human Genome Variation

Slide 5

Joe Mychaleckyj

Each person carries pairs of chromosomes with a separate allele at the SNP position on each chromosome

3 Possible SNP Genotypes

Homozygote f(AA): A A
Heterozygote f(AG): A G
Homozygote f(GG): G G

f(AA) + f(AG) + f(GG) = 1 (f = frequency)
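As a rough illustration of the genotype frequencies above, the following sketch counts genotypes in a small made-up sample and derives allele frequencies from them. The sample and the function name are hypothetical and are not data from the talk.

```python
from collections import Counter

def genotype_and_allele_frequencies(genotypes):
    """Return (genotype frequencies, allele frequencies) for one biallelic SNP."""
    # Sort each genotype so "GA" and "AG" count as the same heterozygote.
    counts = Counter("".join(sorted(g)) for g in genotypes)
    n = sum(counts.values())
    genotype_freq = {g: c / n for g, c in counts.items()}
    # Each person contributes two alleles, so divide allele counts by 2n.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    allele_freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    return genotype_freq, allele_freq

# Hypothetical sample of 10 people (not data from the talk).
sample = ["AA", "AG", "GA", "GG", "AA", "AG", "AA", "GG", "AG", "AA"]
geno_freq, allele_freq = genotype_and_allele_frequencies(sample)
```

The three genotype frequencies sum to 1 by construction, which is the identity f(AA) + f(AG) + f(GG) = 1.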

Page 6: Contemporary Research in Human Genomics - Human Genome Variation

Slide 6

Joe Mychaleckyj

Case Control Association study

Cases = Clinical Disease
Controls = Disease Free

eg Blue Allele: 0.48 (48%) in cases, 0.41 (41%) in controls
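A minimal sketch of how the allele frequencies in this example could be compared, assuming 0.48 is the frequency in cases and 0.41 the frequency in controls. The slide gives no sample sizes, so the 1000 cases and 1000 controls below are an assumption made only to show the test; the function names are hypothetical.

```python
import math

def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    return (freq_cases / (1 - freq_cases)) / (freq_controls / (1 - freq_controls))

def allelic_chi_square(freq_cases, freq_controls, n_cases, n_controls):
    """1-df chi-square test on the 2x2 table of allele counts (2 alleles per person)."""
    a = freq_cases * 2 * n_cases              # test allele, cases
    b = (1 - freq_cases) * 2 * n_cases        # other allele, cases
    c = freq_controls * 2 * n_controls        # test allele, controls
    d = (1 - freq_controls) * 2 * n_controls  # other allele, controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value

odds_ratio = allelic_odds_ratio(0.48, 0.41)
# Sample sizes are assumed for illustration; they are not given on the slide.
chi2, p_value = allelic_chi_square(0.48, 0.41, n_cases=1000, n_controls=1000)
```

With these inputs the odds ratio is about 1.33. The p-value depends entirely on the assumed sample sizes.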

Page 7: Contemporary Research in Human Genomics - Human Genome Variation

Quantitative Trait Locus (QTL)

Association Study

Page 8: Contemporary Research in Human Genomics - Human Genome Variation

Slide 8

Joe Mychaleckyj

Genome Wide Association Study

• SNPs are the most common type of human genome variant by number (10-15 Million)

• Stable, easy to assay, accurately genotyped
• Able to multiplex 1000’s of SNPs into the same assay

Affymetrix Human 6.0: 906,000 SNPs, 946,000 probes for CNV

Illumina 1M-Duo

Page 9: Contemporary Research in Human Genomics - Human Genome Variation

Slide 9

Joe Mychaleckyj

GWAS

• SNPs are present in genes (affect proteins) but since coding sequence is ~2% of genome, the vast majority of human SNPs are outside exons or introns

• Genotype a dense map of SNPs across all chromosomes of the human genome

• Studies with 500,000 SNPs are becoming routine and 1 Million SNP panels are available

• Do not have to test all 10M SNPs because of SNP-SNP correlations (linkage disequilibrium)
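The SNP-SNP correlation in the last bullet is commonly summarized as r² between two SNPs. The sketch below computes it from haplotype frequencies; the frequencies and the A/a, B/b allele labels are hypothetical, chosen only to show the two extremes.

```python
def ld_r_squared(hap_freqs):
    """r^2 between two biallelic SNPs from haplotype frequencies.

    hap_freqs maps two-letter haplotypes (allele at SNP 1, allele at SNP 2)
    to frequencies summing to 1, with alleles A/a at SNP 1 and B/b at SNP 2.
    """
    p_ab = hap_freqs.get("AB", 0.0)
    p_a = p_ab + hap_freqs.get("Ab", 0.0)   # frequency of allele A at SNP 1
    p_b = p_ab + hap_freqs.get("aB", 0.0)   # frequency of allele B at SNP 2
    d = p_ab - p_a * p_b                    # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical haplotype frequencies, for illustration only.
tightly_linked = ld_r_squared({"AB": 0.5, "ab": 0.5})
independent = ld_r_squared({"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25})
```

When r² is close to 1, genotyping one SNP effectively reports on the other, which is why a panel of 500,000 to 1 Million SNPs can stand in for a much larger set.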

Page 10: Contemporary Research in Human Genomics - Human Genome Variation

Slide 10

Joe Mychaleckyj

GWAS approach

Does not assume a knowledge of genes or biology

Hardy J, Singleton A. N Engl J Med. 2009 Apr 23;360(17):175

Page 11: Contemporary Research in Human Genomics - Human Genome Variation

Joe Mychaleckyj

Slide 11

Genome wide Association Analysis of Coronary Artery Disease, NEJM 2007

Page 12: Contemporary Research in Human Genomics - Human Genome Variation

Slide 12

Joe Mychaleckyj

But Common Diseases are Complex

[Figure: monogenic versus complex disease]

Clinical Monogenic Disease
HFE C282Y (rs1800562)
Protein: VPPGEEQRYT[C/Y]QVEHPGLD
DNA: GGGGAAGAGCAGAGATATACGT[A/G]CCAGGTGGAGCACCCAGGCCTG
P( Hemochromatosis+ | CC homozygote) ~ 60-100%

Clinical Complex Disease
Genes: Gene 1, Gene 2, Gene 3, Gene 4, Gene 5 (some labeled OR)
Environment: Environment 1, Environment 2, Environment 3

Page 13: Contemporary Research in Human Genomics - Human Genome Variation

Slide 13

Joe Mychaleckyj

Monogenic vs Complex Disease

Monogenic                               Complex
1 or small # of genes                   Many
Often etiologic (severe phenotype)      Susceptibility / molecular pathology ?
Highly penetrant                        Modest penetrance
High Odds Ratio                         Modest/Low Odds Ratio
Strong selection =>                     Weak/No selection =>
  Low frequency/Rare                      High frequency/Common
Coding Sequence                         Non-coding/regulation (?)

Page 14: Contemporary Research in Human Genomics - Human Genome Variation

Slide 14

Joe Mychaleckyj

What are GWAS Studies Finding

• Typically detected variants are common (allele freq >10%)

• Low genotype risk, odds ratio (1.1-1.5)
• Small sibling relative risk
• Causal variants have not been mapped - function unknown and major signals occur in non-coding regions

• Penetrance model not well known

Page 15: Contemporary Research in Human Genomics - Human Genome Variation

Slide 15

Joe Mychaleckyj

Example: Crohn Disease

First susceptibility gene for Crohn Disease: NOD2
SNP: rs17221417

• GRR (het) = 1.29, GRR (hom) = 1.92
• Allele frequency 0.287
• Sibling Risk Ratio = 1.02
• Familial risk in NOD2 has been estimated at 1.19-1.49 but varies with population

Lewis J Med Genet 2007, Economou Am J Gastroenterol 2004
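The slide does not say how the sibling risk ratio was derived. One plausible route, sketched here under stated assumptions, is the standard single-locus variance-components formula: sibling relative risk = 1 + (V_A/2 + V_D/4)/K², with Hardy-Weinberg genotype frequencies. The sketch assumes 0.287 is the risk-allele frequency; the function name is hypothetical.

```python
def sibling_relative_risk(risk_allele_freq, grr_het, grr_hom):
    """Sibling relative risk from one biallelic locus (variance-components formula)."""
    q = risk_allele_freq
    p = 1.0 - q
    # Mean risk relative to the non-carrier genotype, weighted by Hardy-Weinberg frequencies.
    k = p * p + 2 * p * q * grr_het + q * q * grr_hom
    # Average effect of swapping a non-risk allele for a risk allele.
    alpha = p * (grr_het - 1.0) + q * (grr_hom - grr_het)
    v_additive = 2 * p * q * alpha ** 2
    v_dominance = (p * q * (grr_hom - 2 * grr_het + 1.0)) ** 2
    return 1.0 + (v_additive / 2 + v_dominance / 4) / k ** 2

lambda_s = sibling_relative_risk(0.287, grr_het=1.29, grr_hom=1.92)
```

Under these assumptions the result is about 1.02, in line with the value on the slide: a common allele with genotype relative risks of 1.3 to 1.9 adds very little to a sibling's risk.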

Page 16: Contemporary Research in Human Genomics - Human Genome Variation

Slide 16

Joe Mychaleckyj

Hindorff, PNAS 2009

>200 GWAS studies published as of December 2008

Page 17: Contemporary Research in Human Genomics - Human Genome Variation

Slide 17

Joe Mychaleckyj

Genome-wide association study identifies eight loci associated with blood pressure

Nature Genetics 41, 666 - 676 (2009) Published online: 10 May 2009

Page 18: Contemporary Research in Human Genomics - Human Genome Variation

Slide 18

Joe Mychaleckyj

The GWAS conundrum: Little variance/risk is explained by GWAS alleles

• Obesity
– FTO and MC4R <2% of variance
• Lipids
– 30 gene loci, proportion of variance explained in each trait:
– 9.3% for HDL cholesterol
– 7.7% for LDL cholesterol
– 7.4% for triglycerides
• Diabetes
– 18 replicated loci: combined sibling relative risk ~1.07

Page 19: Contemporary Research in Human Genomics - Human Genome Variation

Slide 19

Joe Mychaleckyj

Example: Height

• Highly heritable (heritability ~0.8)
• Combined sample of ~63,000
• 54 validated variants in multiple genes
• Each locus explains ~0.3% - 0.5% of the phenotypic variance
• Total variance explained < 5% overall
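To give a feel for these percentages, here is a small sketch of the usual additive-model calculation, in which one SNP explains 2p(1-p)β² of the trait variance. The allele frequency, per-allele effect and height standard deviation are hypothetical values chosen for illustration; only the 5% total and the 0.8 heritability come from the slide.

```python
def variance_explained(allele_freq, effect_per_allele, trait_sd):
    """Fraction of phenotypic variance explained by one SNP (additive model)."""
    genetic_variance = 2 * allele_freq * (1 - allele_freq) * effect_per_allele ** 2
    return genetic_variance / trait_sd ** 2

# Hypothetical locus: allele frequency 0.3, +0.6 cm per allele, height SD 6.5 cm.
per_locus = variance_explained(0.3, 0.6, 6.5)

# From the slide: < 5% of variance explained in total, heritability ~0.8.
share_of_heritability = 0.05 / 0.8
```

With these assumed inputs a single locus explains roughly 0.36% of the variance, and a 5% total would account for only about 6% of the heritable component.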

Page 20: Contemporary Research in Human Genomics - Human Genome Variation

Slide 20

Joe Mychaleckyj

What are we missing?

• Population differences• Alleles with small effect sizes• Copy number variants• Rare variants• Epigenetic effects

Page 21: Contemporary Research in Human Genomics - Human Genome Variation

Slide 21

Joe Mychaleckyj

• Genotype and phenotype datasets made available as rapidly as possible to a wide range of scientific investigators

• Grantees are expected to develop a sharing plan consistent with the GWAS policy.

• Plan should include data submission to the NIH GWAS data repository (dbGaP). (http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html)

Pezzolesi et al Diabetes 2009

Page 22: Contemporary Research in Human Genomics - Human Genome Variation

Slide 22

Joe Mychaleckyj

http://www.ncbi.nlm.nih.gov/gap

Page 23: Contemporary Research in Human Genomics - Human Genome Variation
Page 24: Contemporary Research in Human Genomics - Human Genome Variation

Slide 24

Joe Mychaleckyj

NIH GWAS Data Sharing Issues

• Sharing of individual genotype & phenotype data with any approved researcher worldwide (*Public access to genetic summary statistics)
• Review by a central NIH data use committee (DUC) not constituted by the study
• Informed consent templates for new GWAS
• ‘Retrofitting’ existing cohorts to conform to NIH Policy
– Adequacy of consents
– Data sharing clauses
– Use of data for research purposes not intended or foreseen
• Ancestry, ethnic origins – harm to community

http://grants.nih.gov/grants/gwas/

Page 25: Contemporary Research in Human Genomics - Human Genome Variation

Slide 25

Joe Mychaleckyj

PLoS Genetics Aug 2008

[Figure: Example results for one SNP. An allele frequency axis (0.0 to 1.0) shows the Personal Genome relative to the Mixture and the Reference Sample; a personal genotype closer to the Mixture frequency is more likely to be in the mixture.]

Summation over all SNPs, can infer with very high confidence whether the Person (or a close relative) is more likely to be in the Mixture versus a Reference Sample
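The summation over SNPs can be sketched as a per-SNP distance comparison: at each SNP, ask whether the person's allele fraction sits closer to the mixture frequency or to the reference frequency, then test whether the average difference is above zero. The code below is a simplified illustration on simulated data, not the published method's exact procedure; all sizes (5000 SNPs, 20 people per group) and function names are assumptions.

```python
import math
import random

def membership_t_statistic(person, mixture_freq, reference_freq):
    """Average per-SNP distance differences and return a one-sample t statistic."""
    # D > 0 when the person's allele fraction is closer to the mixture than to the reference.
    d = [abs(y - r) - abs(y - m) for y, m, r in zip(person, mixture_freq, reference_freq)]
    mean = sum(d) / len(d)
    var = sum((x - mean) ** 2 for x in d) / (len(d) - 1)
    return mean / math.sqrt(var / len(d))

def simulate_person(pop_freqs, rng):
    """Allele fraction (0, 0.5 or 1) at each SNP for one simulated diploid person."""
    return [((rng.random() < p) + (rng.random() < p)) / 2 for p in pop_freqs]

def pooled_freq(people):
    """Allele frequency at each SNP across a group of people."""
    return [sum(column) / len(column) for column in zip(*people)]

rng = random.Random(0)
pop_freqs = [rng.uniform(0.1, 0.9) for _ in range(5000)]   # simulated SNPs
mixture = [simulate_person(pop_freqs, rng) for _ in range(20)]
reference = [simulate_person(pop_freqs, rng) for _ in range(20)]
outsider = simulate_person(pop_freqs, rng)

mixture_freq, reference_freq = pooled_freq(mixture), pooled_freq(reference)
t_member = membership_t_statistic(mixture[0], mixture_freq, reference_freq)
t_outsider = membership_t_statistic(outsider, mixture_freq, reference_freq)
```

The effect at any single SNP is tiny, but it accumulates across thousands of SNPs, so a person who contributed to the mixture ends up with a clearly larger statistic than someone who did not. This is why public release of summary allele frequencies raised privacy concerns.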

Page 26: Contemporary Research in Human Genomics - Human Genome Variation

Joe Mychaleckyj

Slide 26

Copy Number Variants (CNVs)

Page 27: Contemporary Research in Human Genomics - Human Genome Variation

Slide 27

Joe Mychaleckyj

Copy Number Variants

• Submicroscopic structural genome rearrangements (cf cytogenetics, FISH)
– ~ 10 – 10,000 base pairs in length
– Insertions, deletions, duplications (2+ copies), inversions
• Copy number variant or polymorphism
– polymorphism = more common CNV (> 1% frequency = CNP)
• Common feature of the genome
• Frequency >1% => polymorphism (CNPs)
• Assay using genome wide SNP or CNV arrays
– Electronic FISH study

Page 28: Contemporary Research in Human Genomics - Human Genome Variation

Slide 28

Joe Mychaleckyj

Copy number variants (CNVs)

The Copy Number Variation (CNV) Project
http://www.sanger.ac.uk/humgen/cnv/

Page 29: Contemporary Research in Human Genomics - Human Genome Variation

Slide 29

Joe Mychaleckyj

~11kb deletion on chromosome 8 revealed by ultra-high resolution CGH. Blue lines: individuals with two copies. Red line: individual with zero copies.

The Copy Number Variation (CNV) Project
http://www.sanger.ac.uk/humgen/cnv/

Points are SNPs or probes from GWAS Array
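A deletion like this shows up as a run of neighbouring probes whose signal drops well below the two-copy baseline. The sketch below flags such runs using a simple threshold on log2 ratios; the probe spacing, ratio values, threshold and minimum run length are all hypothetical, and real CNV callers use more robust segmentation.

```python
def call_deletions(positions, log2_ratios, threshold=-1.0, min_probes=3):
    """Return (start, end) spans of consecutive probes with log2 ratio below threshold."""
    calls, run = [], []
    for pos, ratio in zip(positions, log2_ratios):
        if ratio < threshold:
            run.append(pos)
            continue
        if len(run) >= min_probes:
            calls.append((run[0], run[-1]))
        run = []
    if len(run) >= min_probes:       # close a run that reaches the last probe
        calls.append((run[0], run[-1]))
    return calls

# Hypothetical probes every 2 kb; values near 0 mean two copies,
# strongly negative values mean the sample has lost copies.
positions = list(range(100_000, 130_000, 2_000))
log2_ratios = [0.1, -0.1, 0.0, 0.05, -2.8, -3.1, -2.9, -3.3, -2.7, -3.0,
               0.1, 0.0, -0.05, 0.1, 0.0]
deletions = call_deletions(positions, log2_ratios)
```

Requiring several consecutive probes guards against calling a deletion from one noisy probe.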

Page 30: Contemporary Research in Human Genomics - Human Genome Variation

Slide 30

Joe Mychaleckyj

Location and frequency of CNVs in the genome

Nature. 2006 Nov 23;444(7118):444-54

Page 31: Contemporary Research in Human Genomics - Human Genome Variation

Joe Mychaleckyj

Slide 31

Medical Resequencing: Next Generation Sequencing (NGS)

Page 32: Contemporary Research in Human Genomics - Human Genome Variation

Slide 32

Joe Mychaleckyj

Public Reference Human Genome Sequence (2001, 2004) is Haploid and Chimeric

DNA Library 2, Individual 2

DNA Library 1, Individual 1

DNA Library 3, Individual 3

Page 33: Contemporary Research in Human Genomics - Human Genome Variation

Slide 33

Joe Mychaleckyj

Next Generation Sequencing (NGS) enables Diploid Sequencing of an individual

Positions of variants, SNPs, CNVs etc

Hundreds of Millions of small random sequence ‘reads’

Page 34: Contemporary Research in Human Genomics - Human Genome Variation

Slide 34

Joe Mychaleckyj

Mapping of Individual Variants (SNPs, CNVs)

N = 1 individual

[Figure: shotgun reads aligned to the reference genome, with read bases (A, T, G) compared against the reference bases at variant positions]

Page 35: Contemporary Research in Human Genomics - Human Genome Variation

Slide 35

Joe Mychaleckyj

Mapping of Individual Variants

• Random reads from diploid genome sequencing
– Align random shotgun reads from single individual diploid library & look for high quality mismatches
– Find heterozygous positions
• Medical Sequencing (to determine disease risk profile)
– Incorporation of sequence and variants in the Medical Record
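The align-and-compare step can be sketched as a pileup: stack the aligned reads over the reference, count the bases seen at each position, and report positions where two alleles each appear in a sizeable share of reads. This toy version assumes reads are already aligned without gaps and ignores base quality; the thresholds and function name are assumptions, and the reference string reuses the 10-base SNP example from Slide 4.

```python
from collections import Counter, defaultdict

def call_heterozygous_sites(reference, aligned_reads, min_depth=4, min_fraction=0.25):
    """Return {position: {"ref": base, "alleles": (a1, a2)}} for sites that look heterozygous.

    aligned_reads is a list of (start, sequence) pairs, already aligned
    to the reference without gaps.
    """
    pileup = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1

    calls = {}
    for pos, counts in sorted(pileup.items()):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        # Keep alleles seen in a sizeable share of reads; lone mismatches are treated as errors.
        alleles = sorted(b for b, c in counts.items() if c / depth >= min_fraction)
        if len(alleles) == 2:
            calls[pos] = {"ref": reference[pos], "alleles": tuple(alleles)}
    return calls

# Toy example: the individual carries C and T at position 4 (0-based) of the reference.
reference = "ACCGCGTGTC"
reads = [(0, "ACCGCGT"), (1, "CCGTGTG"), (2, "CGCGTGT"),
         (3, "GTGTGTC"), (0, "ACCGTG"), (4, "CGTGTC")]
het_sites = call_heterozygous_sites(reference, reads)
```

Because the reference is haploid, a heterozygous site shows up as a position where roughly half the reads match the reference and half carry the other allele.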

Page 36: Contemporary Research in Human Genomics - Human Genome Variation

Slide 36

Joe Mychaleckyj

ABBA00000000

Page 37: Contemporary Research in Human Genomics - Human Genome Variation

Slide 37

Joe Mychaleckyj

‘Project Jim’

Bio-IT World June 2007

1.3 percent of Watson’s genome did not match the existing reference genome.
> 600,000 novel SNPs
< 68,000 insertions and deletions compared to the reference sequence, 3bp - 7kbases

Page 38: Contemporary Research in Human Genomics - Human Genome Variation

Slide 38

Joe Mychaleckyj

NGS of Diploid Genomes

5 Completely Sequenced as of (May 2009):
J. Craig Venter
James Watson
Yoruban (West Africa, HGVS)
Chinese (YH)
Korean (SJK May 2009)

Levy et al, PLoS Biology, 2007

Page 39: Contemporary Research in Human Genomics - Human Genome Variation

Slide 39

Joe Mychaleckyj

Scientific American 2006

Page 40: Contemporary Research in Human Genomics - Human Genome Variation

Slide 40

Joe Mychaleckyj

Page 41: Contemporary Research in Human Genomics - Human Genome Variation

Slide 41

Joe Mychaleckyj

2008: Announcement of the $5,000 Genome

Page 42: Contemporary Research in Human Genomics - Human Genome Variation

Joe Mychaleckyj

Slide 42

Direct-to-Consumer Services

Page 43: Contemporary Research in Human Genomics - Human Genome Variation

Slide 43

Joe Mychaleckyj

Bio-IT World November 2008

Company     Launch  Platform    List Cost               Counselor
deCODEme    Nov-07  Illumina    $985                    Referrals
23andMe     Nov-07  Illumina    $399                    No
Navigenics  Apr-08  Affymetrix  $2500+$250 annual sub   On staff
SeqWright   Jan-08  Affymetrix  $998                    No

Page 44: Contemporary Research in Human Genomics - Human Genome Variation

Slide 44

Joe Mychaleckyj

Page 45: Contemporary Research in Human Genomics - Human Genome Variation

Slide 45

Joe Mychaleckyj

Rival genetic tests leave buyers confused

Firms that offer to predict your risk of disease give worryingly varied results

Nic Fleming

(September 7, 2008)

Page 46: Contemporary Research in Human Genomics - Human Genome Variation

Slide 46

Joe Mychaleckyj

Different Companies produce differing assessments of risk

• Different genetic variants reviewed and included – threshold for inclusion
• Level of expertise in companies to review literature
• Different statistical models for risk prediction – no ‘right’ answer
• How frequently updated – new findings in literature
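To make the first and third bullets concrete, the sketch below scores one hypothetical customer against two hypothetical SNP panels using a simple multiplicative (log-additive) odds ratio model. The SNP names, odds ratios and the model itself are invented for illustration and do not describe any company's actual method.

```python
import math

def relative_risk_score(genotype, panel):
    """Multiply per-allele odds ratios over the SNPs in a panel (log-additive model)."""
    log_score = 0.0
    for snp, odds_ratio in panel.items():
        risk_alleles = genotype.get(snp, 0)      # 0, 1 or 2 copies of the risk allele
        log_score += risk_alleles * math.log(odds_ratio)
    return math.exp(log_score)

# Hypothetical customer and panels; SNP names and odds ratios are invented.
customer = {"snp_a": 2, "snp_b": 0, "snp_c": 1, "snp_d": 1}
company_1 = {"snp_a": 1.2, "snp_b": 1.3}
company_2 = {"snp_a": 1.2, "snp_c": 1.1, "snp_d": 1.4}

score_1 = relative_risk_score(customer, company_1)
score_2 = relative_risk_score(customer, company_2)
```

The same genotype yields 1.44 under the first panel and about 2.22 under the second, purely because the panels include different variants.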