whole genome sequencing for colorectal cancer


Post on 24-Feb-2016


Page 1: Whole Genome Sequencing  for Colorectal  Cancer

WHOLE GENOME SEQUENCING FOR COLORECTAL CANCER

Ulrike (Riki) Peters
Fred Hutchinson Cancer Research Center
University of Washington

Page 2: Whole Genome Sequencing  for Colorectal  Cancer

Overview
• Significance and rationale
• Current efforts on rare and less frequent variants
• Specific aims and design of whole genome sequencing grant

Page 3: Whole Genome Sequencing  for Colorectal  Cancer

Progress of Genomic Research (adapted from Green and Guyer, Nature 2011)

[Figure: progress across five domains (structure of genomes, biology of genomes, biology of diseases, advancing medicine, improving healthcare & prevention) over four periods: 1990-2003 (Human Genome Project), 2004-2010, 2011-2020, and beyond 2020]

Page 4: Whole Genome Sequencing  for Colorectal  Cancer

Examples of GWAS for Drug Targets

Drug | Drug target | Drug indication | GWAS trait
Statins | HMGCR | Hypercholesterolemia | LDL, cholesterol
Znt8 agonists | SLC30A8 | Type 2 diabetes | Type 2 diabetes
Ustekinumab | IL12B | Psoriasis, Crohn’s disease | Psoriasis, Crohn’s disease

Examples of GWAS for Drug Repositioning

Drug | Drug target | Current drug indication | GWAS trait
Nepicastat | DBH | Post-traumatic stress disorder | Smoking cessation
Denosumab/AMG-162 | TNFSF11 | Osteoporosis/bone cancer | Crohn’s disease
Biib-003 | LINGO-1 | Multiple sclerosis | Essential tremor

For additional examples, see Sanseau et al. Nat Biotechnol 2012

Page 5: Whole Genome Sequencing  for Colorectal  Cancer

Use of GWAS Findings to Inform Screening Decisions (using breast cancer as example)

So et al. Am J Hum Genet 2010

Colors show 10-year risk of breast cancer at different risk percentiles based on 13 GWAS loci

Average 10-year risk of breast cancer for a 50-year-old woman is 2.4%

Page 6: Whole Genome Sequencing  for Colorectal  Cancer

What Is Known About the Genetic Contribution to Colorectal Cancer

Scandinavian Twin Registry, Lichtenstein et al. New Engl J Med 2000

Cancer site | Heritable factors | Shared environmental factors | Non-shared environmental factors
Prostate | 0.42 (0.29-0.50) | 0 (0-0.09) | 0.58 (0.50-0.67)
Colorectal | 0.35 (0.10-0.48) | 0.05 (0-0.23) | 0.60 (0.52-0.70)
Bladder | 0.31 (0.00-0.45) | 0 (0-0.28) | 0.69 (0.53-0.86)
Breast | 0.27 (0.04-0.41) | 0.06 (0-0.22) | 0.67 (0.56-0.76)
Lung | 0.26 (0.00-0.49) | 0.12 (0-0.34) | 0.62 (0.51-0.73)
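
As a quick consistency check on the twin-study estimates above, the three variance proportions for each cancer site should add up to roughly 1. The sketch below uses only the point estimates from the slide; the names `twin_estimates` and `component_total` are illustrative and not from the source.

```python
# Point estimates transcribed from the slide (Lichtenstein et al. 2000):
# (heritable, shared environment, non-shared environment)
twin_estimates = {
    "Prostate":   (0.42, 0.00, 0.58),
    "Colorectal": (0.35, 0.05, 0.60),
    "Bladder":    (0.31, 0.00, 0.69),
    "Breast":     (0.27, 0.06, 0.67),
    "Lung":       (0.26, 0.12, 0.62),
}


def component_total(components):
    """Sum the three variance proportions for one cancer site, rounded to 2 decimals."""
    return round(sum(components), 2)


totals = {site: component_total(c) for site, c in twin_estimates.items()}
for site, total in totals.items():
    print(f"{site}: {total}")
```

For colorectal cancer this reads as roughly a third of the variation in risk attributed to heritable factors, with most of the remainder attributed to non-shared environment.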

Page 7: Whole Genome Sequencing  for Colorectal  Cancer

Colorectal Cancer GWAS
• 21 GWAS loci
• Each SNP associated with a modest increase in risk

[Figure: Published and Newly Discovered Colorectal Cancer Susceptibility Loci; legend marks loci identified within GECCO]

Houlston Nat Genet 2010; Tomlinson Nat Genet 2008; Zanke Nat Genet 2007; Haiman Nat Genet 2007; Hutter BMC Cancer 2010; Tenesa Nat Genet 2008; Tomlinson Nat Genet 2011; COGENT Nat Genet 2008; Jaeger Nat Genet 2008; Broderick Nat Genet 2007; Peters, Hunter Hum Genet 2011; Dunlop Nat Genet 2012; Peters Gastroenterol (submitted)

Page 8: Whole Genome Sequencing  for Colorectal  Cancer

Estimated Total Number of GWAS Hits

Phenotype | Estimated number of GWAS hits (95% CI) | Total genetic variance explained, % (95% CI)
Height | 201 (75, 494) | 16.4 (10.6, 30.6)
Crohn’s disease | 142 (71, 244) | 20.0 (15.7, 28.0)
Breast, Prostate and Colorectal Cancer | 67 (31, 173) | 17.1 (11.6, 35.8)

Park et al. Nat Genet 2010

=> Known familial syndromes, such as FAP and Lynch syndrome, explain less than 3-5%

Page 10: Whole Genome Sequencing  for Colorectal  Cancer

What Explains Missing Heritability of Cancer?
• Additional familial syndromes
• Heritable epigenomic variability
• Gene-gene and gene-environment interaction
• Less frequent and rare variants
• Structural variations / copy number variation (CNV)
• Others, or heritability may be overestimated

Page 12: Whole Genome Sequencing  for Colorectal  Cancer

Most Genetic Variation is Rare
• GWAS only investigated ~15% of genetic variation
• Next-generation sequencing can identify rare variants

[Figure: variants by minor allele frequency; green = ESP, orange = ENCODE, blue = HapMap]

Page 13: Whole Genome Sequencing  for Colorectal  Cancer

Feasibility to Identify Genetic Variants by Risk Allele Frequency and Strength of Genetic Effect

Manolio et al. Nature 2009

Page 15: Whole Genome Sequencing  for Colorectal  Cancer

Overview
• Significance and rationale
• Current efforts on rare and less frequent variants
• Specific aims and design of whole genome sequencing grant

Page 16: Whole Genome Sequencing  for Colorectal  Cancer

Current efforts in GECCO to Search for Less Frequent and Rare Variants (Genetics and Epidemiology of Colorectal Cancer Consortium)

The global view of genetic contribution to colorectal cancer

[Diagram: GECCO Coordinating Center (FHCRC) surrounded by the participating studies: WHI, ARCTIC, VITAL, DACHS, PLCO, CPS, ASTERISK, DALS, Colo2&3, MEC, PHS, HPFS, NHS, CCFR, MECC, NGCC, HRT-CCFR]

~30,000 subjects
U01 and X01, Peters, 2009-2013

• Imputation to 1000 Genomes Project in ~28,000 samples with GWAS
• Exome chip genotyping
  • On about 25,000 samples
• CIDR Pilot
• Whole exome sequencing on 130 high-risk colorectal cancer cases + 30 controls

Page 17: Whole Genome Sequencing  for Colorectal  Cancer

NHLBI - Exome Sequencing Project
• Whole exome sequencing of 7,000 European and African Americans to identify rare variants associated with common complex diseases
• Sequencing centers
  • Broad
  • University of Washington
• Cohorts
  • Women’s Health Initiative
  • HeartGo
    • ARIC, CARDIA, CHS, FHS, JHS, MESA
  • LungGo
• Phenotypes
  • Early-onset MI
  • Early-onset/FH+ stroke
  • Extreme BMI/T2D
  • Extreme lipids
  • Extreme blood pressure
  • COPD
  • Pulmonary hypertension
  • Cystic fibrosis

Page 18: Whole Genome Sequencing  for Colorectal  Cancer

Whole Exome vs Whole Genome
• Exome covers only 1-2% of genome
• 88% of all GWAS findings are outside of the well-studied protein-coding regions
  • 78% of GWAS findings with MAF<5%

Page 19: Whole Genome Sequencing  for Colorectal  Cancer

Junk No More: ENCODE Project Finds "Biochemical Functions for 80% of the Genome"

The ENCODE Project Consortium, "An integrated encyclopedia of DNA elements in the human genome", Nature 2012

Page 20: Whole Genome Sequencing  for Colorectal  Cancer

Overview
• Significance and rationale
• Current efforts in GECCO on rare and less frequent variants
• Specific aims and design of whole genome sequencing grant

Page 21: Whole Genome Sequencing  for Colorectal  Cancer

Aims of the U01 Sequencing Grant
• Aim 1. To identify novel CRC susceptibility variants across the genome, mainly variants with allele frequency 0.1-5%
  • Rare variants <1%
  • Less frequent variants 1-5%
  • Common variants >5%
• Aim 2. To investigate whether known environmental risk factors for CRC modify genetic susceptibility to CRC (Gene-Environment interactions)
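
The frequency classes named under Aim 1 can be written down as a small classifier. This is a minimal sketch, not code from the grant: the function names are hypothetical, and because the slide does not say which class a variant at exactly 1% or 5% belongs to, the sketch assumes both boundaries fall in the "less frequent" class.

```python
def frequency_class(maf):
    """Classify a variant by minor allele frequency (a fraction, e.g. 0.01 = 1%).

    Assumed boundary handling (not stated on the slide):
    exactly 1% and exactly 5% are both treated as "less frequent".
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must be between 0 and 0.5")
    if maf < 0.01:
        return "rare"
    if maf <= 0.05:
        return "less frequent"
    return "common"


def in_primary_target_range(maf):
    """True if the variant lies in the 0.1-5% range that Aim 1 mainly targets."""
    return 0.001 <= maf <= 0.05


for maf in (0.0005, 0.005, 0.03, 0.20):
    print(maf, frequency_class(maf), in_primary_target_range(maf))
```

Note that a variant can be "rare" and still fall below the 0.1% lower bound of the main target range, which is why the two checks are kept separate.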

Page 22: Whole Genome Sequencing  for Colorectal  Cancer

Study Design Overview

R01; PI: Peters

Page 23: Whole Genome Sequencing  for Colorectal  Cancer

Funding Information
• 17% budget cut
• 4 years instead of 5 years
• U01 designation
• Expected start date: before 9/30/12

Total budget cut 33%
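
One plausible reading of how the two reductions combine into the total is sketched below. It assumes the 17% cut applies to every funded year and that the requested budget was spread evenly over five years; neither assumption is confirmed by the slide, and the variable names are illustrative.

```python
# Assumptions (not stated on the slide): equal annual budgets, and the
# 17% cut applied uniformly to each of the funded years.
annual_cut = 0.17
funded_years = 4
requested_years = 5

# Fraction of the originally requested total that remains.
remaining_fraction = (1 - annual_cut) * funded_years / requested_years
total_cut = 1 - remaining_fraction

print(f"Remaining: {remaining_fraction:.1%}, total cut: {total_cut:.1%}")
```

Under these assumptions the combined reduction comes to about a third of the requested budget, in line with the total quoted on the slide.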

Page 24: Whole Genome Sequencing  for Colorectal  Cancer

Aim 1
• Aim 1.1: Whole Genome Sequencing. N=1,600 cases, 1,600 controls
• Aim 1.2: Imputation of WGS Data. N=9,129 cases, 11,728 controls
• Aim 1.2: Association Testing, Individual & Aggregated Variants. N=10,729 cases, 13,328 controls; ~18M variants
• Aim 1.3: Replication. N=3,100 cases, 3,100 controls; ~3,000 variants

Aim 2
• Gene-Environment Interaction Analyses: 2-Stage Screening, Weighted Hypothesis, Empirical Bayes

Total sample size is 13,829 cases and 16,428 controls
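
The sample sizes on this slide are additive: sequenced plus imputed samples form the association-testing set, and adding the replication set gives the total. A short sketch of that arithmetic, with hypothetical variable names and the counts taken from the slide:

```python
# Counts taken from the slide; dictionary and function names are illustrative.
sequenced = {"cases": 1_600, "controls": 1_600}
imputed = {"cases": 9_129, "controls": 11_728}
replication = {"cases": 3_100, "controls": 3_100}


def combine(*groups):
    """Add up cases and controls across study stages."""
    return {key: sum(g[key] for g in groups) for key in ("cases", "controls")}


association_set = combine(sequenced, imputed)      # sequenced + imputed
total = combine(association_set, replication)      # plus replication

print(association_set)
print(total)
```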

Page 25: Whole Genome Sequencing  for Colorectal  Cancer

Classes of Genetic Variants Being Examined

Variant type | Definition in this proposal | Expected #
Single nucleotide variant (SNV) | Single base pair change with MAF>0.1% & <5% | ~13-15M
Single nucleotide polymorphism (SNP) | Single base pair change with MAF>5% | ~5M
Insertion/deletion (indel) | Insertion/deletion or inversion <50bp | ~1.5-2M
Copy number variant (CNV) | Insertion/deletion or inversion >5kb | ~20K

Page 26: Whole Genome Sequencing  for Colorectal  Cancer

Studies

Study | Cases | Controls | GWAS #SNPs

Studies with GWAS (sequencing and imputation)
ARCTIC | 850 | 800 | 100K, 500K
DACHS | 2,900 | 2,400 | 300K
DALS | 1,100 | 1,200 | 300K, 550K, 610K
HPFS | 850 | 850 | 730K
MEC | 400 | 400 | 300K
NHS | 500 | 900 | 730K
PHS | 400 | 400 | 730K
PLCO | 1,200 | 1,800 | 300K, 610K, 500K
ASTERISK | 1,000 | 1,000 | 300K
VITAL | 300 | 300 | 300K
WHI | 1,300 | 2,200 | 300K, 550K

Studies with no GWAS (replication)
North German CCS | 4,000 | 4,000 | N/A
CPS-II | 1,000 | 1,000 | N/A
MECC | 3,400 | 3,000 | N/A

Non-whites | 400 | 650 |
Total | 20,000 | 21,000 |

Page 27: Whole Genome Sequencing  for Colorectal  Cancer

Data Harmonization of Environmental Risk Factors

• Collecting 74 variables in 11 categories

• Multi-step collaborative process leading to common data elements with standardized definitions, permissible values and coding

Meta-analysis across 15 studies

Page 28: Whole Genome Sequencing  for Colorectal  Cancer

Sequencing and Genotyping
• At Genome Sciences, University of Washington
• Whole genome sequencing
  • At lower depth
  • Illumina HiSeq
  • In years 1 to 3
  • Total ~1,600 cases and 1,600 controls
    • Year 1: ~600
    • Year 2: ~1,000
    • Year 3: ~1,700
• Replication genotyping
  • In years 3 and 4
  • 6,200 samples for 3,000 SNPs
  • 2,400 samples for 384 SNPs

Page 29: Whole Genome Sequencing  for Colorectal  Cancer

Variant Calling Based on Sequencing Data

• Variant calling
  • Depends on depth of sequencing
  • Multi-sample calling improves accuracy; hence, we will call in batches of increasing numbers of samples
• Structural variation/copy number variant (CNV) calling
  • Indel and CNV calling is error-prone and requires genotyping follow-up
  • Follow-up genotyping on 384 SNPs in 1,600 samples

Page 30: Whole Genome Sequencing  for Colorectal  Cancer

Imputation of Sequencing Data into GWAS
• Imputation
  • Use whole genome sequencing data as reference panel to impute into samples with only GWAS data
• Important points raised:
  • Imputation accuracy improves with increasing sample size of reference panel (samples with whole genome sequencing data)
  • Imputation accuracy improves with increasingly dense GWAS platforms
• Follow-up genotyping on 384 SNPs in 800 samples

[Diagram: whole genome sequence (3,200 samples, ~18M variants) imputed into GWAS (19,000 samples)]
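
The idea of using sequenced samples as a reference panel can be illustrated with a deliberately simplified toy. The sketch below is not the method used in the grant and not how real imputation software works (production tools rely on far more sophisticated haplotype models); it only assumes a nearest-haplotype rule: find the reference haplotypes that best match the sample at the genotyped sites, then average their alleles at the untyped sites. All names and data are made up for the example.

```python
def impute_dosages(reference, target):
    """Toy nearest-haplotype imputation.

    reference: list of equal-length haplotypes (lists of 0/1 alleles),
               standing in for the sequenced reference panel.
    target:    one haplotype with 0/1 at genotyped sites and None at
               sites absent from the GWAS array.
    Returns the target with each None replaced by the mean allele of the
    best-matching reference haplotypes (an allele dosage in [0, 1]).
    """
    typed = [i for i, allele in enumerate(target) if allele is not None]
    # Mismatch count against each reference haplotype, typed sites only.
    distances = [sum(h[i] != target[i] for i in typed) for h in reference]
    best = min(distances)
    matches = [h for h, d in zip(reference, distances) if d == best]
    return [
        target[i] if target[i] is not None
        else sum(h[i] for h in matches) / len(matches)
        for i in range(len(target))
    ]


reference_panel = [
    [0, 0, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
]
gwas_sample = [0, None, 1, None, 1]  # sites 1 and 3 are not on the array

imputed = impute_dosages(reference_panel, gwas_sample)
print(imputed)
```

Even in this toy, the two points on the slide are visible: more reference haplotypes give more candidates to match against, and more typed sites make the match more discriminating.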

Page 31: Whole Genome Sequencing  for Colorectal  Cancer

Statistical Analysis
• Marginal and burden testing
  • Single variant test
  • Aggregated tests to test all rare variants across a defined region, such as a gene
  • Motivation:
    • Mendelian diseases show that multiple different mutations can lead to disease
    • Rare variants tested individually have limited power to show association (unless highly penetrant)
• Gene-environment interaction testing
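
The motivation for aggregated tests can be sketched with a simple collapsing test. This is an illustrative toy rather than the specific burden test planned in the grant: it assumes a carrier/non-carrier collapsing rule and a one-sided Fisher exact test, and the function names and genotype data are invented for the example.

```python
from math import comb


def carrier_flags(genotypes):
    """1 if an individual carries any rare allele in the region, else 0.

    genotypes: one list per individual of rare-allele counts (0/1/2),
    one entry per variant in the region.
    """
    return [int(any(count > 0 for count in person)) for person in genotypes]


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the 2x2 table
    [[a, b], [c, d]] = [[carrier cases, non-carrier cases],
                        [carrier controls, non-carrier controls]]."""
    n_cases, n_carriers, n_total = a + b, a + c, a + b + c + d
    tail = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
        for k in range(a, min(n_cases, n_carriers) + 1)
    )
    return tail / comb(n_total, n_cases)


def collapsing_test(cases, controls):
    """Collapse variants to carrier status, then test cases vs controls."""
    a = sum(carrier_flags(cases))
    c = sum(carrier_flags(controls))
    return fisher_one_sided(a, len(cases) - a, c, len(controls) - c)


# Made-up genotypes: 6 cases and 6 controls, 3 rare variants in one gene.
cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 1, 0]]

burden_p = collapsing_test(cases, controls)
# Same test restricted to the first variant only (a single-variant test).
single_p = collapsing_test([[p[0]] for p in cases], [[p[0]] for p in controls])

print(f"aggregated p = {burden_p:.3f}, single-variant p = {single_p:.3f}")
```

In this toy data the aggregated test gives a smaller p-value than the single-variant test, because carriers of different rare variants in the same gene are pooled.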

Page 32: Whole Genome Sequencing  for Colorectal  Cancer

Advisory Committee
• NCI
  • Stephen Chanock
  • Daniela Seminara
  • Peggy Tucker
• Suggestions for external investigators
  • Mike Boehnke (U of Michigan)
  • Elaine Mardis (Washington U in St. Louis)
  • Nicole Soranzo (Wellcome Trust / Sanger Inst)
  • Stephen Thibodeau (Mayo Clinic, Rochester)

Page 33: Whole Genome Sequencing  for Colorectal  Cancer

Timeline

[Gantt chart: activities by year, Yr 1 to Yr 5]

Activities:
• Sample preparation and QA/QC
• Whole genome sequencing and variant calling (Aim 1.1)
• Imputation and association testing (Aim 1.2)
• Replication genotyping (Aim 1.3)
• GxE analysis (Aim 2)
• Preparation of manuscripts