The UK Biobank – The Scientific Challenge
John Bell
Regius Professor of Medicine, Oxford University
Chairman, UK Biobank Scientific Committee
Chairman, Steering Committee, MOLPAGE
Biobank UK: The Context
• A landmark epidemiology/genetic/genomic prospective cohort for UK science
• Modern, efficient methodology in patient recruitment, statistics, IT, genotyping, LIMS and genomics
• The resource to develop and expand UK expertise in the integrated field of genetic epidemiology
• To identify the major genetic and environmental factors that contribute to common disease pathogenesis
Recruitment
• 500,000 participants aged 40-69
Data & Sample storage
• Environmental exposure
• Physiological variables
• Neuropsychiatric evaluation
• Biochemical markers
• DNA, plasma, urine
Follow-up
• Record linkage
• Registries
• GP records
• Hospital admissions
• Cancer registries
• Prescriptions
• Death
Biobank Objectives – April 2005
1. Study genetic and environmental factors in disease.
*2. Study genes and environment in a variety of settings (disease subtype, age, sex, social settings)
3. Use prevalent cases to validate or establish genetic association in a range of diseases
4. Characterise impact of socio-economic factors on disease
*5. Characterise in detail genetically or other risk factor defined subpopulations to understand natural history of disease and study biological effects of these events (intensively phenotyped subpopulations)
Biobank Objectives – April 2005 (continued)
*6. NHS utilisation
*7. Utilise biomarkers (proteins, small molecules) to define disease subtypes and as markers of environmental exposures
*8. Identify disease-related risk factors using genomically derived prediction and progression testing
9. Pharmacogenetics (?)
[Diagram: BioBank (500,000 participants) at the centre, linked to the following research areas]
• NHS utilisation
• Social factors in disease
• Biomarkers & disease surrogates
• Environmental determinants of disease
• Disease genes
• Intensive phenotyping of genotype populations
• Psychological factors in disease
• Diagnostics
• Gene-environment interaction
• Pharmacogenetics
Experimental Medicine: Genotype-Phenotype correlations and disease
β2 Adrenoreceptor (BAR)-2 gene; PPAR Pro12/Ala
Data from OBB

Haplotype    n      Glucose (mmol/l)
BAR 6/6      25     5.32 (0.08)
BAR 2/2      170    5.18 (0.03)
BAR 4/4      100    5.04 (0.04)

ANOVA, P<0.01
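The ANOVA result for the three haplotype groups above can be roughly checked from the summary figures alone. The sketch below is illustrative only: the slide does not say what the bracketed values are, so it assumes they are standard errors of the mean, and the function name is hypothetical. It uses the closed-form tail probability of the F distribution that holds when there are exactly three groups (two numerator degrees of freedom).

```python
import math

# (label, n, mean glucose in mmol/l, bracketed value from the slide)
# ASSUMPTION: the bracketed value is the standard error of the mean (SEM).
groups = [
    ("BAR 6/6", 25, 5.32, 0.08),
    ("BAR 2/2", 170, 5.18, 0.03),
    ("BAR 4/4", 100, 5.04, 0.04),
]


def anova_from_summary(groups):
    """One-way ANOVA F test computed from per-group n, mean and SEM."""
    n_total = sum(n for _, n, _, _ in groups)
    k = len(groups)
    grand_mean = sum(n * mean for _, n, mean, _ in groups) / n_total

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for _, n, mean, _ in groups)

    # Within-group sum of squares: SEM = SD / sqrt(n), so variance = SEM^2 * n.
    ss_within = sum((n - 1) * (sem ** 2) * n for _, n, _, sem in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)

    # With exactly 2 numerator degrees of freedom the upper tail of the
    # F distribution has a closed form, so no statistics library is needed.
    if df_between != 2:
        raise ValueError("closed-form p-value only valid for three groups")
    p_value = math.pow(1.0 + 2.0 * f_stat / df_within, -df_within / 2.0)
    return f_stat, df_between, df_within, p_value


f_stat, df_b, df_w, p_value = anova_from_summary(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, p = {p_value:.4f}")
```

Under the SEM assumption the F statistic comes out at roughly 6.7 on (2, 292) degrees of freedom, which is consistent with the reported P<0.01. If the bracketed values were standard deviations instead, the result would be very different, so this should be read as a plausibility check rather than a reproduction of the original analysis.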
Timetable
• 2003: 6 RCCs selected
• 2004/5: Protocol developed, including questionnaire, physiological measurements and recruitment strategy
• 2004/5: Storage facilities established
• 2005: Phase 1 pilots began in February to evaluate the assessment visit
• Phase 2 pilots begin in February to assess recruitment strategies
Assessment visit
• 120 minutes
• Consent: paperless
• 140 questions on touch screen for direct data entry
• Direct interview for cognitive function
• Measurements
• Blood, urine collected
Measurements
• Height
• Weight
• Impedance
• BP
• Waist-hip
• Spirometry
• Hand grip
• ECG
• Ankle-brachial index
• Pulse-wave velocity
• Carotid IMT
• Axial DEXA scan
• Calcaneal BMD
• Timed shuttle test
• Spirometry reversibility
REMEASUREMENT AT 1 MONTH IN 10%
Questionnaire
• 140 core questions with 80 stems
• Separate consideration of psychological status, cognitive function, environment and diet
• Diet: 24 hour recall by touch screen and subsequent remeasurement
Why Prospective Study?
• Most accurate approach to defining environmental exposures
• Essential for understanding impact of premorbid conditions on disease natural history (cognitive function, anxiety, biochemical and metabolic variation)
• Evaluation of predictive biomarkers in disease
• Proven value of prospective studies (Framingham, Nurses Health Study, Doctors Study, MRFIT, EPIC)
UK COHORT STUDIES (>10,000 participants)
[Bar chart: cohort size (thousands), axis from 0 to 120; the individual bar values did not survive extraction. Studies shown:]
• OXCHECK
• Scottish Heart Study
• Paisley & Renfrew
• Oxford/FPA
• Whitehall
• National Child Development
• ALSPAC
• BUPA Cohort Study
• EPIC-Norfolk
• SEARCH
• British Doctors' Study
• RCGP Oral Contraception Study
• EPIC-Oxford
• Heart Protection Study
• Million Women Study
Smaller UK studies: Regional Heart Study 8000; 1946 Birth Cohort 5000; Caerphilly & Speedwell 4000; Ely study 2000.
Phenotyping
• Extensive at recruitment
– Full biochemical screen
– Toxin profile
– Physiologic: BP, ECG, cardiovascular, dopplers
– Psychometric testing: anxiety, cognitive function
– Socio-economic parameters
• Opportunities for new phenotyping tools on stored samples (biomarkers)
• Rephenotyping in defined subsets for validation and specific studies of subpopulations defined by risk factors
Total number of cases expected

Condition                            After 5 years    After 10 years
                                     of follow-up     of follow-up
ASCERTAINABLE THROUGH ROUTINE PRACTICE AND OTHER FOLLOW-UP
Diabetes mellitus (incidence)        5800             11500
Myocardial infarction (incidence)    4100             8250
Stroke (incidence)                   2250             4500
Dementia                             1000             2000
Rheumatoid arthritis (incidence)     850              1700
Parkinson’s disease (incidence)      650              1300
Hip fracture (incidence)             600              1200
ASCERTAINABLE FROM ROUTINE DATA
Lung cancer (incidence)              980              2540
Non-Hodgkin’s lymphoma (incidence)   610              1410
Bladder cancer                       430              1130
Ovarian cancer (incidence)           510              1120
Stomach cancer (incidence)           325              855
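To put the expected case counts above in proportion, they can be divided by the planned cohort size of 500,000 to give a crude cumulative incidence. This is a minimal arithmetic sketch, not the method used to produce the projections: it ignores deaths, loss to follow-up and person-time at risk, and the function name is hypothetical.

```python
COHORT_SIZE = 500_000  # planned number of participants, from the recruitment slide

# condition: (expected cases after 5 years, expected cases after 10 years)
expected_cases = {
    "Diabetes mellitus": (5800, 11500),
    "Myocardial infarction": (4100, 8250),
    "Stroke": (2250, 4500),
    "Dementia": (1000, 2000),
    "Rheumatoid arthritis": (850, 1700),
    "Parkinson's disease": (650, 1300),
    "Hip fracture": (600, 1200),
    "Lung cancer": (980, 2540),
    "Non-Hodgkin's lymphoma": (610, 1410),
    "Bladder cancer": (430, 1130),
    "Ovarian cancer": (510, 1120),
    "Stomach cancer": (325, 855),
}


def crude_cumulative_incidence(cases, cohort_size=COHORT_SIZE):
    """Expected cases as a percentage of the whole cohort (no censoring)."""
    return 100.0 * cases / cohort_size


for condition, (five_year, ten_year) in expected_cases.items():
    pct5 = crude_cumulative_incidence(five_year)
    pct10 = crude_cumulative_incidence(ten_year)
    print(f"{condition:<24} 5 y: {pct5:5.2f}%   10 y: {pct10:5.2f}%")
```

Even the most common outcome listed affects only a small percentage of participants within 10 years, which illustrates why a cohort of this size is needed to accumulate enough cases of rarer conditions.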
Prospective Studies: age range
Suggestions to include younger age groups

                            45-69    35-69    35-44
EXPECTED DEATHS
ischaemic heart disease     4300     3200     130
cerebrovascular disease     1100     830      40
EXPECTED INCIDENCE
breast cancer               6270     5140     640
colorectal cancer           5410     4000     150
prostate cancer             3290     2360     6
myocardial infarction       8250     6270     380*
stroke                      4500     3370     170

*based on rates in participants 35-49
Large Prospective Cohorts with Biological samples
[Map: Canada, USA, Mexico, UK, Europe, China]
What is novel about Biobank?
1. Size of study
2. Breadth of biological resource
3. Consideration of biomarkers and genes
4. Recall of subsets for intensive phenotyping
5. Novel Storage Facility
6. Broad use of NHS records
MOLPAGE
Molecular Phenotyping to Accelerate Genomic Epidemiology
European Consortium
• 10 Universities/Research Institutes
• 2 Pharma
• 6 Biotech/SME
• 4 Year Programme
• European Union FP6
• 12 million Euros
MolPAGE Goals
“The application of genomic, metabonomic and proteomic tools to develop new technical, data analysis and integration protocols that will enable large-scale biomarker typing and discovery projects”
Clinical Focus
The consortium will initially focus on:
(1) Type 2 diabetes – to identify biomarkers indicative of a pre-clinical status and increased risk for developing diabetes
(2) Cardiovascular – to identify biomarkers indicative of an increased risk of developing vascular disease in diabetics
Later, the protocols will be applied to:
(3) Cancer
MolPAGE Programme
The programme is divided into three parts:
• The evaluation of sample collection and storage methodology, to reduce sample variation and maximise analyte stability
• The development of genomic, metabonomic and proteomic tools for molecular phenotyping on an epidemiologic scale from readily obtainable clinical samples
• The development of bioinformatic tools for data warehousing, data interrogation and statistical analysis in large sample sets
Approach
The consortium will adopt two separate approaches:
(1) Genome-wide, systematic methodology (Metabonomics, mass spec based proteomics)
(2) Limited analysis, using sets of candidate biomarkers (Transcript profiling, DNA methylation studies, affinity arrays, protein arrays, tissue arrays)
Work Package Overview
[Diagram: work packages arranged around two platform groups, "Platforms for the analysis of large sets of biomolecules" and "Platforms for the analysis of focused sets of biomolecules". Labels recoverable from the diagram:]
• WP1: Sample collection & standardisation of processing, storage and description
• WP2: Metabonomics
• WP3: Transcript profiling
• WP4: Epigenomics
• WP5: Mass spec proteomics; peptidomics
• WP6, 7: Affinity proteomics
• WP8: Statistical analysis
• WP9: Databasing
• WP10: DNA sequence
• WP11: Training
• WP12: Management
Consortium Partners & Projects (1)
Work Package 1: Sample Collection, Processing & Storage
Partners:
• University of Oxford
• Guys and St Thomas Hospital NHS Trust
• Obesity Research Unit, INSERM
• Charles University, Prague
Projects:
• Provision of blood, serum, plasma, RNA, urine, fat and metabolic tissue samples to all technology development work packages
• Collection of twin samples to measure variability
• Establish optimal collection and storage parameters
Sample Collections

Warren 2 families (W2)
• Description: UK pedigrees segregating T2D and unaffected relatives
• Numbers: ~827 families, ~2600 indivs
• Available: blood, urine, cell lines

Progene / Diabetes in Families (Prog: DIF)
• Description: UK sibships ascertained on basis of single T2D and several unaffected sibs
• Numbers: ~600 families, ~2500 indivs
• Available: blood, urine

HTO (HTO)
• Description: UK sibships asc’d for proband with BP >5th centile
• Numbers: ~290 families, ~1450 indivs
• Available: blood, urine

Oxford Biobank (OXBB)
• Description: Population based sample of UK middle aged individuals
• Numbers: >1000 indivs
• Available: blood, urine, fat

PraToulOb study (PTO)
• Description: Obese individuals taking part in a weight reduction program
• Numbers: ~100
• Available: blood, urine, fat

St Thomas’ UK Adult Twin Registry (TWINS)
• Description: Caucasian twin pairs aged 18-80 (50% MZ)
• Numbers: >4800 twin pairs
• Available: blood, urine, fat

Young Onset (YO)
• Description: Families with early onset T2D
• Numbers: ~1000
• Available: blood, urine

Surgical samples (Sx)
• Description: Individuals undergoing surgery
• Numbers: ~100
• Available: muscle, liver, fat
Twin Studies
• MZ: 40 pairs; MZ female, representative for BMI
• DZ: 20 pairs; DZ female, representative for BMI
• MZD: 10 pairs; MZ same sex, discordant for BMI* (two visits) + extra biopsies
From each twin collect:
• Blood for RNA
• Blood for EBV
• Serum
• Plasma
• Urine
• Adipose
*Repeat for 10 pairs discordant for T2D
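A twin design with both MZ and DZ pairs like the one above is commonly used to split trait variation into genetic and non-genetic parts. The slides do not say which estimator the consortium planned to use; the sketch below uses Falconer's formula, a standard textbook approximation, purely as an illustration. The function name and the correlation values are hypothetical and are not MolPAGE data.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate of broad heritability: h^2 = 2 * (r_MZ - r_DZ).

    r_mz and r_dz are the within-pair trait correlations for monozygotic
    and dizygotic twin pairs respectively.
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    return 2.0 * (r_mz - r_dz)


# HYPOTHETICAL within-pair correlations for some biomarker level.
r_mz_example = 0.8
r_dz_example = 0.5

h2 = falconer_heritability(r_mz_example, r_dz_example)
print(f"Falconer heritability estimate: {h2:.2f}")
```

With only 40 MZ and 20 DZ pairs such an estimate would carry wide uncertainty, so in this setting it is better read as a rough guide to which analytes vary mainly for technical or environmental reasons than as a precise heritability figure.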
Consortium Partners & Projects (2)
Work Package 2: Metabonomics
• Imperial College & Novo Nordisk
• Optimising biofluid nuclear magnetic resonance (NMR) to identify biomarkers
• To produce an NMR biofluid database in man
Work Package 3: Transcript Profiling
• Novo Nordisk
• RNA analysis using Affymetrix array technology to identify diagnostic transcription patterns in PBL, EBV-transformed lymphocytes and fat
Early Biomarker promise…..
• “Separation” in metabonomic space of samples from individuals with normal coronary arteries and triple vessel disease
• Transcript profiling defines breast cancer prognosis (Van de Vijver et al., NEJM 2002)
• Schadt et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302, 2003
Consortium Partners & Projects (3)
Work Package 4: DNA methylation studies
Partners:
• Epigenomics AG, Berlin
• Centre National de Genotypage
Projects:
• To characterise the DNA methylation profile of candidate genes in disease using ESME™ software
• To identify de novo differentially methylated genes in DNA from fat and PBMCs
Consortium Partners & Projects (4)
Work Package 5: Mass Spectrometry Proteomics
Partners:
• Centre National de Genotypage
• Oxford Gene Technologies
• Roche, Basel, Switzerland
• BioVisioN AG, Hannover
Projects:
• Clinical Proteomics (ClinProt)
• Differential peptide display
• 2D gel peptide mass fingerprinting
Consortium Partners & Projects (5)
Work Packages 6 & 7: Affinity Ligand & Affinity Array Proteomics
Partners:
• Royal Institute of Technology, Stockholm
• Affibody AB, Bromma
• Centre National de Genotypage
• University of Uppsala
• Gyros AB, Uppsala
Projects:
• Generation of 200 antibodies to human serum proteins, including known and de novo MolPAGE biomarkers
• Tissue microarrays using control and diseased metabolic tissue
• First generation protein array based on affibodies
• First generation ‘Compact Disc’ for liquid handling of serum and mass spec detection
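Status of chromosome 21 proteomics project (January 25, 2002)
[Figure: status chart with legend categories "Done", "No protein" and "No RNA"; the chart itself did not survive extraction.]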
Consortium Partners & Projects (6)
Work Package 8: Data analysis & statistical methods development
• University of Oxford
• Guys and St. Thomas Hospital Trust
Work Package 9: Sample Database & Data warehousing
• EBI, Cambridge
• IMCS, Riga
Work Package 10: Genetic Analysis
• Centre National de Genotypage
Work Package 11: Training & Mobility
• University of Pavia
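Sample Collections
[Matrix: planned uses of each sample collection; the cell entries did not survive extraction.]
• Samples for (columns): Twins, W2, Obb, Sx, YO, gdm, other (the label "Prog" also appears)
• Uses (rows): Methods development; Heritability, variability etc; Sample processing; Biomarker discovery; eQTL study; Tissue arrays
[Diagram labels: TECHNOLOGY COLLECTIONS; GENOMIC EPIDEMIOLOGY]
• Collections: Progene (& DIF); OxBB (and PTO); Young-onset
• Sample types: DNA; RNA; protein; samples for metabonomics; epigenomics
• Materials: plasma/serum; urine; adipocyte biopsies; surgical samples (liver, fat, muscle)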
....subsequent doubts
Sample Collections

Warren 2 families (W2; Progene)
• Description: UK pedigrees segregating T2D and unaffected relatives
• Numbers: ~827 families, ~2600 indivs
• Available: blood, urine, cell lines
• Added scientific value: Cell lines; extensive phenotypes; genome scan for linkage completed; longitudinal study in unaffected relatives (= Progene, see next item)

Progene / Diabetes in Families (Prog: DIF)
• Description: UK sibships ascertained on basis of single T2D and several unaffected sibs
• Numbers: ~600 families, ~2500 indivs
• Available: blood, urine
• Added scientific value: Extensive intermediate phenotypes; proportion of these currently being resampled in 2005

HTO (HTO)
• Description: UK sibships asc’d for proband with BP >5th centile
• Numbers: ~290 families, ~1450 indivs
• Available: blood, urine
• Added scientific value: Extensively phenotyped for cardiovascular and metabolic traits; genome scan completed

Oxford Biobank (OXBB)
• Description: Population based sample of UK middle aged individuals
• Numbers: >1000 indivs
• Available: blood, urine, fat
• Added scientific value: Extensively phenotyped for metabolic traits; consented to further contact for detailed physiological studies

PraToulOb study (PTO)
• Description: Obese indivs taking part in a weight reduction program
• Numbers: ~100
• Available: blood, urine, fat
• Added scientific value: Fat biopsies available before and after weight reduction program; extensively phenotyped

St Thomas’ UK Adult Twin Registry (TWINS)
• Description: Caucasian twin pairs aged 18-80 (50% MZ)
• Numbers: >4800 twin pairs
• Available: blood, urine, fat
• Added scientific value: Extensive bioclinical data for majority including DNA; genome scan complete on 1500 DZ pairs

Young Onset (YO)
• Description: Families with early onset T2D
• Numbers: ~1000
• Available: blood, urine
• Added scientific value: Individuals with known genetic basis for their diabetes

Surgical samples (Sx)
• Description: Individuals undergoing surgery
• Numbers: ~100
• Available: muscle, liver, fat
• Added scientific value: Large biopsies of muscle, fat, liver possible