
Available online at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/hlpt

Health Policy and Technology (2012) 1, 123–126

UK Biobank: Current status and what it means for epidemiology

Naomi Allen (a,b,*), Cathie Sudlow (a,c), Paul Downey (a), Tim Peakman (a), John Danesh (d), Paul Elliott (e), John Gallacher (f), Jane Green (g), Paul Matthews (h), Jill Pell (i), Tim Sprosen (j), Rory Collins (a,b), on behalf of UK Biobank (1)

(a) UK Biobank, Adswood, Stockport, UK
(b) Clinical Trial Service Unit and Epidemiological Studies Unit, University of Oxford, UK
(c) Division of Clinical Neurosciences, University of Edinburgh, Edinburgh, UK
(d) Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
(e) MRC-HPA Centre for Environment and Health, Imperial College London, UK
(f) Department of Primary Care & Public Health, Neuadd Meirionnydd, Heath Park, Cardiff, UK
(g) Cancer Epidemiology Unit, University of Oxford, UK
(h) Department of Medicine, Division of Brain Sciences, Imperial College London, UK
(i) Institute of Health and Wellbeing, University of Glasgow, UK
(j) School of Public Health, Imperial College London, UK

Available online 3 August 2012

Abstract

UK Biobank is a very large prospective study which aims to provide a resource for the investigation of the genetic, environmental and lifestyle determinants of a wide range of diseases of middle age and later life. Between 2006 and 2010, over 500,000 men and women aged 40 to 69 years were recruited and extensive data on participants’ lifestyles, environment, medical history and physical measures, along with biological samples, were collected. The health of the participants is now being followed long-term, principally through linkage to a wide range of health-related records, with validation and characterisation of health-related outcomes. Further enhancements are also underway to improve phenotype characterisation, including internet-based dietary assessment, biomarker measurements on the baseline blood samples and, in sub-samples of the cohort, physical activity monitoring and proposals for extensive brain and body imaging. UK Biobank is now available for use by all researchers, without exclusive or preferential access, for any health-related research that is in the public interest. The open-access nature of the resource will allow researchers from around the world to conduct research that leads to better strategies for the prevention, diagnosis and treatment of a wide range of life-threatening and disabling conditions.
© 2012 Fellowship of Postgraduate Medicine. Published by Elsevier Ltd. All rights reserved.

2211-8837/$ - see front matter © 2012 Fellowship of Postgraduate Medicine. Published by Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.hlpt.2012.07.003

(*) Corresponding author at: Clinical Trial Service Unit and Epidemiological Studies Unit, University of Oxford, Roosevelt Drive, Oxford, OX3 7LF, UK. Tel.: +44 1865 743805; fax: +44 1865 743985.
E-mail address: [email protected] (N. Allen).
(1) Request for reprints: UK Biobank Coordinating Centre, 1 & 2 Spectrum Way, Adswood, Stockport, Cheshire, SK3 0SA, UK. Tel.: +44 161 475 5360; fax: +44 161 475 5361. mailto: [email protected].


Introduction

Large cohorts with stored biological samples are crucial for understanding the determinants of complex disease. In a prescient move over a decade ago, the Medical Research Council and Wellcome Trust decided to establish UK Biobank, a large population-based prospective cohort with extensive and reliable measurement of a wide range of exposures, along with rigorous follow-up of health outcomes, to allow detailed investigation of the genetic and environmental determinants of a wide range of diseases of middle and old age [1,2]. This article provides an overview of the rationale behind UK Biobank, its current status and future developments of the cohort.

Setting the standard for modern population-based epidemiology

Understanding the determinants of common life-threatening and disabling diseases is challenging. Such conditions are typically caused by a variety of different exposures which may each have modest effects and interact with each other in complex ways [3,4]. In order to investigate a wide range of exposures, extensive information needs to be collected through questionnaires and physical measures, as well as through storing biological samples that allow many different types of assay to be performed (e.g., genetic, proteomic, metabonomic, biochemical).

Prospective cohorts have a number of advantages for assessing the combined effects of lifestyle, environment, genes and other exposures on a variety of health outcomes [4,5]. In particular, exposures can be assessed before they are affected by disease or its treatment (thereby avoiding recall bias and minimising reverse causation bias). In addition, the prospective nature of the study means that a wide range of conditions can be investigated, including those that are difficult, if not impossible, to study retrospectively (e.g., dementia and rapidly fatal conditions). Moreover, the overall beneficial and adverse effects of a specific exposure on the life-time risks of multiple health outcomes can be considered (e.g., associations of obesity with different causes of death [6]). However, because only a small proportion of the participants will develop any one condition and the effects of different exposures on the development of that condition are likely to be modest, prospective studies need to be large, with many tens of thousands of participants [3].
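The link between cohort size and the ability to detect modest effects can be illustrated with a rough calculation. The sketch below is not taken from the article: the baseline risk of 1%, the true rate ratio of 1.2 and the 50% exposure prevalence are hypothetical values chosen for illustration, and the confidence interval uses the standard large-sample approximation for the standard error of a log rate ratio, sqrt(1/a + 1/b), where a and b are the expected numbers of cases in exposed and unexposed participants with equal follow-up.

```python
import math


def rate_ratio_ci(cohort_size, baseline_risk, true_rr, exposed_fraction=0.5, z=1.96):
    """Approximate 95% CI for a rate ratio, based on expected case counts.

    All inputs are illustrative assumptions, not UK Biobank figures.
    """
    exposed = cohort_size * exposed_fraction
    unexposed = cohort_size - exposed
    # Expected numbers of cases in each group over the follow-up period.
    cases_unexposed = unexposed * baseline_risk
    cases_exposed = exposed * baseline_risk * true_rr
    # Large-sample standard error of the log rate ratio.
    se_log_rr = math.sqrt(1.0 / cases_exposed + 1.0 / cases_unexposed)
    lower = true_rr * math.exp(-z * se_log_rr)
    upper = true_rr * math.exp(z * se_log_rr)
    return lower, upper


for n in (5_000, 50_000, 500_000):
    lower, upper = rate_ratio_ci(n, baseline_risk=0.01, true_rr=1.2)
    print(f"n={n:>7,}: rate ratio 1.2, approx. 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions, a cohort of 5000 yields an interval that comfortably includes 1 (no effect), whereas a cohort of 500,000 yields an interval that excludes it, which is the sense in which modest effects on uncommon outcomes call for very large studies.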

While prospective studies are crucial for the reliable identification and quantification of risk factors for disease, they require substantial long-term investment and typically collect either a large amount of data on a small number of participants (e.g. the Framingham Heart Study, with a wide range of physical measures on 5000 participants [7]), or a relatively small amount of data on a large number of participants (e.g. the Million Women Study, with questionnaire data on 1.3 million women [8]). Other prospective studies have focused on the assessment of certain types of exposure on specific outcomes (e.g. diet and cancer in the EPIC study of 500,000 people in Europe [9]). Because UK Biobank collected extensive baseline questionnaire data, physical measures, and biological samples on 0.5 million participants, who are now being exhaustively followed up, it is a rich resource for investigating why some people develop particular diseases while others do not.

Recruitment and follow-up of participants

UK Biobank aimed to be as inclusive as reasonably possible, with all people aged 40–69 years who were registered with the National Health Service and living up to about 25 miles from one of the 22 study assessment centres invited to participate. Overall, about 9.2 million invitations were mailed in order to recruit 503,325 participants (i.e. a response rate of 5.47%). Regardless of participation rates, as long as there are sufficiently large numbers of participants with different levels of the relevant risk factors under investigation, generalisable associations between baseline characteristics and subsequent health outcomes can be made. Successful recruitment of 0.5 million participants during 2006–2010 was largely achieved through extensive piloting and establishment of highly efficient, centralised and bespoke processes, with assistance from UK Biobank’s extensive academic collaborative network.
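The quoted response rate follows directly from the two recruitment figures. As a quick check, assuming the rounded figure of 9.2 million invitations is used as the denominator (the function name is only illustrative):

```python
def response_rate(recruited, invited):
    """Proportion of invited people who were recruited."""
    return recruited / invited


rate = response_rate(recruited=503_325, invited=9_200_000)
print(f"Response rate: {rate:.2%}")  # prints "Response rate: 5.47%"
print(f"Invitations mailed per participant recruited: {1 / rate:.1f}")
```

Because the number of invitations is stated only approximately, the second decimal place of the rate should be read as indicative.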

Volunteers who attended an assessment centre gave informed consent and completed questionnaires on their lifestyle, environment and medical history, had a wide range of physical measures performed and had samples of blood, urine and saliva collected [10] (Table 1). Some lifestyle factors, such as diet and physical activity, are notoriously difficult to measure reliably using questionnaires. For this reason, UK Biobank aims to conduct more detailed phenotyping in sub-samples of the cohort in order to calibrate the baseline measures. For example, a series of web-based 24-h dietary questionnaires will supplement the dietary data collected at the assessment clinic [11] and mailed tri-axial accelerometers will supplement the questionnaire data on physical activity (and provide more reliable assessment of other aspects of normal daily living, such as sleep patterns). Other enhancements being planned include baseline blood measurements on the entire cohort of biomarkers known to be relevant for disease (e.g., lipids for cardiovascular disease), of high diagnostic value (e.g., HbA1c for diabetes) or that characterise phenotypes not otherwise well assessed (e.g. liver and kidney function measures). UK Biobank has also submitted a proposal for funding to perform a range of imaging modalities (e.g. magnetic resonance imaging of brain and body, carotid ultrasound and dual-energy X-ray absorptiometry body scans) in up to 100,000 of the participants. In addition, since measurement error in risk factor levels (due to short-term biological variability or to longer-term within-person fluctuations) may lead to substantial underestimation of the true aetiological associations that exist (i.e. regression dilution bias [12]), a repeat of the baseline assessment visit will be conducted every few years in subsets of 20,000–25,000 participants.
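Regression dilution, and the reason a repeat assessment in a subset helps, can be shown with a small simulation. This is a minimal sketch under stated assumptions rather than UK Biobank's analysis method: the data are simulated, the true slope (0.5), the error variance and the resurvey size are hypothetical, and the correction shown (dividing the observed slope by the regression dilution ratio, estimated as the slope of the repeat measurement on the baseline measurement) is one commonly used approach.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20_000, resurvey_n=2_000, beta=0.5, error_sd=1.0, seed=1):
    """Simulate a noisy baseline measure and a resurvey in a subset."""
    rng = random.Random(seed)
    # Long-term "usual" level of the risk factor (unobserved in practice).
    usual = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Outcome depends on the usual level with true slope beta.
    outcome = [beta * u + rng.gauss(0.0, 1.0) for u in usual]
    # A single baseline measurement includes within-person error.
    baseline = [u + rng.gauss(0.0, error_sd) for u in usual]
    # Repeat measurement, with independent error, in the first resurvey_n people.
    repeat = [u + rng.gauss(0.0, error_sd) for u in usual[:resurvey_n]]

    true_slope = slope(usual, outcome)
    observed_slope = slope(baseline, outcome)  # attenuated towards zero
    dilution_ratio = slope(baseline[:resurvey_n], repeat)
    corrected_slope = observed_slope / dilution_ratio
    return true_slope, observed_slope, dilution_ratio, corrected_slope


true_slope, observed_slope, dilution_ratio, corrected_slope = simulate()
print(f"slope on usual level:        {true_slope:.2f}")
print(f"slope on baseline measure:   {observed_slope:.2f}")
print(f"regression dilution ratio:   {dilution_ratio:.2f}")
print(f"corrected slope:             {corrected_slope:.2f}")
```

With equal variances for the usual level and the measurement error, the slope on the single baseline measure is expected to be about half the slope on the usual level, and the correction recovers a value close to the latter.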

The value of UK Biobank depends not only on its ability to obtain detailed baseline data and biological samples but also on achieving detailed follow-up of the health of participants, which is made possible through linkage to routine data available from the UK National Health Service (e.g. mortality, cancer registrations, hospital admissions, primary care data, etc.). Information will also be sought directly from participants about conditions that are typically under-reported (e.g., cognitive decline, depression). Many cohort studies have not been in a position to well-characterise the wide range of health outcomes identified during follow-up, leading to a loss of statistical power caused by misclassification of cases and/or the grouping together of disparate subtypes. UK Biobank is therefore undertaking substantial efforts to ascertain, confirm and characterise the most common health outcomes, both for prevalent and incident disease. This will involve cross-referencing diagnoses via multiple sources of information, starting with the use of lower-cost electronic sources and followed by the use of more resource-intensive methods for their confirmation and further sub-phenotyping, thereby enabling researchers to focus on particular sub-types of disease.

Table 1 Data collected at baseline.

Type of measure | Topic area | Details
Questionnaire (touch-screen and verbal interview) | Socio-demographics | Employment status, marital status, education, income, car ownership, ethnicity
Questionnaire (touch-screen and verbal interview) | Family history and early life exposures | Family history of major diseases, birth place, birth weight, breastfeeding, maternal smoking, childhood body size
Questionnaire (touch-screen and verbal interview) | Psychosocial factors | Mental health, social support
Questionnaire (touch-screen and verbal interview) | Environmental factors | Current address, occupation, housing, domestic heating and cooking fuel, means of travel, shift work, mobile phone use
Questionnaire (touch-screen and verbal interview) | Lifestyle | Smoking, alcohol consumption, physical activity, diet, sleep
Questionnaire (touch-screen and verbal interview) | Health status | Medical history, medications, operations, hearing, sight, sexual and reproductive history
Questionnaire (touch-screen and verbal interview) | Hearing threshold | Hearing test
Questionnaire (touch-screen and verbal interview) | Cognitive function | Tests for episodic and numeric memory, reaction time, fluid intelligence, prospective memory
Physical measures | Blood pressure, heart rate | Two automated measures one minute apart
Physical measures | Hand grip | Left and right hand grip strength
Physical measures | Anthropometry | Height, weight and bioimpedance, hip/waist circumference
Physical measures | Spirometry | Lung function tests
Physical measures | Bone density | Left and right heel calcaneal ultrasound
Physical measures | Arterial stiffness (a) | Pulse wave velocity
Physical measures | Fitness test (a) | Cycle ergometry with ECG monitoring
Physical measures | Eye examination (a) | Refractive index, intraocular pressure, visual acuity, optical coherence tomography
Biological samples | Blood | Plasma, serum, buffy coat, red cells, DMSO blood, RNA
Biological samples | Urine |
Biological samples | Saliva (a) |

(a) Measures introduced towards the end of recruitment and available for 70,000 to 120,000 participants.

Opportunities for the future

Opportunities now exist for research based on prevalent disease (e.g., there are 24,000 participants with self-reported diabetes and 11,000 with breast cancer) and other information recorded at baseline. Over the next few years, large-scale research will be possible on incident cases of some of the more common conditions (e.g. diabetes mellitus, coronary heart disease, chronic obstructive pulmonary disease and breast cancer). Beyond the fifteenth year of follow-up (i.e. after 2020), UK Biobank will become sufficiently mature to allow reliable investigation of an increasingly wide range of conditions (Table 2).

Access to the resource

The UK Biobank resource launched in April 2012 and is now available for use by researchers, without exclusive or preferential access, for any health-related research that is in the public interest (http://www.ukbiobank.ac.uk). In order to encourage extensive use of the resource for health research, all bona fide researchers can apply, including those from the academic, charity, public and commercial sectors, both in the UK and internationally. The online application process enables researchers to select data fields specific to their research proposal and is linked to automated systems for sample retrieval. Robust safeguards are in place to help ensure anonymity and confidentiality of participants’ data and samples [13]. UK Biobank is a registered charitable company and, as such, researchers are only required to pay for access to the resource on a cost-recovery basis for their proposed research, the results of which will be incorporated back into the resource so that others can benefit from their findings. The involvement of UK Biobank in international initiatives aimed at data harmonisation across studies (such as DataSHaPER [14] and BBMRI [15]) will also help to improve accessibility and collaboration with other research studies.

Conclusion

UK Biobank has shown that it is possible to establish a large population-based prospective study with a high quality of data collection, both of participants’ baseline characteristics and their subsequent health outcomes. This has been made possible with an emphasis on highly-efficient and centralised processes with close collaboration with the academic community. The open-access nature of the resource will allow researchers from around the world to conduct research that leads to better strategies for the prevention, diagnosis and treatment of a wide range of life-threatening and disabling conditions.

Table 2 Estimated numbers of incident cases of various disease outcomes during follow-up in UK Biobank (a).

Condition | Incident cases by 2012 | By 2017 | By 2022 | By 2027
Diabetes | 10,000 | 24,000 | 40,000 | 68,000
Myocardial infarction and coronary death | 7000 | 17,000 | 28,000 | 47,000
Stroke | 2000 | 5000 | 9000 | 20,000
COPD | 3000 | 8000 | 14,000 | 25,000
Breast cancer | 3000 | 6000 | 10,000 | 16,000
Colorectal cancer | 1000 | 4000 | 7000 | 14,000
Prostate cancer | 1000 | 4000 | 7000 | 14,000
Lung cancer | 1000 | 2000 | 4000 | 8000
Hip fracture | 1000 | 3000 | 6000 | 17,000
Rheumatoid arthritis | 1000 | 2000 | 3000 | 5000
Alzheimer’s disease | 1000 | 3000 | 9000 | 30,000
Parkinson’s disease | 1000 | 3000 | 6000 | 14,000

(a) Based on UK age- and sex-specific rates with adjustment for potential ‘healthy-cohort effects’ and losses to follow-up [2].
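The footnote to Table 2 describes the projections as based on age- and sex-specific rates, adjusted for healthy-cohort effects and losses to follow-up. The exact method is given in the cited protocol [2], not here, so the following is only a simplified sketch of how a projection of that general kind could be set up. The strata, rates, healthy-cohort factor and loss rate are all hypothetical, and the calculation ignores ageing of the cohort and the removal of people from the at-risk population once they become cases or die.

```python
def expected_cases(strata, years, healthy_cohort_factor=1.0, annual_loss=0.0):
    """Expected incident cases summed over strata.

    strata: iterable of (participants, annual_rate) pairs, where annual_rate
            is the population incidence per person per year (hypothetical).
    healthy_cohort_factor: multiplier below 1 if volunteers are healthier
            than the general population.
    annual_loss: fraction of participants lost to follow-up each year.
    """
    # Person-years contributed per participant, allowing for losses.
    person_years_each = sum((1.0 - annual_loss) ** t for t in range(years))
    total = 0.0
    for participants, annual_rate in strata:
        total += participants * person_years_each * annual_rate * healthy_cohort_factor
    return total


# Hypothetical strata: (number of participants, annual incidence rate).
example_strata = [
    (100_000, 0.002),  # e.g. younger women
    (100_000, 0.004),  # e.g. younger men
    (150_000, 0.006),  # e.g. older women
    (150_000, 0.009),  # e.g. older men
]
cases = expected_cases(example_strata, years=10,
                       healthy_cohort_factor=0.8, annual_loss=0.005)
print(f"Expected cases over 10 years: {cases:,.0f}")
```

The point of the sketch is the structure of the calculation (stratum size × person-time × rate × adjustment), not the numbers, which do not correspond to any row of Table 2.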

Acknowledgements

UK Biobank is funded by the Medical Research Council, Wellcome Trust, Department of Health, British Heart Foundation, Northwest Regional Development Agency, Scottish Government, and Welsh Assembly Government. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

References

[1] Palmer LJ. UK Biobank: bank on it. Lancet 2007;369:1980–2.

[2] UK Biobank. UK Biobank: rationale, design and development of a large-scale prospective resource, http://www.ukbiobank.ac.uk/resources/.

[3] Burton PR, Hansell AL, Fortier I, et al. Size matters: just how big is BIG?: Quantifying realistic sample size requirements for human genome epidemiology. Int J Epidemiol 2009;38:263–73.

[4] Manolio TA, Bailey-Wilson JE, Collins FS. Genes, environment and the value of prospective cohort studies. Nat Rev Genet 2006;7:812–20.

[5] Grimes DA, Schulz KF. Cohort studies: marching towards outcomes. Lancet 2002;359:341–5.

[6] Whitlock G, Lewington S, Sherliker P, et al. Body-mass index and cause-specific mortality in 900,000 adults: collaborative analyses of 57 prospective studies. Lancet 2009;373:1083–96.

[7] Higgins MW. The Framingham Heart Study: review of epidemiological design and data, limitations and prospects. Prog Clin Biol Res 1984;147:51–64.

[8] The Million Women Study Collaborative Group. The Million Women Study: design and characteristics of the study population. Breast Cancer Res 1999;1:73–80.

[9] Riboli E, Kaaks R. The EPIC Project: rationale and study design. European Prospective Investigation into Cancer and Nutrition. Int J Epidemiol 1997;26(Suppl. 1):S6–14.

[10] Elliott P, Peakman TC. The UK Biobank sample handling and storage protocol for the collection, processing and archiving of human blood and urine. Int J Epidemiol 2008;37:234–44.

[11] Liu B, Young H, Crowe FL, et al. Development and evaluation of the Oxford WebQ, a low-cost, web-based method for assessment of previous 24 h dietary intakes in large-scale prospective studies. Public Health Nutr 2011;14:1998–2005.

[12] Clarke R, Shipley M, Lewington S, et al. Underestimation of risk associations due to regression dilution in long-term follow-up of prospective studies. Am J Epidemiol 1999;150:341–53.

[13] UK Biobank. Access Procedures: application and review procedures for access to the UK Biobank Resource, http://www.ukbiobank.ac.uk/resources/.

[14] Fortier I, Burton PR, Robson PJ, et al. Quality, quantity and harmony: the DataSHaPER approach to integrating data across bioclinical studies. Int J Epidemiol 2010;39:1383–93.

[15] BBMRI. Biobanking and Biomolecular Resources Research Infrastructure, http://www.bbmri.eu/.