Post on 05-Aug-2020
Infrastructures for research and innovation
Professor Ewan Birney FRS
Director, EMBL-EBI
www.ebi.ac.uk
Outline of talk
• Who am I, what is EMBL?
• The change in genomics
• The needs for stratified patients in clinical care and drug discovery
• Europe’s assets
• A path to releasing Europe’s strengths
The European Molecular Biology Laboratory
• Heidelberg, Germany: Main Laboratory
• Barcelona, Spain: Tissue Biology, Disease Modeling
• Hinxton, Cambridge, UK: Bioinformatics
• Rome, Italy: Neuroscience
• Grenoble, France: Structural Biology
• Hamburg, Germany: Structural Biology
6 sites in Europe
80+ nationalities
>1600 personnel
Ewan Birney
• Led the original team that analysed the human genome (gene sets)
• Algorithm research in genomic information
• Set up many key databases in genomics (eg, Ensembl)
• Director of EMBL-EBI
• Non-executive director for Genomics England (NHS clinical genomics)
• Formal advice to UK, Finnish, Danish, US governments; informal to other governments
• Advisor to both large (GSK) and small (Oxford Nanopore) companies
• Chair of the Global Alliance for Genomics and Health (GA4GH)
We have been living through a revolution.
One genome 2003 to 2018
The cost of sequencing a genome in 2018
The cost of sequencing a genome in 2003
Imaging: new technologies change the game
• EM tomography, atomic-scale models from EM
• Super-resolution light microscopy
• High-resolution MRI and CT
• Light sheet microscopy
Genomics: from research to healthcare
Research
• English language
• Light-weight legal
• Similar systems
• Open data
• Publications
• Grant funding
Practicing Medicine
• National language
• Heavy legal framework
• Different systems
• Closed data
• Not published
• Contract funding
Big numbers!
Stratification of Patients
Stratification
Class A
Class B
Class C
Stratification
Benefits of stratification
• In clinical practice
  • Better diagnosis and prognosis
  • Better use of (expensive) medicines (“personalised” medicine)
  • Specific care pathways optimised for the cases
• In drug discovery
  • More clarity on the therapeutic goals in early development
  • Cheaper and more likely to succeed Phase II and Phase III trials
4 Pillars of stratification
• Very large virtual cohorts, ideally with population scale ascertainment
• At scale genomic assays
• Harmonised representation of key aspects of EHRs
• Clear legal basis to access appropriate data and approach patients
Europe’s Assets
Well regulated, often state run healthcare
• Total population size of >200 million
• The largest coherent EHR records in the world (Denmark, 6 million Danish citizens)
• Sweden, Norway, Finland all have good record keeping
• Large, predominantly state run systems in France and UK
• Historical as well as future health data
The most advanced clinical + population genomics programs globally
• Finland – >10% of the population sequenced in 5 years
• Estonia – aiming for all 1 million biobanked
• Denmark – 5 million EHRs, 100,000 sequenced
• UK – Goal of 5 million with genomic assays within 5 years
• France – Clinical + Population scale assays for ~1 million within 5 years
• Spain – Variety of regional programs with scale to millions
A European Framework: MEGA
Genomic Infrastructure
• EMBL-EBI
  • World leader in genome information and analysis
  • The most comprehensive life science datasets globally
• ELIXIR
  • European wide network with National nodes to connect local research and healthcare
ELIXIR Node Map
Associated Institutes
• ELIXIR-BE: Katholieke Universiteit Leuven
• ELIXIR-BE: University of Antwerp
• ELIXIR-BE: University of Liège
• ELIXIR-BE: Vrije Universiteit Brussel
• ELIXIR-BE: Universiteit Hasselt
• ELIXIR-BE: Interuniversity Institute of Bioinformatics Brussels
• ELIXIR-CZ: Masaryk University (CEITEC)
• ELIXIR-CZ: Masaryk University (CERIT-SC)
• ELIXIR-CZ: Institute of Chemical Technology
• ELIXIR-CZ: Institute of Experimental Botany AS CR, v. v. i.
• ELIXIR-CZ: Institute of Molecular Genetics of the ASCR
• ELIXIR-CZ: Institute of Microbiology ASCR
• ELIXIR-CZ: Cesnet
• ELIXIR-CZ: University of South Bohemia
The need for infrastructure
• Clinical Record + Diagnosis
• National Genome Database
• Reference Infrastructure
A vibrant commercial research sector
• Many European large scale pharmaceutical companies
  • Sanofi, GSK, Roche, AstraZeneca, Novartis
  • Balance of US vs European research intensity
• Vibrant SME community
  • Based around clusters – Heidelberg-Stuttgart-Munich-Basel, Paris-Brussels-Amsterdam, Oxford-London-Cambridge, Barcelona, Stockholm-Helsinki
• Public-private partnerships
  • IMI
  • OpenTargets @EMBL-EBI
A path for European stratified populations
Alignment of European programs
• Million genomes declaration
• EMBL-EBI and ELIXIR (ESFRI) as genomic infrastructure
• IMI programs as an instrument to foster cross-institutional, trans-national, public-private partnerships
Engagement with Nation state Health strategy
• Practical “on the ground” implementation is in the hands of the operations and regulation of the healthcare systems in Europe
  • Source of EHR information
  • Source of genomic information
• Fundamental need to have >100 million person cohorts will drive trans-national work
  • Clear for smaller countries that between country federation is needed
  • Clear for rare disease in all countries; will become relevant to more diseases
Engagement with global structures
• Europe has to tackle trans-national coordination far earlier than the US or Chinese systems
  • Similar opportunity as mobile phone GSM standards – the need for ultimately trans-national access places Europe as the leader in how to solve this
  • Legal and ethical components (GDPR)
  • Technical components
• Leadership in global bodies, such as GA4GH (Global Alliance for Genomics and Health)
EMBL-EBI
Follow me on twitter @ewanbirney
Thank you!
1/11/2019
Our mission
Deliver excellent research
Train the next generation of scientists
Engage with European industry
Coordinate bioinformatics in Europe
Deliver scientific services
Life science: many data types
Genes, genomes & variation
Gene, protein & metabolite expression
Protein sequences, families & motifs
Macromolecular structures
Interactions, reactions & pathways
Chemogenomics & metabolomics
Phenotypes
Data resources at EMBL-EBI
Literature & ontologies
• Experimental Factor Ontology
• Gene Ontology
• BioStudies
• Europe PMC
Chemical biology
• ChEBI
• ChEMBL
• SureChEMBL
Molecular structures
• Protein Data Bank in Europe
• Electron Microscopy Data Bank
Gene, protein & metabolite expression
• Expression Atlas
• MetaboLights
• PRIDE
• RNAcentral
Protein sequences, families & motifs
• InterPro
• Pfam
• UniProt
Genes, genomes & variation
• Ensembl
• Ensembl Genomes
• GWAS Catalog
• Metagenomics portal
Systems
• BioModels
• BioSamples
• Enzyme Portal
• IntAct
• Reactome
Molecular Archives
• European Nucleotide Archive
• European Variation Archive
• European Genome-phenome Archive
• ArrayExpress
~410 people
Worldwide collaborations
See the live map at www.ebi.ac.uk/about/our-impact
Global reference data
Big data, big demand
~27 million requests to EMBL-EBI websites every day
Sustainable Funding
Over 40 different funding agencies worldwide
Forward commitment of over £100 million
EMBL-EBI delivered US$1-5 billion in efficiency savings worldwide
Scientists at over 3.2 million unique IP addresses use EMBL-EBI websites
Nick Goldman
Oliver Stegle
John Marioni
Janet Thornton
Zamin Iqbal
Evangelia Petsalaki
Virginie Uhlmann
Daniel Zerbino
Paul Flicek
Moritz Gerstung
Rob Finn
Alvis Brazma
Pedro Beltrao
Alex Bateman
Ewan Birney
Andrew Leach
Research groups at EMBL-EBI
Research data at EMBL-EBI
Proteomic & RNA comparison
Evolution of phosphorylation sites
Mutations affecting proteins implicated in rare diseases
Modeling unwanted variation in single-cell transcriptome studies
Genomics of infectious disease
Single Cell Genomics
Translational bioinformatics
EMBL Research Community
• Research group picture
~170 people
~50 visitors / year
Medical Genomics
Serious efforts under way
• Genomics England
  • 100,000 Genomes by end of 2019 (35,000 done now)
  • Long term 60K-100K from “routine healthcare” across NHS
• Plan France Génomique
  • ~100,000 genomes / year by 2025, first sites selected
• Iceland
  • 40% of the population genotyped/sequenced + imputed
• Switzerland
  • SPRT program to promote genomic medicine
• Finland
  • at least ~10% (0.5 million) of the population with sequence data by 2020
• US – Complex payer/insurance led market
  • Mixture of HMO (Geisinger) and NIH (All of Us – mainly a cohort)
Genomics: from research to healthcare
Research
• English language
• Light-weight legal
• Similar systems
• Open data
• Publications
• Grant funding
Practicing Medicine
• National language
• Heavy legal framework
• Different systems
• Closed data
• Not published
• Contract funding
Bridges need at least two anchors
Long-term goals
• Ideal: “Institute for Biomedical Informatics” in each country
• Large nations/populations: Distributed network with a clear centre of gravity
• EMBL-EBI & ELIXIR handle research data: reference collections and sharing amongst researchers (including clinical)
• Institute for Biomedical Informatics:
  • Responsible for exploiting molecular reference data
  • Provides the national link and point of reference (eg, around legislation)
  • Broker for research data (back to EMBL-EBI, NCBI & ELIXIR)
France EMBL-EBI
Basic Research
• Working collaboratively with Elixir-France
  • Orphanet, CAZy
• Support training in bioinformatics
• Ensuring French scientists and institutes exploit EMBL-EBI
  • Seamless APIs to allow submission of data driven by institutes (less complexity for user/scientist, use EMBL-EBI as backup)
    • GDR Mediatec for Chemical Ecology -> MetaboLights
    • Genoscope DNA data -> ENA
• Research work with French research scientists
  • Institut Pasteur, Institut Curie links
• French Embassy internships at EMBL-EBI
Applied Research : Medicine
• Ensuring transfer of skills and expertise to the French medical system
  • France’s medical genomics must be run and delivered in France (obviously!)
  • Technical aspects, eg, Archiving DNA data at scale nationally
• Reference human biology resource
  • Orphanet
  • Infectious epidemiology/bacterial genome sequencing?
• Working with Elixir-France and others for international standards
  • ELIXIR’s role in GA4GH standards
Big numbers!
Global standards: the GA4GH
• GA4GH is THE standards-setting body for genomics and healthcare
  • Embraces federated approach
  • Setting community standards early
• Cloud: Analysis carried out where the data ‘lives’
• “You’re already using it!”: SAM/BAM/CRAM/VCF formats
• Tools: htsget – the first step away from file-based access
• Rare disease diagnoses: Matchmaker Exchange
• Federated discovery: GA4GH Beacons
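The idea behind federated discovery can be illustrated with a toy sketch. This is not the GA4GH Beacon API: the `ToyBeacon` class, the `query_network` helper, and the site names and variants are all hypothetical, chosen only to show the principle that each site answers a yes/no question about an allele and never releases the underlying records.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# A variant is identified here by (chromosome, position, reference base, alternate base).
Variant = Tuple[str, int, str, str]


@dataclass
class ToyBeacon:
    """Hypothetical stand-in for one site's discovery service (not the real Beacon API)."""
    name: str
    variants: Set[Variant] = field(default_factory=set)

    def has_allele(self, variant: Variant) -> bool:
        # Only a yes/no answer is returned; the records themselves stay on site.
        return variant in self.variants


def query_network(beacons: List[ToyBeacon], variant: Variant) -> Dict[str, bool]:
    """Ask every site the same question and collect the yes/no answers by site name."""
    return {beacon.name: beacon.has_allele(variant) for beacon in beacons}
```

A researcher looking for other carriers of a rare variant would call `query_network` once and then approach only the sites that answered yes, through whatever access process those sites require.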
Federation
Open research data
• Aggregate data globally
• Download, analyse locally
Healthcare data with research use
• Analyse data locally (via VMs)
• Collate analyses
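The "analyse locally, collate analyses" pattern for healthcare data can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any existing system: the function names `analyse_locally` and `collate` are hypothetical, and genotypes are assumed to be coded as 0, 1 or 2 copies of the alternate allele per person. Each site reduces its own records to aggregate counts, and only those counts are shared.

```python
from collections import Counter
from typing import Iterable, List


def analyse_locally(genotypes: Iterable[int]) -> Counter:
    """Run inside one site: reduce per-person genotypes (0, 1 or 2 copies of
    the alternate allele) to aggregate counts. Only these counts leave the site."""
    counts: Counter = Counter()
    for alt_copies in genotypes:
        counts["alleles"] += 2  # two chromosome copies per person
        counts["alt_alleles"] += alt_copies
    return counts


def collate(site_results: List[Counter]) -> float:
    """Run by the coordinator: sum the per-site aggregates and return the
    pooled alternate-allele frequency."""
    total = sum(site_results, Counter())
    if total["alleles"] == 0:
        raise ValueError("no data from any site")
    return total["alt_alleles"] / total["alleles"]
```

For example, one site holding genotypes `[0, 1, 2]` and another holding `[0, 0]` would each call `analyse_locally`, and the coordinator would pass the two results to `collate` to obtain the pooled frequency without seeing any individual genotype.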