Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics Christoph Steinbeck European Bioinformatics Institute (EMBL-EBI)

Upload: christoph-steinbeck

Post on 23-Jan-2018

TRANSCRIPT

Page 1: Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics

Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics

Christoph Steinbeck

European Bioinformatics Institute (EMBL-EBI)

Page 2

The European Bioinformatics Institute (EBI)

Page 6

The European Molecular Biology Laboratory (EMBL)

A basic research institute funded by public research monies from 20 member states.

Page 7

European Bioinformatics Institute (EBI)

Page 8

European Bioinformatics Institute (EBI)

• Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, European Genome-phenome Archive, Metagenomics portal

• Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology

• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank

• Gene, protein & metabolite expression

• Protein sequences, families & motifs

• Chemical biology

• Reactions, interactions & pathways

• Systems

Page 11

Nutrition

Exercise

Disease

Age

Drugs

Environment

Phenome/Exposome

Page 12

The Metabolome is the most accessible and dynamically changing Molecular Phenotype

Page 13

Organism Parts

Page 14

Metabolomics uses a wide range of analytical techniques

• Nuclear Magnetic Resonance (NMR)

• Mass Spectrometry (MS)

Page 15

What do the EBI databases do?

Labs around the world send us their data and we…

• Archive it

• Classify it

• Share it with other data providers

• Analyse it

• …provide tools to help researchers use it

A collaborative enterprise

Page 16

MetaboLights

http://www.ebi.ac.uk/metabolights

Open-access, cross-species, cross-application, long-term supported

Salek, R.M., Haug, K. and Steinbeck, C. (2013) Dissemination of metabolomics results: role of MetaboLights and COSMOS. GigaScience, 2:8.

Page 17

MetaboLights Database

Experimental Repository

Reference Layer

Chemistry, Spectroscopy, Biology

Analysis Tools

Primary Literature

Primary data and Meta-Data, Spectra, Protocols, Synopses, ...

Page 19

www.ebi.ac.uk/metabolights (metabolights.org, metabolights.eu)

Page 22

Data growth in EBI data repositories

3-month doubling time for Metabolomics

MetaboLights is now the recommended repository for the Nature journals, EMBO Journal, PLOS journals, Metabolomics Journal and others
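To make the 3-month doubling time more concrete, the following Python sketch computes the growth it would imply over a given period. It assumes idealised exponential growth at a constant doubling time, which the slide itself does not claim, and the function names are hypothetical.

```python
import math

def growth_factor(months, doubling_time_months=3.0):
    """Multiplicative growth after `months`, assuming a constant doubling time."""
    return 2.0 ** (months / doubling_time_months)

def months_to_reach(factor, doubling_time_months=3.0):
    """Months needed to grow by `factor` under the same assumption."""
    return doubling_time_months * math.log2(factor)

print(growth_factor(12))      # growth over one year
print(months_to_reach(100))   # time needed for a 100-fold increase
```

Under this assumption four doublings fit into one year, so the data volume would grow 16-fold in 12 months.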

Page 23

MetaboLights Stats May 2016

Page 24

Global Standards and Data Exchange in Metabolomics

Page 25

COSMOS: COordination of Standards in MetabolOmicS

European FP7 coordination action coordinated by us at EMBL-EBI, Hinxton, Cambridge

• Create missing standards & formats

• Define workflows for dissemination

• Create world-wide data network

Page 26

MetabolomeXchange 2014

• Global network for exchange and discoverability of metabolomics data

• Includes study as well as reference data

Page 29

The MetaboLights Reference Layer

Page 36

• 8.7 million eukaryotic species on earth (± 1.3 million)

• 1.2 million species identified and classified

• 3000 - 4000 complete species genomes sequenced

What about completed metabolomes?
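The gap between these three figures can be expressed as simple percentages. The short Python sketch below uses only the numbers from the slide; the variable and function names are hypothetical, and the uncertainty of ± 1.3 million on the species estimate is ignored for simplicity.

```python
EUKARYOTIC_SPECIES = 8.7e6         # estimate from the slide (± 1.3 million)
CLASSIFIED_SPECIES = 1.2e6         # identified and classified
SEQUENCED_GENOMES = (3000, 4000)   # lower and upper figure from the slide

def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole

classified_pct = share(CLASSIFIED_SPECIES, EUKARYOTIC_SPECIES)
genome_pct = tuple(share(n, EUKARYOTIC_SPECIES) for n in SEQUENCED_GENOMES)

print(f"classified: {classified_pct:.1f} %")
print(f"sequenced:  {genome_pct[0]:.3f} % to {genome_pct[1]:.3f} %")
```

On these numbers roughly one in seven estimated species has been classified, and well under 0.1 % have a complete genome sequence.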

Page 37

Species Metabolomes are being assembled on the fly right now through data sharing in Metabolomics

Page 38

Repository Entry

Page 40

Reference Layer

Page 42

7 most annotated metabolomes in MetaboLights

Page 44

Current and Future Work

Page 45

• 500 million people in the European Union

• Full genomes (soon for less than $1000 per person)

• Urine/blood metabolome < 20 Euros per patient

Page 46

Phenome Centres founded all over the world

• London

• Birmingham

• Shanghai

• NIH RCMRCs

• …

Page 48

> 100,000 patient samples / year

> Several petabytes / year

=> Exabytes of human data at moderate scale-up
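The step from petabytes to exabytes can be checked with a back-of-the-envelope calculation. The Python sketch below is illustrative only: the slide says "several" petabytes, so the value of 3 PB per year is an assumption, decimal units are assumed, and the function names are hypothetical.

```python
PB = 10**15   # bytes in a petabyte (decimal units, an assumption)
EB = 10**18   # bytes in an exabyte

SAMPLES_PER_YEAR = 100_000
ASSUMED_VOLUME_PER_YEAR = 3 * PB   # "several PB" on the slide; 3 is illustrative

def bytes_per_sample(total_bytes, samples):
    """Average raw data volume per patient sample."""
    return total_bytes / samples

def samples_for_volume(target_bytes, per_sample_bytes):
    """Number of samples that would add up to `target_bytes`."""
    return target_bytes / per_sample_bytes

per_sample = bytes_per_sample(ASSUMED_VOLUME_PER_YEAR, SAMPLES_PER_YEAR)
print(f"{per_sample / 10**9:.0f} GB per sample")
print(f"{samples_for_volume(EB, per_sample):,.0f} samples for one exabyte")
```

With these assumed values, one exabyte corresponds to a few tens of millions of samples, which is far fewer than the 500 million people mentioned for the European Union earlier in the talk.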

Page 49

Large Scale Computing with Medical Metabolomics Data

• EBI lead

• H2020

• 3 years

• 13 partners

• 8 million €

• 830 person-months

• Kick-off 9/15

• H2020 e-infra

Page 55

Networking Activities - Ecosystem

ELIXIR cloud activities

BioMedBridges

CORBEL

BBMRI

PhenoMeNal

European Open Science Cloud

Indigo Data Cloud

Phenomics User Community

EGI, GCE, EC2, OpenStack

Industry-grade orchestration

Page 56

Networking Activities - EOSC

As part of EOSC and GO FAIR, PhenoMeNal is positioning itself as a hub for verifying FAIR metabolomics data

Page 57

The Next 5 Years

• Standardised dissemination and analysis of big data in Metabolomics

• Cloud-based workflows for Phenomics

• Assembly of model species metabolomes

• Literature-mining

• Comprehensive structure elucidation of unknown metabolites

Page 58

The Next 5 Years for MetaboLights

• Maintenance and improvement

• Advanced metadata-based data analysis and visualisation

• Slice and Dice

• Improved reference layer

• Web services access

• MetaboLights Cloudified Version

• Online creation of MetaboLights ISA-Tab studies

• Standardisation, Training and Outreach

Page 59

Funding and Collaborators

UK Research Councils (BBSRC, MRC)

European Commission

Page 61

Slides on http://www.slideshare.net/csteinbeck

Page 62

[email protected]

Thank you!