Workflow4Metabolomics: Infrastructure for metabolomics data analysis. BiLille, October 2019. Jean-François Martin and the Core team.

Page 1

Workflow4Metabolomics:

Infrastructure for metabolomics data analysis

BiLille, October 2019

Jean-François Martin and the Core team

Page 2

Outline

• METABOLOMICS
– Principle
– Analytical tools
• W4M CONTEXT
– History
– Galaxy
• W4M ECOSYSTEM
– Tools
– Services

Page 3

METABOLOMICS

Page 4

Omics…

[Figure labels: Genotype, Phenotype]

Page 5

Metabolomics workflow

[Figure: the metabolomics workflow. Step labels: Biological hypothesis; Analytical analyses (LC-MS, GC-MS, NMR); Pre-processing; Data matrix; Statistics; Metabolite annotation; Pathway interpretation. Discipline labels: Analytical chemistry; Databases; Statistics; Bioinformatics; Biology, Medicine, Biochemistry.]

Page 6

Untargeted metabolomics:
- Used to detect unexpected changes in metabolite concentrations; the aim is to detect as many metabolites as possible.
- Hundreds to thousands of metabolites can be measured.
- No absolute quantification.
- Needs multiple analytical devices.

Semi-targeted metabolomics:
- In between: this approach seeks a set of known metabolites within an untargeted analysis.
- Hundreds of metabolites.
- No absolute quantification.

Targeted metabolomics:
- Small number of metabolites.
- Biochemically annotated, with known biological function.
- Quantification of the metabolites is performed using chemical standards.

[Figure labels: Target; Lipidome; Exposome; Microbiome; Epigenome]

Page 7

Analytical techniques

Mass spectrometry coupled with liquid (LC-MS) or gas (GC-MS) chromatography
• High sensitivity
• Relative quantification
• Low repeatability, noisy
• Destructive
• Several ions for one molecule

NMR
• Low sensitivity
• Near-absolute quantification
• Robust, with good repeatability
• Non-destructive
• Several chemical shifts for one molecule

[Figure labels: resolution 1000; resolution >10000]

Page 8

Analytical techniques

Biological matrix
• Urine
• Plasma
• Liver
• Cells
• Fecal water
• Skin extract

Some numbers
• Sample preparation
• 40 min per injection
• 1 day for file conversion
• 1 day for pre-processing
• x days for statistics
• n days for annotation
• k days for interpretation

Wishart D. PLOS ONE 2017
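To make the 40 min per injection figure concrete, here is a minimal sketch of the arithmetic. The function name and the study size of 150 injections are assumptions chosen for illustration; only the 40-minute value comes from the slide.

```python
def acquisition_hours(n_injections: int, minutes_per_injection: float = 40.0) -> float:
    """Instrument time, in hours, for a sequence of injections."""
    return n_injections * minutes_per_injection / 60.0


# Hypothetical study size, used only for illustration.
n = 150
hours = acquisition_hours(n)
print(f"{n} injections: {hours:.1f} h, about {hours / 24:.1f} days of continuous acquisition")
```

Quality control and blank injections add to the total, so the estimate is a lower bound for the biological samples alone.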

Page 9

Drawbacks

• Black-box extraction software
• MS signal drift
• Semi-quantitative
• The annotation process is difficult to automate
• Requires in-house databases

Page 10

CONTEXT

Page 11

Brief history

• 2005: Galaxy project
• 2006: Few bioinformatics tools
– R packages xcms and CAMERA
– Incomplete MS information in databases (KEGG, METLIN, HMDB, …)
– No annotation tools
• 2010: Scattered French in-house tools
• 2013: MetaboHUB & IFB, French national infrastructures
• 2015: Giacomoni et al., doi:10.1093/bioinformatics/btu813
• 2017: Guitton et al., IJBC, doi:10.1016/j.biocel.2017.07.002

Page 12

W4M metabolomics workflow

[Figure: the metabolomics workflow as covered by W4M. Labels: Biological hypothesis; Analytical analyses (LC-MS, GC-MS, NMR); Metabolite annotation; Pre-processing; Data matrix; Statistics; Pathway interpretation.]

Page 13

Result: an online infrastructure for metabolomics

Based on the Galaxy framework
• Modular: ~40 modules
• Reproducible approach
• Sharing: data, workflows, etc.
• Sustainable:
– permanent staff
– several funding sources: >100 person-months (PM) of non-permanent staff
– strong support from our 2 national communities (IFB & MetaboHUB), with permanent staff

Page 14

Online analysis

[Screenshot: user interface and results]

Page 15

Galaxy workflow

Page 16

W4M Core Team and Help desk

• 15 bioinformaticians
• 6 metabolomics platforms
• 2 French infrastructures: IFB and MetaboHUB

Page 17

The W4M core team

Page 18

Galaxy as a gateway

Plays an important role in the community-building process

Synergy with the French Galaxy Working Group

Page 19

The W4M ecosystem

Training: Workflow4Experimenters
• 5 sessions since 2014 (5 days)
• 10 trainers for 20 to 25 trainees
• Half theory, half tutoring
• “Bring your own data”

Help desk
• [email protected]

Tools
• Building, running, saving and sharing functionalities

Online analysis
• https://galaxy.workflow4metabolomics.org

Page 20

Tools & Workflow

40 tools

Page 21

Pre-processing

• LC-MS: based on the xcms and CAMERA R packages
• GC-MS: based on the metaMS package
• FIA-MS: fully developed for W4M
• NMR: fully developed for W4M
• Data extraction: from acquisition data files to the dataMatrix

Page 22

Normalization, filtration, correction

• Normalizations
– Internal standard
– Sum of intensities
– PQN (probabilistic quotient normalization)
• Filtration
– based on the correlation between diluted quality control pooled samples
– based on the CV of ions among samples and among quality control pooled samples
• Correction of signal drift (MS), based on loess regression on quality control pooled samples

[Figure: signal drift, raw vs. corrected]
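Two of the normalizations and the CV-based filter listed above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the W4M implementation: the function names, the toy intensities, and the thresholds (`max_pool_cv=0.3`, `max_ratio=1.0`) are all hypothetical, and the PQN reference is taken here as the ion-wise median across samples. PQN is often applied after a sum-of-intensities normalization. Loess drift correction is not covered.

```python
from statistics import mean, median, stdev


def sum_normalize(sample):
    """Scale one sample so that its intensities sum to 1."""
    total = sum(sample)
    return [x / total for x in sample]


def pqn_normalize(samples):
    """Probabilistic quotient normalization (PQN).

    samples: list of samples, each a list of positive intensities
    (one value per ion). The reference is the ion-wise median sample.
    """
    n_ions = len(samples[0])
    reference = [median(s[j] for s in samples) for j in range(n_ions)]
    normalized = []
    for s in samples:
        quotients = [s[j] / reference[j] for j in range(n_ions)]
        dilution = median(quotients)  # most probable dilution factor
        normalized.append([x / dilution for x in s])
    return normalized


def cv(values):
    """Coefficient of variation: sample standard deviation / mean."""
    return stdev(values) / mean(values)


def keep_ions(samples, is_pool, max_pool_cv=0.3, max_ratio=1.0):
    """Indices of ions passing both CV criteria (illustrative thresholds)."""
    pools = [s for s, p in zip(samples, is_pool) if p]
    study = [s for s, p in zip(samples, is_pool) if not p]
    kept = []
    for j in range(len(samples[0])):
        pool_cv = cv([s[j] for s in pools])
        study_cv = cv([s[j] for s in study])
        # Keep ions that are stable in pools and vary less in pools than in samples.
        if pool_cv <= max_pool_cv and pool_cv <= max_ratio * study_cv:
            kept.append(j)
    return kept


# Toy data, invented for illustration: 3 pooled QC injections, then 3 study samples.
samples = [[100, 10], [102, 30], [98, 20], [50, 19], [150, 21], [100, 20]]
is_pool = [True, True, True, False, False, False]
print(keep_ions(samples, is_pool))
```

The rationale for the filter is that pooled quality control samples are replicates of the same mixture, so an ion that varies strongly among pools reflects analytical noise rather than biology.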

Page 23

Common statistical tools

Page 24

Annotation

• NMR
• MS
– Match against public databases via web services (HMDB, KEGG, LIPID MAPS, ChemSpider…, PeakForest)
– In-house database
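The core of MS annotation by database matching is a mass comparison within a ppm tolerance. The sketch below shows that comparison on a local, three-compound table. It does not call HMDB, KEGG, or any other web service, and the function names, the 5 ppm default, and the restriction to the [M+H]+ adduct are assumptions made for illustration.

```python
PROTON_MASS = 1.007276  # Da

# Illustrative reference table: name -> monoisotopic neutral mass (Da).
REFERENCE = {
    "glucose": 180.06339,
    "citric acid": 192.02700,
    "caffeine": 194.08038,
}


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def annotate(mz, tolerance_ppm=5.0, reference=REFERENCE):
    """Return (name, ppm error) for compounds whose [M+H]+ m/z matches mz."""
    hits = []
    for name, mass in reference.items():
        theoretical_mz = mass + PROTON_MASS  # singly protonated ion
        error = ppm_error(mz, theoretical_mz)
        if abs(error) <= tolerance_ppm:
            hits.append((name, round(error, 2)))
    return hits


# Hypothetical measured m/z value.
print(annotate(181.0707))
```

A real annotation step would also consider other adducts, isotopes, and retention time, since several ions can arise from one molecule and several molecules can share one mass.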

Page 25

Building comprehensive workflows

Page 26

Development tools

Developments
– Stick to IUC standards
– GitHub: github.com/workflow4metabolomics
– Conda / Planemo / Travis CI

Page 27

Digital object identifier (DOI)

– W4M provides DOIs to reference histories, including data (from raw to statistical/annotation results), tools and their parameters, and workflows
– Usable in papers
– How: http://workflow4metabolomics.org/referenced_W4M_histories

Page 28

Online analysis

Oct. 2018 – Oct. 2019
• 600,000 jobs/year
• 1,500 registered users

Page 29

Contribution: get, push

You can get our tools
- Download our W4M VM
- All tools are publicly available on GitHub

You can push your tools
- They can be integrated and hosted within the main W4M instance
- Tools must stick to IUC standards
- Support must be provided by the developers themselves

https://github.com/workflow4metabolomics/workflow4metabolomics#how-to-contribute

Page 30

PERSPECTIVES

Page 31

Perspectives

• Interoperability
• MS/MS: Julien Saint-Vanne, Yann Guitton, Gildas Le Corguillé
• 2D NMR annotation: Cécile Canlet, Franck Giacomoni and Marie Tremblay Franco
• Visualisation…

Page 32

@workflow4metabo

github.com/workflow4metabolomics

Thank you, and see you soon on W4M: https://galaxy.workflow4metabolomics.org

[email protected]