Bioen Workshop on Metabolomics of Sugarcane
FAPESP, Sao Paulo, 07 December 2009
Lothar Willmitzer
Max-Planck-Institut für Molekulare Pflanzenphysiologie
14476 Potsdam-Golm, Germany
Biomass and Metabolomics at the
MPI for Molecular Plant Physiology
The Max Planck Society
Founded in 1948; a research organization intended to complement the universities
78 Institutes
276 Departments (led by Max Planck Directors)
12,200 employees (incl. 4,200 scientists)
+ 9,600 junior scientists
Budget: 1.3 billion euros, ≤2% of total R&D spending in Germany
Funding: federal government + 16 states (1:1)
“Knowledge must precede application” (Max Planck)
Our research activities are
- focused on basic research
- curiosity-based and hypothesis-driven
- open to but not focused on applications
17 Nobel prizes since 1948
2007 Chemistry: Gerhard Ertl
2005 Physics: Theodor Hänsch
1995 Chemistry: Paul Crutzen
1995 Medicine: Christiane Nüsslein-Volhard
1991 Medicine: Erwin Neher, Bert Sakmann
1988 Chemistry: Johann Deisenhofer, Robert Huber, Hartmut Michel
1986 Physics: Ernst Ruska
1985 Physics: Klaus von Klitzing
1984 Medicine: Georges Köhler
1973 Medicine: Konrad Lorenz
1967 Chemistry: Manfred Eigen
1964 Medicine: Feodor Lynen
1963 Chemistry: Karl Ziegler
1954 Physics: Walter Bothe
Commercial success = Technology transfer
„Max Planck Innovation GmbH“
since 1979: 2,025 patents filed
1,186 licence contracts
154 M€ patent licences
60 Start Up companies
The Potsdam-Golm campus hosts
the University of Potsdam, Fraunhofer research
institutes and Max Planck institutes
MPI für Molekulare Pflanzenphysiologie
Structure of the MPI-MP
• 3 departments hosting 12 research groups
mostly led by young scientists on limited
contracts
• 2 independent Max-Planck research
groups
• 3 infrastructure groups
• 2 university guest groups
• 2 Systems Biology guest groups
Structure of the MPI-MP
• Approx 350 employees
• 90+ postdocs and 90+ PhD students
• More than 50% come from abroad, from 25+ countries
• Working language is English
Biological Processes studied at the MPI-MP
Plant Energetics
(Photosynthesis, Respiration)
Abiotic stress
(temperature, nutrients)
Metabolism and Growth
Cell wall biosynthesis
Biomass
Genotypes, Environment, Analysis
Create genetic diversity,
grow it in defined environmental conditions
and subject it to broad molecular phenotyping
Genotypes Environment
Analysis
RNA analysis (arrays, RT-PCR)
Proteomics facility (LC-MS, 2D)
Enzyme HTP platform
Single cell analysis
Metabolite analysis
Subcellular analysis
Transgenics
Natural Variation
Arabidopsis, tomato, rice
Data Mining
(Bioinformatics)
Metabolomics
• The metabolome comprises all small molecules
present in a given biological system
• Metabolomics aims at the quantitative
determination of all small molecules
• The metabolome contains molecules that vary
hugely in three parameters: concentration,
structure and chemical behaviour
• In contrast to RNA and proteins, metabolites are
not linearly related to the DNA sequence
Metabolomics - Challenges
• Increase the fraction of molecules we see
• Annotate more of them
• Quantitate them (relative and absolute)
• Discriminate a true metabolite from a contaminant
• Make sure that what we measure reflects the situation in the cell (sampling)
• Subcellular distribution, labelling (fluxomics)
Metabolomics - Extraction
• Metabolomics aims at a complete
coverage of all small molecules
• Thus as little prefractionation as possible
is applied
• We start from a simple
methanol/chloroform/water extract which is
exposed to three different platforms
robotic derivatisation & full extract injection:
GC-MS : the workhorse for 100 – 150 metabolites
mostly from primary metabolism
[Figure: GC-MS total ion chromatogram (Scan EI+, TIC 1.20e8), sample 8343AO01, acquired on 09-Dec-1998 at 12:59:56; retention time axis 7.5 to 45 min]
„What Is This Peak?“
LTQ FT Ultra
• Resolution
– > 1 000 000
• Mass Range
– m/z 50-2000
• Dynamic Range
– 1 000
Mass Accuracy: Phenylalanine [M+H]+
0.26 ppm (+0.00004)
5.93 ppm (-0.00103)
102 ppm (-0.017)
[M+H]+ mass: 181.07066
Allowed chemical elements (maximum counts): C = 30, H = 50, N = 5, O = 10, P = 5, S = 5

Mass tolerance   Error (Da)   Predicted formulas
500 ppm          0.09054      268
100 ppm          0.01811      54
10 ppm           0.00181      6
1 ppm            0.00018      1 (C6H12O6)

Formula calculation depends on mass accuracy and resolution
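The relationship between mass tolerance and the number of candidate formulas can be sketched in a few lines. This is an illustration only, not the software used for the slide: it enumerates C, N, O, P and S counts within the stated limits, solves for the hydrogen count, and applies no chemical plausibility filter, so its candidate counts need not reproduce the numbers in the table. The function names and the dictionary layout are assumptions made for this example.

```python
# Monoisotopic masses in Da; PROTON is the mass of H+ (hydrogen minus one electron).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "P": 30.97376163, "S": 31.97207100}
PROTON = 1.00727647
ORDER = ("C", "H", "N", "O", "P", "S")
LIMITS = {"C": 30, "H": 50, "N": 5, "O": 10, "P": 5, "S": 5}  # limits from the slide


def ppm_to_da(mz, ppm):
    """Convert a relative mass tolerance (ppm) into an absolute window (Da)."""
    return mz * ppm * 1e-6


def candidate_formulas(mz, ppm, limits):
    """List neutral formulas M whose [M+H]+ mass lies within `ppm` of `mz`.

    Brute force over C, N, O, P, S; the hydrogen count is solved for directly.
    No valence or ratio rules are applied (simplifying assumption).
    """
    tol = ppm_to_da(mz, ppm)
    neutral = mz - PROTON
    hits = []
    for c in range(limits["C"] + 1):
        for n in range(limits["N"] + 1):
            for o in range(limits["O"] + 1):
                for p in range(limits["P"] + 1):
                    for s in range(limits["S"] + 1):
                        heavy = (c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                                 + p * MONO["P"] + s * MONO["S"])
                        rest = neutral - heavy
                        # Only the nearest integer hydrogen count can fit the window.
                        h = round(rest / MONO["H"])
                        if 0 <= h <= limits["H"] and abs(rest - h * MONO["H"]) <= tol:
                            counts = zip(ORDER, (c, h, n, o, p, s))
                            hits.append("".join(f"{el}{k}" for el, k in counts if k))
    return hits


if __name__ == "__main__":
    for ppm in (500, 100, 10, 1):
        hits = candidate_formulas(181.07066, ppm, LIMITS)
        print(f"{ppm:>4} ppm = {ppm_to_da(181.07066, ppm):.5f} Da, {len(hits)} candidates")
```

Because each tighter window is contained in the wider one, the candidate list can only shrink as the tolerance decreases, which is the point the slide makes.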
Arabidopsis 12CO2, Arabidopsis 13CO2
• UPLC FT-ICR MS
• Peak extraction
• DB search
12C chemical formula and retention time
13C chemical formula and retention time
Matched 12C/13C formulas with RT tolerance 0.05 min
unambiguous: annotated formula
ambiguous: isotope pattern and MS/MS analysis, then unambiguous annotated formula
Metabolite Extraction
MeOH:CHCl3:H2O
Whole metabolome 13CO2 labelling strategy
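A minimal sketch of the matching step, assuming each 12C peak already carries a carbon count from its formula search and that a fully labelled partner is shifted by that many 13C/12C mass differences. The 5 ppm mass window, the function names and the carbon count of the second peak are hypothetical; only the 0.05 min retention time tolerance comes from the slide. The m/z and RT values are the ones visible in the spectra of this deck.

```python
C13_SHIFT = 1.003355  # mass difference between 13C and 12C, in Da


def match_pairs(peaks_12c, peaks_13c, rt_tol=0.05, ppm=5.0):
    """Pair 12C peaks with co-eluting, fully labelled 13C partners.

    peaks_12c: list of (mz, rt_min, n_carbons), n_carbons from the 12C formula
    peaks_13c: list of (mz, rt_min) from the 13C run
    Returns, per 12C peak, the list of indices of matching 13C peaks.
    """
    result = []
    for mz, rt, n_c in peaks_12c:
        expected = mz + n_c * C13_SHIFT          # m/z of the fully labelled ion
        tol = expected * ppm * 1e-6              # ppm window (assumed value)
        result.append([j for j, (mz13, rt13) in enumerate(peaks_13c)
                       if abs(rt13 - rt) <= rt_tol and abs(mz13 - expected) <= tol])
    return result


def status(partners):
    """Classify a 12C peak by its number of 13C partners."""
    if not partners:
        return "unmatched"
    return "unambiguous" if len(partners) == 1 else "ambiguous"


# (m/z, RT, carbons); the carbon count 20 for the second peak is hypothetical.
peaks_12c = [(741.22238, 7.33, 33), (382.14236, 8.21, 20)]
peaks_13c = [(774.33350, 7.30), (382.14225, 8.27)]
labels = [status(p) for p in match_pairs(peaks_12c, peaks_13c)]
```

Peaks that end up "ambiguous" would go on to the isotope pattern and MS/MS checks listed in the workflow; peaks without any shifted partner are candidates for non-biological material.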
Parallel analysis of 12C- and 13C-labelled compounds allows
discrimination of biological from non-biological material
and annotation of the number of carbon atoms
[Figure: base peak chromatograms of the 12C and 13C Arabidopsis extracts (At_12C_pos1, at_13c_pos1), RT 0 to 13 min, FTMS positive mode, full MS 100 to 1300]
[Figure: mass spectra at RT 8.21 and 8.27 min, m/z 382 to 384: peaks 382.14236, 383.14574, 384.13859 in one run and 382.14225, 383.14599, 384.13865 in the other]
[Figure: mass spectra at RT 7.33 and 7.30 min, m/z 730 to 785: 12C peak at 741.22238, 13C peak at 774.33350, annotated C33]
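The carbon annotation follows from simple arithmetic: each carbon replaced by 13C adds about 1.003355 Da, so for a singly charged ion the mass shift divided by that value gives the carbon count. The sketch below applies this to the peak pairs shown in the spectra; the function names are assumptions made for this example.

```python
C13_SHIFT = 1.003355  # mass difference between 13C and 12C, in Da


def carbon_count(mz_12c, mz_13c):
    """Carbon atoms implied by the 12C to 13C mass shift of a singly charged ion."""
    return round((mz_13c - mz_12c) / C13_SHIFT)


def is_biological(mz_12c, mz_13c):
    """A peak that does not shift under 13CO2 labelling carries no plant-derived carbon."""
    return carbon_count(mz_12c, mz_13c) > 0


shifted = carbon_count(741.22238, 774.33350)    # pair at RT 7.3 min
unshifted = carbon_count(382.14236, 382.14225)  # pair at RT 8.2 min
```

For the pair at 741.22238 and 774.33350 the shift of 33.11112 Da corresponds to 33 carbons, in line with the C33 annotation, while the peak at m/z 382.14 appears at the same mass in both runs.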
Sample 1, Sample 2, Sample 3 and the 13CO2 sample
• Metabolite extraction
• Spike fixed amount (1:1)
• UPLC FT-ICR MS
• Peak extraction
• Spectra alignment
• Differential peak detection
• DB search
[Figure A: Sample 1 mixed with the 13C sample at 1:1, 1:2, 1:4 and 1:20]
[Figure B: measured 13C/12C ratio (axis 0 to 35) against the 12C/13C mixing ratio, three replicates each at 1:1, 1:2, 1:4 and 1:20]
Relative quantitation via spiking with a 13C labeled metabolome
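As a rough illustration of the idea, the 13C-labelled extract acts as a common internal standard: the 12C/13C intensity ratio of a metabolite is computed per sample, and ratios are then compared between samples. The intensities, sample keys and function name below are hypothetical and only show the arithmetic.

```python
def relative_levels(peaks, reference):
    """Relative level of one metabolite across samples, normalised to `reference`.

    peaks: {sample name: (12C intensity, 13C spike intensity)}
    The 13C spike is the same fixed amount in every sample, so dividing by it
    cancels run-to-run differences in signal response.
    """
    ratios = {name: i12 / i13 for name, (i12, i13) in peaks.items()}
    return {name: r / ratios[reference] for name, r in ratios.items()}


# Hypothetical peak intensities for a single metabolite.
peaks = {
    "Sample 1": (2.0e6, 1.0e6),
    "Sample 2": (3.0e6, 0.75e6),
    "Sample 3": (0.5e6, 1.0e6),
}
levels = relative_levels(peaks, reference="Sample 1")
```

With these made-up numbers Sample 2 comes out at twice and Sample 3 at a quarter of the Sample 1 level, even though the raw 12C intensities alone would suggest different factors.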
[Figure: IRMPD MS/MS spectra, ESI negative mode, m/z 100 to 1000, relative abundance (%). 12C: [M-H]- at 609.14234 (C27H28O16) with fragment 301.03397 (C15H9O7). 13C: [M-H]- at 636.23571 (13C27H28O16) with fragment 316.08536 (13C15H9O7). Further labels: m/z 303, m/z 318]
MS/MS for elucidation of structures
Arabidopsis 12CO2 grown, Arabidopsis 13CO2 grown
Metabolite extraction (MeOH:CHCl3:H2O)
Chloroform phase (lipids): UPLC separation, pos. and neg. mode
Aqueous phase: UPLC separation, pos. and neg. mode
UPLC fractionation
UPLC peak detection, peak list extraction and differential database searches
Compound annotation and result interpretation
GC-TOF, FT-ICR-MS
Strategic Overview of the Platform
Summary Metabolomics
• Starting from a MeOH-H2O/CHCl3 extract
three platforms allow :
• analysis of 150 primary metabolites (GC-TOF)
• analysis of 500 – 1000 secondary metabolites
(UPLC-FT-ICR-MS)
• analysis of about 140 complex lipids
Metabolomics : Applications
• Metabolomics and Genetics : Annotation of gene function
• Metabolomics and Diagnostics : Discrimination/Identification/Prediction of states
• Metabolomics and Systems Biology : Integration of different data sets
Metabolomics and Genetics :
Annotation of gene function using KO
populations and overexpressors
Metabolic Genomics: From Gene to
Function - Directly
Arabidopsis thaliana: the “domestic pet”
About 27,000 genes
Genetic blueprint known
Small
Grows rapidly
Large progeny
Functional Analysis, Plant Collections, Metabolic Analyses, Performance
Metabolic Genomics at metanomics: Process Overview
Wild Type
Transformants A: knockout
Transformants B: transfer of additional genes from “gene donors”
[Figure (Fig. 4): metabolites against plant lines, shown as x-fold ratio in comparison to the wild type control]
Metabolomics and Diagnostics :
Discrimination/Identification/Prediction
of states
Metabolic profiling applied to two ecotypes
and one mutant each
Col-2 wild type, dgd1 mutant; C24 wild type, sdd1 mutant
Arabidopsis Profiling: PCA Cluster Analysis
[Figure: PCA plot with axes Factor 1, Factor 2 and Factor 4; groups C24 sdd1, C24 WT, Col-2 dgd1 and Col-2 WT]
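To make the PCA step concrete, the following sketch computes only the first principal component of a small samples-by-metabolites table, using power iteration on the covariance matrix. It is a standard-library illustration with invented numbers, not the analysis behind the slide, which shows several factors and real profiles.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first principal component.

    rows: one list of metabolite levels per sample.
    Columns are mean-centred; the leading eigenvector of the covariance
    matrix is found by power iteration. Assumes the start vector is not
    orthogonal to that eigenvector (true for the example data).
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(m)]
           for a in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
    return v, scores


# Hypothetical levels of three metabolites in two samples of each of two genotypes.
profiles = [
    [1.0, 5.0, 2.0],   # genotype A, replicate 1
    [1.2, 5.1, 2.1],   # genotype A, replicate 2
    [4.0, 5.0, 0.5],   # genotype B, replicate 1
    [4.1, 4.9, 0.6],   # genotype B, replicate 2
]
loadings, scores = first_principal_component(profiles)
```

Replicates of the same genotype receive scores of the same sign and the two genotypes fall on opposite sides of zero, which is the kind of grouping a PCA plot makes visible.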
Medical applications of Metabolomics
• Differentiation of diabetics/nondiabetics
• Diagnostics of cancer and non-cancer tissues (kidney cancer)
• Prediction of health risk (myocardial infarction)
• Differentiation of responders/non-responders (antidepressants)
Separation of diabetic and nondiabetic
male and female patients
Metabolic profiling allows assignment of
normal and clear cell kidney carcinoma tissue
Prediction of myocardial infarction by metabolic signature.
[Figure: ROC curves, sensitivity against 100-specificity (both 0 to 100), for three models: standard model; age, sex and five metabolites; standard model and five metabolites]
Predictive performance was assessed with receiver operating characteristic (ROC) curves. A model including age, sex and the five metabolites
identified here (red ROC; AUC 0.844) was not inferior to a standard model including all established risk factors (age, sex, smoking, education, alcohol
consumption, physical activity, hormone replacement therapy, BMI, hypertension, diabetes, HDL/total cholesterol ratio and C-reactive protein) (blue
ROC; AUC 0.86; p=0.385). A model combining all established risk factors (standard model) with the five metabolites (AUC 0.89) outperformed the
standard model alone (p=0.033).
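For readers unfamiliar with the AUC values quoted above, the area under a ROC curve equals the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The sketch below computes it that way; the scores are invented and the function name is an assumption, so the result has no relation to the study figures.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher score; ties contribute one half.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model outputs (higher means higher predicted risk).
cases = [0.9, 0.8, 0.35]
controls = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect separation, which puts the reported values of 0.844 to 0.89 in context.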
Metabolomics and systems biology:
Network analysis and integration
with transcript data
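[Figure: experimental design. Growth curve (optical density at 600 nm against time) with stress application, an adaptation phase and repeated metabolite (M) and transcript (T) sampling. Conditions: heat, cold, oxidative stress, glucose-lactose shift, control]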
E. coli response to different stresses at the transcriptome and metabolome level
Network construction by correlation
analysis (metabolite A vs B)
[Figure: scatter plots of metabolite A against metabolite B; correlation networks for cold stress, heat stress, lactose shift, oxidative stress and control growth, and their overlap]
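One way to picture the network construction: compute the Pearson correlation for every metabolite pair within a condition, keep pairs above a threshold as edges, and intersect the edge sets of all conditions to obtain the stable part. The threshold of 0.9, the time series and the function names below are assumptions for illustration; the slide does not state the parameters that were used.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long series (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(profiles, threshold=0.9):
    """Edges between metabolites whose profiles correlate with |r| >= threshold.

    profiles: {metabolite name: [levels over the time points of one condition]}
    """
    return {(a, b) for a, b in combinations(sorted(profiles), 2)
            if abs(pearson(profiles[a], profiles[b])) >= threshold}


def stable_edges(networks):
    """Edges present in every condition-specific network (the overlap)."""
    return set.intersection(*networks)


# Hypothetical time courses of three metabolites under two conditions.
heat = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8.5], "C": [5, 1, 4, 2]}
cold = {"A": [1, 2, 3, 4], "B": [8, 6, 4, 2.5], "C": [1, 3, 2, 5]}
overlap = stable_edges([correlation_network(heat), correlation_network(cold)])
```

In this toy example only the A-B edge survives in both conditions. Using the absolute value treats strong positive and strong negative correlations alike, which is a design choice of the sketch.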
Superimposition of stable networks
and metabolic pathways
Differential network properties identify
candidate molecules
Canonical correlation analysis allows identification
of transcripts correlating with metabolite changes
Coordinated expression of
transcripts and metabolites
Conclusion
• Metabolomics is rapidly developing into a
mature technology that already covers
several thousand small molecules
• Combined with genetic diversity it allows
the high-throughput annotation of gene
function
• Probably due to its proximity to the ultimate
phenotype it has a surprisingly high
predictive power
• It is an integral part of systems biology
Thank you for your attention!
Patrick Giavalisco
Bettina Seiwert
Aenne Eckardt
Szymon Josefcuk
Jedrzey Szymanski
Alvaro Inostroza
Sebastian Klie
Zoran Nikoloski, Goforsys
Joachim Selbig, UP
Thanks to FAPESP for support!