Genomica Funzionale: Genomica II, lezione V... · Integration of metabolomics with other...


Genomica Funzionale (Functional Genomics)

10-14 February 2014


Topics covered

- Metabolic engineering;

- Concept of metabolomics;

- Metabolomic platforms (LC-MS, GC-MS, NMR, ICP-MS, etc.);

- Setup of a metabolomic protocol and database;

- Applications in the plant and food science fields;

- Bioinformatics applied to metabolomic data.


Metabolic engineering of plant volatiles (aromas)

2007


Metabolic engineering of plant volatiles (defense)


Metabolomics in Association Mapping Studies

Glucosinolate  pathway  

Jansen et al., 12


Mass Spectrometry Imaging

Technique used in mass spectrometry to visualize the spatial distribution of, e.g., compounds, biomarkers, metabolites, peptides or proteins by their molecular masses.

- SIMS
- MALDI
- DESI


Integration of metabolomics with other ‘omics’ fields

•  Integrating genomics and metabolomics for engineering plant metabolic pathways - Kirsi-Marja Oksman-Caldentey and Kazuki Saito (2005);

•  Proteomic and metabolomic analysis of cardioprotection: interplay between protein kinase C epsilon and delta in regulating glucose metabolism of murine hearts;

•  Plant studies (2005) integrating transcriptomics, proteomics and metabolomics in an effort to enhance the production efficiency of grapes under stressful conditions.


How to better investigate metabolically engineered plant products?

2007  

In  vivo  studies  

Phenomic  data  

Fluxomics  

SYSTEMS  BIOLOGY  


Transcriptomic, metabolomic and phenomic profiling

1. Transcriptional profiling: potato oligo array, 42150 probes

2. Metabolic profiling: GC-ToF-MS and LC-MS

3. Phenotyping: Instron, penetrometer, etc.

Data analysis (transcriptome + metabolome + phenome): mapping software, heatmap clustering, correlation/network biology


Mapping of transcript/metabolite data

MapMan representation of 5,000 gene and metabolite data points in 2 transgenic lines


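Metabolome alterations in “Golden” potatoes

[Figure: metabolic map around the Krebs cycle with (+) and (-) markers; labeled metabolite classes: fatty acids, carotenoids, tocopherols, amino acids (AA), aromatic amino acids, sugars, organic acids, phytosterols]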


Principal Component Analysis (PCA)

•  Unsupervised
•  Multivariate analysis based on projection methods
•  Main tool used in chemometrics
•  Extracts and displays the systematic variation in the data
•  Each principal component (PC) is a linear combination of the original data parameters
•  Each successive PC explains the maximum amount of variance possible that is not accounted for by the previous PCs
•  PCs are orthogonal to each other
•  Conversion of the original data leads to two matrices, known as scores and loadings
•  The scores (T) represent a low-dimensional plane that closely approximates X. They are linear combinations of the original variables. Each point represents a single sample spectrum.
•  A loading plot/scatter plot (P) shows the influence (weight) of the individual X-variables in the model. Each point represents a different spectral intensity.
•  The part of X that is not explained by the model forms the residuals (E)
•  $X = TP^{T} + E = t_1 p_1^{T} + t_2 p_2^{T} + \dots + E$
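
The decomposition above can be illustrated with a minimal sketch in plain Python. It assumes X is mean-centered first and extracts only the first component by power iteration (a swapped-in, simple technique, not necessarily what chemometrics software uses). The data values and the function names `center` and `first_component` are hypothetical.

```python
import math

def center(X):
    """Subtract the column mean from every variable (column) of X."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in X]

def first_component(Xc, iters=500):
    """Scores t1 and unit-length loading p1 of the first PC (power iteration)."""
    n, m = len(Xc), len(Xc[0])
    p = [1.0 / math.sqrt(m)] * m  # arbitrary start; must not be orthogonal to p1
    for _ in range(iters):
        t = [sum(row[j] * p[j] for j in range(m)) for row in Xc]        # t = Xc p
        p = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(m)]  # p = Xc^T t
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
    t = [sum(row[j] * p[j] for j in range(m)) for row in Xc]
    return t, p

# Hypothetical data: 4 samples (rows) x 3 variables (columns).
X = [[2.0, 4.1, 1.0], [3.0, 6.0, 1.2], [4.0, 7.9, 0.9], [5.0, 10.1, 1.1]]
Xc = center(X)
t1, p1 = first_component(Xc)

# Residuals E = Xc - t1 p1^T, so that Xc = t1 p1^T + E.
E = [[Xc[i][j] - t1[i] * p1[j] for j in range(3)] for i in range(4)]

# Share of the total (centered) variation captured by the first PC.
explained = sum(v * v for v in t1) / sum(v * v for row in Xc for v in row)
print("loading p1:", [round(v, 3) for v in p1])
print("fraction of variation explained by PC1:", round(explained, 3))
```

Each entry of `t1` is the score of one sample, and each entry of `p1` is the weight of one variable. Further components could be obtained by repeating the procedure on `E`.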


Metabolomic Microarray

Principal Component Analysis

Urbanczyk-Wochniak et al., 03


Soft Independent Modeling of Class Analogy (SIMCA)

•  Supervised learning method based on PCA

•  A separate PCA model is constructed for each known class of observations

•  The PCA models are used to assign class membership to observations of unknown class origin

•  Boundaries are defined by the 95% class interval

•  Recommended for the one-class case, or for classification if no interpretation is needed

CLASS SPECIFIC STUDIES

•  One-class problem: only the disease observations define a class; control samples are too heterogeneous, for example, due to other variations caused by diseases, gender, age, diet, lifestyle, etc.

•  Two-class problem: disease and control observations define two separate classes
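
The SIMCA idea of one PCA model per class can be sketched as follows. This is a simplified illustration, not the full method. It assumes a one-component model per class and uses `factor * s0` (a multiple of the typical training residual) as a hypothetical stand-in for the 95% class boundary. The data and all names (`fit_class_model`, `residual_distance`, `classify`) are invented for the example.

```python
import math

def residual_distance(x, model):
    """Distance from observation x to the one-component PCA line of a class."""
    d = [xi - mi for xi, mi in zip(x, model["mean"])]
    t = sum(di * pi for di, pi in zip(d, model["p"]))  # score on the class PC
    return math.sqrt(sum((di - t * pi) ** 2 for di, pi in zip(d, model["p"])))

def fit_class_model(X, iters=200):
    """Separate PCA model (mean, first loading, residual spread) for one class."""
    n, m = len(X), len(X[0])
    mean = [sum(r[j] for r in X) / n for j in range(m)]
    Xc = [[r[j] - mean[j] for j in range(m)] for r in X]
    p = [1.0] + [0.0] * (m - 1)  # start vector; assumes PC1 has a nonzero first entry
    for _ in range(iters):       # power iteration for the first loading
        t = [sum(r[j] * p[j] for j in range(m)) for r in Xc]
        p = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
    model = {"mean": mean, "p": p}
    dists = [residual_distance(r, model) for r in X]
    model["s0"] = math.sqrt(sum(d * d for d in dists) / n)  # typical class residual
    return model

def classify(x, models, factor=3.0):
    """Classes whose boundary contains x; the result may be empty.
    factor * s0 is a hypothetical stand-in for the 95% class boundary."""
    return [name for name, mdl in models.items()
            if residual_distance(x, mdl) <= factor * mdl["s0"]]

# Hypothetical two-variable training data for two known classes.
models = {
    "A": fit_class_model([[1, 2.1], [2, 3.9], [3, 6.0], [4, 8.1], [5, 9.9]]),
    "B": fit_class_model([[1, 9.6], [2, 9.0], [3, 8.4], [4, 8.1], [5, 7.5]]),
}
print(classify([3.5, 7.0], models))  # observation of unknown class origin
print(classify([3.0, 3.0], models))  # observation far from both class models
```

Because every class has its own model, an observation can be accepted by one class, by several, or by none, which is what makes the approach usable for the one-class problem.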


Partial Least Squares Discriminant Analysis (PLS-DA)

•  Supervised learning method.
•  Recommended for two-class cases instead of using SIMCA.
•  Same principles as PCA, but in PLS a second piece of information is used, namely the labeled set of class identities.
•  Two data tables are considered, namely X (input data from samples) and Y (containing qualitative values, such as class membership or treatment of samples).
•  The quantitative relationship between the two tables is sought.
•  $X = TP^{T} + E$
•  $Y = TC^{T} + F$ (E and F are the residual matrices of X and Y)
•  The PLS algorithm maximizes the covariance between the X variables and the Y variables.
•  PLS models are negatively affected by systematic variation in the X matrix that is not related to the Y matrix (not part of the joint correlation structure between X and Y).
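
A minimal sketch of a single-component PLS model for a two-class problem is given below. It assumes one response column with classes coded 0/1 and mean-centered data. The weight vector is taken proportional to X^T y, which is the unit direction whose scores have maximal covariance with y. The data and the names `pls1_fit` and `pls1_predict` are hypothetical.

```python
import math

def center_columns(X):
    """Return the column-centered matrix and the column means."""
    n, m = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in X], means

def pls1_fit(X, y):
    """Single-component PLS for one response column y (classes coded 0/1)."""
    Xc, x_means = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    n, m = len(Xc), len(Xc[0])
    # w is proportional to Xc^T yc: scores t = Xc w have maximal covariance with yc.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(r[j] * w[j] for j in range(m)) for r in Xc]               # X scores (T)
    tt = sum(v * v for v in t)
    c = sum(ti * yi for ti, yi in zip(t, yc)) / tt                       # Y loading (C)
    p = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loading (P)
    return {"w": w, "t": t, "p": p, "c": c, "x_means": x_means, "y_mean": y_mean}

def pls1_predict(model, x):
    """Predicted response for a new observation x."""
    t_new = sum((xi - mi) * wi
                for xi, mi, wi in zip(x, model["x_means"], model["w"]))
    return model["y_mean"] + t_new * model["c"]

# Hypothetical data: variable 1 separates the classes, variable 2 is unrelated to them.
X = [[1.0, 5.0, 0.2], [1.2, 1.0, 0.1], [0.9, 3.0, 0.3],
     [3.0, 4.8, 0.9], [3.2, 1.2, 1.1], [2.9, 3.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]

model = pls1_fit(X, y)
predicted = [1 if pls1_predict(model, row) > 0.5 else 0 for row in X]
print("weights w:", [round(v, 3) for v in model["w"]])
print("predicted classes:", predicted)
```

The 0.5 cut-off used to turn the predicted response into a class label is a convention chosen for this sketch.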


OPLS

•  The OPLS method is a recent modification of the PLS method that helps overcome its pitfalls.
•  The main idea is to separate the systematic variation in X into two parts, one linearly related to Y and one unrelated (orthogonal) to Y.
•  Comprises two modeled variations, the Y-predictive ($T_p P_p^{T}$) and the Y-orthogonal ($T_o P_o^{T}$) components.
•  Only the Y-predictive variation is used for modeling Y.
•  $X = T_p P_p^{T} + T_o P_o^{T} + E$
•  $Y = T_p C_p^{T} + F$
•  E and F are the residual matrices of X and Y.
•  OPLS-DA compared to PLS-DA
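
The separation into Y-predictive and Y-orthogonal variation can be sketched for a single response and a single orthogonal component. The steps follow the commonly described OPLS filter (orthogonal weight taken as the part of the predictive loading that is orthogonal to the predictive weight). This is a simplified illustration under those assumptions, and the data and helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(X, v):
    """X v"""
    return [dot(row, v) for row in X]

def tmatvec(X, v):
    """X^T v"""
    return [sum(X[i][j] * v[i] for i in range(len(X))) for j in range(len(X[0]))]

def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def center_columns(X):
    n, m = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in X]

def opls_filter(Xc, yc):
    """Remove one Y-orthogonal component from centered Xc (single response yc)."""
    w = unit(tmatvec(Xc, yc))                      # Y-predictive weight
    t = matvec(Xc, w)                              # Y-predictive scores
    p = [v / dot(t, t) for v in tmatvec(Xc, t)]    # Y-predictive loading
    wp = dot(w, p)
    w_o = unit([pj - wp * wj for pj, wj in zip(p, w)])  # part of p orthogonal to w
    t_o = matvec(Xc, w_o)                          # Y-orthogonal scores (T_o)
    p_o = [v / dot(t_o, t_o) for v in tmatvec(Xc, t_o)]  # Y-orthogonal loading (P_o)
    X_filtered = [[Xc[i][j] - t_o[i] * p_o[j] for j in range(len(p_o))]
                  for i in range(len(Xc))]
    return X_filtered, t_o, p_o

# Hypothetical data: variable 2 carries large variation unrelated to the classes.
X = [[1.0, 5.0, 0.2], [1.2, 1.0, 0.1], [0.9, 3.0, 0.3],
     [3.0, 4.8, 0.9], [3.2, 1.2, 1.1], [2.9, 3.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]
Xc = center_columns(X)
yc = [v - sum(y) / len(y) for v in y]

X_filtered, t_o, p_o = opls_filter(Xc, yc)
print("y . t_o =", round(dot(yc, t_o), 12))  # orthogonal scores carry no Y information
```

After filtering, a PLS model fitted on `X_filtered` uses only the Y-predictive variation, while `t_o` and `p_o` describe the variation that was set aside.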


Hierarchical clustering

Method of cluster analysis which seeks to build a hierarchy of clusters.

“Local” clustering, “global” clustering

Diretto et al., 10
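
How a hierarchy of clusters is built can be shown with a small agglomerative sketch. It assumes Euclidean distance and average linkage, which are choices made for this example only. The profiles and the function name `hierarchical_clustering` are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_clustering(points):
    """Agglomerative clustering with average linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of point indices."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average distance over all pairs of members of the two clusters.
                d = sum(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical metabolite profiles (two measurements each).
profiles = [[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [5.1, 4.9]]
merges = hierarchical_clustering(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

The merge history read from first to last is the hierarchy: the earliest merges join the most similar profiles, and the last merge joins everything into one cluster.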


Pairwise correlation analysis

Correlation coefficients: measures of dependence

How (and how much) do the data correlate?

Heat map, clustering, network


Pairwise correlation analysis

Metabolite/transcript pairs shown: Ascorbate/CONSTANS, Lysine/WRKY6, Sucrose/Sucrose Transporters, 4-Aminobutyric Acid/Glutamate Decarboxylase

Expected correlations, unintended correlations

Urbanczyk-Wochniak et al., 03


Correlation matrix

The matrix of Pearson product-moment correlation coefficients between each pair of the random variables in the random vector X.

Carrari et al., 06
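
A correlation matrix of this kind can be computed with a few lines of plain Python. The sketch below implements the Pearson product-moment coefficient directly. The variable names and profile values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(profiles):
    """Nested dict R[a][b] holding the correlation of every pair of variables."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}

# Hypothetical profiles measured over the same four samples.
profiles = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "metabolite_B": [2.1, 3.9, 6.2, 8.0],
    "metabolite_C": [4.0, 3.1, 1.9, 1.0],
}
R = correlation_matrix(profiles)
for a in profiles:
    print(a, [round(R[a][b], 3) for b in profiles])
```

The matrix is symmetric with ones on the diagonal, so only the lower triangle needs to be stored or displayed.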


Interaction Network

VirtualPlant

Libourel and Shachar-Hill, 08


Correlation Networks (“local” biology)

CrtI PSY1 PSY2 PDS ZDS CrtISO LCY-b LCY-e CHY1 CHY2 CYP97A CYP97C ZEP NXS Lutein Zea Anthera Viola Neo

CrtI 1
PSY1 -0.984 1
PSY2 -0.98 0.932 1
PDS 0.994 -0.997 -0.955 1
ZDS 0.194 -0.361 0 0.293 1
CrtISO 0.941 -0.868 -0.988 0.901 -0.148 1
LCY-b -0.9 0.962 0.799 -0.94 -0.6 -0.701 1
LCY-e -0.339 0.498 0.151 -0.434 -0.988 -0.002 0.714 1
CHY1 0.688 -0.552 -0.816 0.61 -0.577 0.892 -0.305 0.447 1
CHY2 -0.28 0.109 0.461 -0.18 0.887 -0.587 -0.164 -0.807 -0.888 1
CYP97A 0.973 -0.998 -0.911 0.992 0.411 0.839 -0.975 -0.544 0.505 -0.054 1
CYP97C 0.982 -0.935 -0.999 0.958 0.008 0.987 -0.804 -0.159 0.811 -0.453 0.914 1
ZEP -0.252 0.08 0.435 -0.151 0.9 -0.564 -0.193 -0.824 -0.875 0.999 -0.025 -0.427 1
NXS 0.187 -0.014 -0.374 0.085 -0.927 0.508 0.257 0.859 0.841 -0.995 -0.04 0.366 -0.997 1
Lutein 0.683 -0.799 -0.528 0.754 0.848 0.396 -0.932 -0.918 -0.058 0.509 0.831 0.536 0.534 -0.588 1
Zea 0.188 -0.356 0.005 0.288 0.999 -0.154 -0.596 -0.987 -0.582 0.889 0.406 0.003 0.902 -0.929 0.845 1
Anthera 0.899 -0.81 -0.967 0.85 -0.253 0.994 -0.621 0.104 0.935 -0.67 0.777 0.965 -0.649 0.597 0.296 -0.258 1
Viola 0.999 -0.983 -0.981 0.994 0.189 0.942 -0.898 -0.335 0.692 -0.284 0.972 0.983 -0.256 0.192 0.679 0.183 0.901 1
Neo 0.983 -0.999 -0.93 0.997 0.366 0.865 -0.964 -0.503 0.547 -0.103 0.998 0.933 -0.074 0.008 0.803 0.361 0.807 0.982 1

Correlation matrix

ns = node strength = Σ|ρ| of a node / n
NS = network strength = Σ(ns) / n
n = number of nodes
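
As a worked illustration of these definitions, the sketch below computes ns and NS for the first five variables of the correlation matrix on this slide (CrtI, PSY1, PSY2, PDS, ZDS). It assumes that the sum for a node runs over its correlations with all other nodes, excludes the self-correlation on the diagonal, and divides by n as written above. The slide does not spell out these details, so they are assumptions, as are the function names.

```python
# Lower triangle of the correlation matrix for the first five variables.
names = ["CrtI", "PSY1", "PSY2", "PDS", "ZDS"]
lower = [
    [1],
    [-0.984, 1],
    [-0.98, 0.932, 1],
    [0.994, -0.997, -0.955, 1],
    [0.194, -0.361, 0, 0.293, 1],
]

def rho(i, j):
    """Correlation between variables i and j, read from the lower triangle."""
    return lower[i][j] if i >= j else lower[j][i]

def node_strengths(names):
    """ns = sum of |rho| to all other nodes, divided by n (assumed convention)."""
    n = len(names)
    return {names[i]: sum(abs(rho(i, j)) for j in range(n) if j != i) / n
            for i in range(n)}

ns = node_strengths(names)
NS = sum(ns.values()) / len(ns)  # network strength = sum(ns) / n

for name, value in ns.items():
    print(name, round(value, 4))
print("NS =", round(NS, 4))
```

A node that correlates strongly with many others, whether positively or negatively, gets a high ns, which is why absolute values are used.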

Network correlation file

pP-I: n = 19, NS = 0.62
pP-BI: n = 20, NS = 0.79
pP-YBI: n = 24, NS = 0.79

Transgenes + carotenoid genes + carotenoids

Legend: negative correlation, positive correlation; gene, carotenoid, transgene; positive hub, negative hub

Only correlations |ρ| > 0.6 are shown


Correlation Network for fishing candidates…

•  Node size according to ns
•  Only correlations |ρ| > 0.65 are shown
•  Edge width according to |ρ|

Legend: gene, metabolite; negative correlation, positive correlation

ns = node strength = AVG|ρ|
n = number of nodes
NS = network strength = AVG ns

Lycopene, β-carotene, total carotenoids
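
The drawing rules above (threshold on |ρ|, edge width from |ρ|, node size from ns) can be turned into a network description with a short sketch. It reuses five variables from the carotenoid correlation matrix shown earlier in the lecture. The average in ns is taken over the other nodes, which is an assumption, and the function name `build_network` and the dictionary layout are hypothetical.

```python
names = ["CrtI", "PSY1", "PSY2", "PDS", "ZDS"]
lower = [
    [1],
    [-0.984, 1],
    [-0.98, 0.932, 1],
    [0.994, -0.997, -0.955, 1],
    [0.194, -0.361, 0, 0.293, 1],
]

def build_network(names, lower, threshold=0.65):
    """Return (node_sizes, edges) for a correlation network.

    node size = average |rho| to the other nodes (assumed meaning of AVG|rho|)
    edges     = pairs with |rho| > threshold, with sign and width = |rho|"""
    n = len(names)
    full = [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]
    node_sizes = {names[i]: sum(abs(full[i][j]) for j in range(n) if j != i) / (n - 1)
                  for i in range(n)}
    edges = []
    for i in range(n):
        for j in range(i):
            r = full[i][j]
            if abs(r) > threshold:
                edges.append({"source": names[j], "target": names[i],
                              "sign": "positive" if r > 0 else "negative",
                              "width": abs(r)})
    return node_sizes, edges

node_sizes, edges = build_network(names, lower)
for e in edges:
    print(e["source"], "-", e["target"], e["sign"], e["width"])
```

Nodes left without any edge after thresholding, such as ZDS here, drop out of the picture, so the choice of threshold controls how dense the displayed network is.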


Correlation Networks (“global” biology)

Correlation network of carotenoids + 100 volatiles (I)

Correlation matrix, correlation network file

ns = node strength = AVG|ρ|
NS = network strength = AVG(ns)
n = number of nodes

•  Node size according to ns
•  Node shape according to the metabolic class
•  Only correlations |ρ| > 0.85 are shown

Legend: negative correlation, positive correlation; up-regulation, down-regulation

Metabolic classes: carotenoids, carotenoid-vol., terpenoid-vol., lipid-vol., amino acid-vol.

Network strength: NS = 0.78


Significant modules in a correlation network…

Metabolic classes: carotenoids, carotenoid-vol., terpenoid-vol., lipid-vol., amino acid-vol.

Rank 1 cluster, rank 2 cluster, rank 3 cluster


Correlation network analysis of the main regulatory “hubs” in “Golden” fruits

•  Node size according to ns
•  Only correlations |ρ| > 0.90 are shown
•  Edge width according to |ρ|

ns = node strength = AVG|ρ|
NS = network strength = AVG ns
n = number of nodes

Legend: gene, metabolite, phenotype, enzyme; negative correlation, positive correlation; up-regulation, down-regulation

Labeled nodes: ethylene, ABA, lycopene, β-carotene

n = 176, NS = 0.89


Network Reconstruction of Cell Metabolism

Leucine and fatty acid metabolism in A. thaliana


Conclusions: Data Integration

- Systems biology data integration makes it possible to increase knowledge of all the modifications affecting global metabolism
- Bioinformatic/statistical tools can draw attention to the “major players” involved in a biological process
- The era of identifying master nodes has started, but overcoming metabolic bottlenecks is still far off…

Rational design of future crops is still a far away, but possible, DREAM…


THANK  YOU  and  GOOD  LUCK!!!!  

Contact:  [email protected]