Kate Dreher, The Carnegie Institution for Science, Stanford, CA (CIMMYT, Mexico)
Post on 24-Feb-2016
Introduction to the Plant Metabolic Network: 18 Databases and
Omics-Level Tools for Analysis and Discovery
Kate Dreher
The Carnegie Institution for Science, Stanford, CA
(CIMMYT, Mexico)
Free access to high quality, curated data promotes beneficial research on plant metabolism
Plants provide crucial benefits to humanity and the ecosystem
A better understanding of plant metabolism may contribute to:
• More nutritious foods
• More pest-resistant plants
• More stress-tolerant crops
• Higher photosynthetic capacity and higher yield in agricultural and biofuel crops
• New pharmaceutical sources
• . . . many more applications
These efforts benefit from access to high quality plant metabolism data
Plant Metabolic Network goals
Transform published results into data-rich metabolic pathways
Create and deploy improved methods for predicting enzyme function and metabolic capacity using plant genome sequences
Facilitate data analysis
Support research, breeding, and education
Provide public resources:
• PlantCyc
• AraCyc
• 16 additional species-specific databases
www.plantcyc.org
Plant Metabolic Network collaborators
SRI International – BioCyc project
Provide Pathway Tools Software
Maintain and update MetaCyc
Other collaborators / contributors include:
• MaizeGDB
• GoFORSYS
• TAIR
• SoyBase
• Sol Genomics Network (SGN) / Boyce Thompson Institute
• Gramene
• MedicCyc / Noble Foundation
• PlantMetabolomics group
• . . . and more
Editorial Board
Community Submissions
17 PMN species are phylogenetically and "functionally" diverse
Eudicots
Cereals
Basal land plants
Green alga
• Major crops
• Model species
• Legume
• Woody and herbaceous species
• Annuals and perennials
• C3 and C4 species
AraCyc 11.5 Arabidopsis thaliana
CassavaCyc 3.0 Manihot esculenta
ChineseCabbageCyc 1.0 Brassica rapa (+ spp.)
GrapeCyc 3.0 Vitis vinifera
PapayaCyc 2.0 Carica papaya
PoplarCyc 6.0 Populus trichocarpa (+ spp.)
SoyCyc 4.0 Glycine max
BarleyCyc 1.0 Hordeum vulgare
BrachypodiumCyc 1.0 Brachypodium distachyon
CornCyc 4.0 Zea mays
OryzaCyc 1.0 Oryza sativa (+ spp.)
SetariaCyc 1.0 Setaria italica
SorghumBicolorCyc 1.0 Sorghum bicolor
SwitchgrassCyc 1.0 Panicum virgatum
MossCyc 2.0 Physcomitrella patens
SelaginellaCyc 2.0 Selaginella moellendorffii
ChlamyCyc 3.5 Chlamydomonas reinhardtii
Database                 Species                         Pathways*
PlantCyc 8.0             400+                            1050
AraCyc 11.5              Arabidopsis thaliana            597
CassavaCyc 3.0           Manihot esculenta               491
ChineseCabbageCyc 1.0    Brassica rapa (+ spp.)          499
GrapeCyc 3.0             Vitis vinifera                  479
PapayaCyc 2.0            Carica papaya                   481
PoplarCyc 6.0            Populus trichocarpa (+ spp.)    505
SoyCyc 4.0               Glycine max                     520
BarleyCyc 1.0            Hordeum vulgare                 465
BrachypodiumCyc 1.0      Brachypodium distachyon         473
CornCyc 4.0              Zea mays                        508
OryzaCyc 1.0             Oryza sativa (+ spp.)           482
SetariaCyc 1.0           Setaria italica                 477
SorghumBicolorCyc 1.0    Sorghum bicolor                 480
SwitchgrassCyc 1.0       Panicum virgatum                479
MossCyc 2.0              Physcomitrella patens           416
SelaginellaCyc 2.0       Selaginella moellendorffii      421
ChlamyCyc 3.5            Chlamydomonas reinhardtii       349
PlantCyc provides access to important pathways not found in species-specific databases
PlantCyc contains over 1000 pathways and information from over 400 plant species
PlantCyc: 1050 pathways
Species-specific databases: 477 pathways (average)
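The average for the species-specific databases can be checked from the pathway counts listed above. The following is a minimal sketch in Python; the counts are copied from the slide, while the variable names are illustrative only. The mean works out to roughly 477.8, which the slide appears to report truncated as 477.

```python
from statistics import mean

# Pathway counts for the 17 species-specific databases, copied from the slide.
pathway_counts = {
    "AraCyc 11.5": 597, "CassavaCyc 3.0": 491, "ChineseCabbageCyc 1.0": 499,
    "GrapeCyc 3.0": 479, "PapayaCyc 2.0": 481, "PoplarCyc 6.0": 505,
    "SoyCyc 4.0": 520, "BarleyCyc 1.0": 465, "BrachypodiumCyc 1.0": 473,
    "CornCyc 4.0": 508, "OryzaCyc 1.0": 482, "SetariaCyc 1.0": 477,
    "SorghumBicolorCyc 1.0": 480, "SwitchgrassCyc 1.0": 479,
    "MossCyc 2.0": 416, "SelaginellaCyc 2.0": 421, "ChlamyCyc 3.5": 349,
}

plantcyc_total = 1050  # PlantCyc 8.0 pathway count from the slide

average = mean(list(pathway_counts.values()))
print(f"{len(pathway_counts)} databases, mean = {average:.1f} pathways")
print(f"PlantCyc holds about {plantcyc_total / average:.1f}x the average")
```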
PlantCyc provides access to numerous specialized metabolic pathways
Many PlantCyc-specific metabolic pathways produce or break down specialized ("secondary") metabolites
Many of the enzymes in these pathways have experimental evidence
• Caffeine biosynthesis I (Coffea arabica)
• Morphine biosynthesis (Papaver somniferum)
• Taxol biosynthesis (Taxus brevifolia): cancer drug
• Raspberry ketone biosynthesis (Rubus idaeus)
• Vicianin bioactivation (Vicia sativa): defense compound
• Alliin degradation (Allium sativum): garlic odor
Species-specific databases require predictions
[Diagram: an annotated genome (e.g. Populus trichocarpa) is processed with E2P2 and the PathoLogic software, using the reference pathway database (MetaCyc), to produce a Pathway/Genome Database (PoplarCyc) containing pathways, reactions, compounds, gene products, and genes]
The PMN predicts enzyme functions from sequenced proteomes
To improve enzyme functional predictions:
A high-confidence Reference Protein Sequence Database (RPSD) was built
RPSD 2.0 contains 34,269 enzymes and 82,216 non-enzymes
The Ensemble Enzyme Prediction Pipeline (E2P2) uses the RPSD to predict functions based on protein sequence
E2P2 can predict reactions that are not fully defined in the EC system by incorporating MetaCyc reaction IDs
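The idea of an ensemble predictor can be illustrated with a small sketch. The following is not E2P2 itself: the classifier names, weights, threshold, and labels are hypothetical, and simple weighted voting stands in for whatever integration scheme the pipeline actually uses. It only shows how function labels can be either EC numbers or MetaCyc reaction IDs, so that a reaction without a complete EC number can still be assigned.

```python
from collections import defaultdict

def ensemble_predict(predictions, weights, threshold=0.5):
    """Combine per-classifier label sets for one protein by weighted voting.

    predictions: {classifier_name: set of function labels}
    weights:     {classifier_name: relative trust in that classifier}
    Returns the labels whose share of the total weight is >= threshold.
    """
    total = sum(weights[name] for name in predictions)
    if total == 0:
        return set()
    votes = defaultdict(float)
    for name, labels in predictions.items():
        for label in labels:
            votes[label] += weights[name]
    return {label for label, w in votes.items() if w / total >= threshold}

# Hypothetical output of two classifiers for a single protein.
# Labels are illustrative; "RXN-0000" is a made-up reaction ID.
predictions = {
    "classifier_a": {"EC-2.1.1.160", "RXN-0000"},
    "classifier_b": {"EC-2.1.1.160"},
}
weights = {"classifier_a": 0.6, "classifier_b": 0.4}

called = ensemble_predict(predictions, weights)
strict = ensemble_predict(predictions, weights, threshold=0.7)
print(sorted(called))
print(sorted(strict))
```

Raising the threshold keeps only the label that both hypothetical classifiers agree on, which is the usual trade-off between coverage and confidence in an ensemble.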
Pathway predictions are refined through a SAVI pipeline
[Diagram: an annotated genome (e.g. Populus trichocarpa) is processed with E2P2 and the PathoLogic software, using the reference pathway database (MetaCyc), to produce a Pathway/Genome Database (PoplarCyc) containing pathways, reactions, compounds, gene products, and genes; the SAVI pipeline then yields the validated, publicly released Pathway/Genome Database (PoplarCyc)]
SAVI pipeline
Decision rules and criteria for each pathway are generated by curators
The Semi-Automated Validation and Incorporation (SAVI) pipeline uses the rules to:
• Identify pathways with curated experimental evidence*
• Bring in “Ubiquitous Plant Pathways” that were not predicted
  - Calvin cycle, glycolysis, etc.
• Remove predicted non-plant / non-PMN pathways
  - Glycogen biosynthesis (non-plant)
  - Proteolysis (not small molecule metabolism)
• Check key reactions and expected phylogenetic range to automatically assess many other predicted pathways
• Highlight pathways that require manual validation
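The rule categories above can be sketched as a small decision function. This is an illustration only, not the SAVI implementation: the rule sets, field names, and the ordering of the checks are assumptions chosen to mirror the list, and the real curator-written criteria are specific to each pathway.

```python
from dataclasses import dataclass

# Hypothetical curator-maintained rule sets (illustrative entries only).
CURATED_EVIDENCE = {"caffeine biosynthesis I"}
UBIQUITOUS = {"Calvin cycle", "glycolysis"}
EXCLUDED = {"glycogen biosynthesis", "proteolysis"}

@dataclass
class Pathway:
    name: str
    predicted: bool            # was the pathway predicted for this species?
    key_reactions_found: bool  # were enzymes for its key reactions predicted?
    in_expected_taxa: bool     # is the species in the expected phylogenetic range?

def savi_decision(p: Pathway) -> str:
    """Return 'accept', 'reject', or 'manual' for one pathway."""
    if p.name in EXCLUDED:
        return "reject"        # non-plant / non-PMN pathway
    if p.name in CURATED_EVIDENCE or p.name in UBIQUITOUS:
        return "accept"        # kept or brought in even if not predicted
    if not p.predicted:
        return "reject"
    if p.key_reactions_found and p.in_expected_taxa:
        return "accept"
    if not p.key_reactions_found and not p.in_expected_taxa:
        return "reject"
    return "manual"            # conflicting signals: flag for a curator

candidates = [
    Pathway("glycolysis", False, False, True),
    Pathway("glycogen biosynthesis", True, True, False),
    Pathway("example pathway X", True, True, False),
]
decisions = {p.name: savi_decision(p) for p in candidates}
for name, decision in decisions.items():
    print(f"{name} -> {decision}")
```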
Expert input welcome!!
To submit data, report an error, volunteer to validate, or ask a question:
Send an e-mail: curator@plantcyc.org
Use our feedback form:
Meet with me at the end of the workshop
Schedule an individual meeting with me at PAG
Community gratitude
We thank you publicly!
Together, we can make valuable, high quality databases
The PMN databases provide data and tools for analysis
Information, curated summaries, high quality predictions, and experimentally supported information about:
• Pathways
• Enzymes
• Reactions
• Compounds
Tools
• General and specific searches
• Comparative analysis tools
• BLAST against enzymes in the PMN or the RPSD
OMICs-level data analysis
Data analysis with the Metabolic Map / Omics Viewer
Display experimental data on a metabolic map
Data types:
• Genes: transcriptomics
• Enzymes: proteomics
• Reactions: fluxomics
• Compounds: metabolomics
Data inputs:
• Single or multiple values for each object
• Absolute or relative values
[Example input file: gene IDs with relative expression levels at different timepoints]
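An input file of this kind can be produced with a few lines of Python. This is a sketch under assumptions: it writes a tab-delimited text file with one identifier per row followed by one numeric column per timepoint. The gene IDs are placeholders, the values and file name are invented, and the exact format the viewer expects should be checked against the Omics Viewer documentation.

```python
import csv

# Invented example data: gene ID -> relative expression at three timepoints.
expression = {
    "AT1G01010": [1.0, 2.4, 0.6],
    "AT1G01020": [1.0, 0.8, 0.3],
    "AT1G01030": [1.0, 1.1, 3.2],
}

def write_omics_table(path, data):
    """Write one row per gene: identifier, then one value per data column."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for gene_id, values in data.items():
            writer.writerow([gene_id, *values])

write_omics_table("omics_input.txt", expression)
```

Each numeric column can then be selected as a separate data column when the file is loaded, so the three timepoints here would correspond to three data columns.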
Visualizing quantitative data
[Screenshot: Omics Viewer settings, showing the input file, data columns, type of data, relative or absolute scale, color gradient, and data display options]
Easily identify altered pathways
Navigate to items of interest on the metabolic map
Working with Groups
Opportunities
Create custom data sets
Explore experimental results
Perform enrichment analyses
Share data
Requires free registration
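As an illustration of what an enrichment analysis computes, the sketch below applies a one-sided hypergeometric test to a gene group and a single pathway. The choice of test and all of the numbers are assumptions made for illustration; the statistic used by the PMN website is not stated in these slides.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, pathway_genes, group_size, overlap):
    """P(X >= overlap) when group_size genes are drawn without replacement
    from total_genes, of which pathway_genes belong to the pathway."""
    upper = min(pathway_genes, group_size)
    denom = comb(total_genes, group_size)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, group_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denom

# Invented numbers: 1000 annotated genes, 20 in the pathway,
# a group of 50 genes of which 5 fall in the pathway (1 expected by chance).
p = hypergeom_enrichment_p(total_genes=1000, pathway_genes=20,
                           group_size=50, overlap=5)
print(f"p = {p:.3g}")
```

A small p-value suggests the pathway is over-represented in the group; in practice the test is repeated for every pathway, so a multiple-testing correction would normally follow.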
Generate custom datasets
Create groups from searches
modify content
paint on map
compare to other groups
export to Excel
share with others
Plant metabolic NETWORKING
Please use our data
Please use our tools
Please come explore our new species databases coming in 2014
Please help us to improve our databases!
Please contact us if we can be of any help!
curator@plantcyc.org
www.plantcyc.org
PMN Acknowledgements
Curator:
- Kate Dreher
Post-docs:
- Lee Chae
- Ricardo Nilo Poyanco
- Chuan Wang
Interns:
- Ashley Joseph
Tech Team Members:
- Bob Muller
- Garret Huntress
Rhee Lab Members:
- Flavia Bossi
- Hye-in Nam
- Taehyong Kim
- Meng Xu
- Jim Guo
- Jue Fan
- Caryn Johansen
Peifen Zhang (Director and curator)
Sue Rhee (PI)
Collaborators:
SRI:
- Peter Karp
- Ron Caspi
- Hartmut Foerster
- Suzanne Paley
- SRI Tech Team
MaizeGDB:
- Mary Schaeffer
- Lisa Harper
- Jack Gardiner
- Taner Sen
ChlamyCyc:
- Patrick May
- Dirk Walther
Others:
- Lukas Mueller (SGN)
- Rex Nelson (SoyBase)
- Gramene and MedicCyc
PMN Alumni:
- A. S. Karthikeyan (curator)
- Christophe Tissier (curator)
- Hartmut Foerster (curator)
- Eva Huala (co-PI)
- Tam Tran (intern)
- Varun Dwaraka (intern)
- Damian Priamurskiy (intern)
- Ricardo Leitao (intern)
- Michael Ahn (intern)
- Purva Karia (intern)
- Anuradha Pujar (SGN curator)
Tech Team Alumni:
- Anjo Chi
- Cynthia Lee
- Tom Meyer
- Larry Ploetz
- Shanker Singh
- Bill Nelson
- Vanessa Kirkup
- Chris Wilks
- Raymond Chetty
We're here to help . . .