Annotating Microarray Data with the MGED Ontology. NCI Center for Bioinformatics, April 15, 2004. P. L. Whetzel, A. Pizarro, E. Manduchi, J. Liu, H. He, G. Grant, M. Mailman, C. Stoeckert, Center for Bioinformatics, University of Pennsylvania. Science 298:601-604, 2002.
Annotating Microarray Data with the MGED Ontology
NCI Center for Bioinformatics
April 15, 2004
P. L. Whetzel, A. Pizarro, E. Manduchi, J. Liu, H. He, G. Grant, M. Mailman, C. Stoeckert
Center for Bioinformatics
University of Pennsylvania
Science 298:601-604, 2002
Science 298:597-600, 2002
To compare experiments, you need some minimum information about the microarray experiments.
Ivanova et al. Science 2003
Microarray Information to be Shared
Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14
The Computational View of Microarray Information
MGED Society
International organization
  Comprised of biologists, computer scientists, and data analysts
Aims to facilitate the sharing and evaluation of microarray data
  Establish standards for microarray data annotation
  Create microarray databases
  Promote sharing of high quality, well-annotated data
Generalize to data generated by functional genomics and proteomics experiments
www.mged.org
MGED Standardization Efforts
MIAME
  The formulation of the minimum information about a microarray experiment required to interpret and verify the results. (Brazma et al. Nature Genetics 2001)
MAGE-OM
  The establishment of a data exchange format and object model for microarray experiments. (Spellman et al. Genome Biol. 2002)
MGED Ontology
  The development of an ontology for microarray experiment description and biological material (biomaterial) annotation in particular. (Stoeckert & Parkinson, Comp. Funct. Genom. 2003)
Transformations
  The development of recommendations regarding microarray data transformations and normalization methods.
MGED Ontology (MO)
Purpose
  Provide standard terms for the annotation of microarray experiments
  Not to model biology but to provide descriptors for experiment components
Benefits
  Unambiguous description of how the experiment was performed
  Structured queries can be generated
Ontology concepts derived from the MIAME guidelines/MAGE-OM
MGED Ontology development
http://mged.sourceforge.net/ontologies/MGEDontology.php
OILed
File formats
  DAML file
  HTML file
  NCI DTS Browser
Changes
Notes
Term Tracker
Relationship of MO to MAGE-OM
MO class hierarchy follows that of MAGE-OM
  Association to OntologyEntry
MO provides terms for these associations by:
  Instances internal to MO
  Instances from external ontologies
    Take advantage of existing ontologies
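
The two ways MO supplies terms for an OntologyEntry association (instances internal to MO, instances from external ontologies) can be sketched as a small data structure. This is only an illustration and not the MAGE-OM class definition: the field names (`category`, `value`, `source`, `accession`) and the example terms are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional

MO_NAME = "MGED Ontology"


@dataclass(frozen=True)
class OntologyEntry:
    """Hypothetical, simplified stand-in for a MAGE-OM OntologyEntry."""
    category: str                    # class the term is filed under
    value: str                       # the term itself
    source: str = MO_NAME            # ontology or database that defines the term
    accession: Optional[str] = None  # identifier in the external source, if any

    @property
    def is_external(self) -> bool:
        # A term is "external" when it is defined outside MO itself.
        return self.source != MO_NAME


# Term taken from an instance internal to MO (illustrative values).
internal = OntologyEntry(category="Sex", value="female")

# Term taken from an external ontology and referenced by accession.
external = OntologyEntry(
    category="Organism",
    value="Mus musculus",
    source="NCBI Taxonomy",
    accession="10090",
)
```

Keeping the source and accession next to the value is one way to reuse an existing ontology without copying its terms into MO.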
MGED Ontology Class Hierarchy
MGED CoreOntology
  Coordinated development with MAGE-OM
  Ease of locating appropriate class to select terms from
MGED ExtendedOntology
  Classes for additional terms as the usage of genomics technologies expands
MAGE and MO
Main focus of MGED Ontology
Structured and rich description of BioMaterials
BioMaterial
OntologyEntry
+characteristics
+associations
MO and References to External Ontologies
Use MGED Ontology for Structured Descriptions (MAGE-ML)
http://www.sofg.org
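
A structured description of a biomaterial can be serialized as XML in the spirit of MAGE-ML. The sketch below uses Python's standard `xml.etree.ElementTree`; the element and attribute names are simplified assumptions and should be checked against the MAGE-ML DTD before use, and the sample name and terms are made up for illustration.

```python
import xml.etree.ElementTree as ET


def ontology_entry(category: str, value: str) -> ET.Element:
    """Build one category/value annotation element (names are assumed)."""
    return ET.Element("OntologyEntry", {"category": category, "value": value})


def biosource(name: str, characteristics: list) -> ET.Element:
    """Wrap a list of (category, value) pairs in a BioSource-like element."""
    node = ET.Element("BioSource", {"name": name})
    holder = ET.SubElement(node, "Characteristics_assnlist")
    for category, value in characteristics:
        holder.append(ontology_entry(category, value))
    return node


# Hypothetical sample annotated with two ontology terms.
sample = biosource(
    "liver_sample_1",
    [("Organism", "Mus musculus"), ("Sex", "female")],
)
xml_text = ET.tostring(sample, encoding="unicode")
print(xml_text)
```

Because every characteristic is a category/value pair rather than free text, a consumer can parse the file and compare samples term by term.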
Desirable Microarray Queries
Return all experiments with species X examined at developmental stage Y
Sort by platform type
Which are untreated? Treated?
Treated with what compound?
How comparable are these?
What can these experiments tell me?
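
Once annotations use controlled terms, queries like these reduce to exact matches on a few columns. The following is a minimal sketch using an in-memory SQLite table; the table layout, column names, and rows are invented for illustration and are not the RAD schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE experiment (
           name TEXT, species TEXT, developmental_stage TEXT,
           platform_type TEXT, compound TEXT)"""
)
# Hypothetical annotated experiments; compound is NULL when untreated.
rows = [
    ("exp1", "Mus musculus", "adult", "spotted cDNA array", None),
    ("exp2", "Mus musculus", "adult", "oligonucleotide array", "compound A"),
    ("exp3", "Mus musculus", "embryo", "spotted cDNA array", None),
    ("exp4", "Homo sapiens", "adult", "oligonucleotide array", "compound A"),
]
conn.executemany("INSERT INTO experiment VALUES (?, ?, ?, ?, ?)", rows)


def find_experiments(species: str, stage: str) -> list:
    """Experiments for species X at developmental stage Y, sorted by platform."""
    cur = conn.execute(
        "SELECT name, platform_type, compound FROM experiment "
        "WHERE species = ? AND developmental_stage = ? "
        "ORDER BY platform_type, name",
        (species, stage),
    )
    return cur.fetchall()


hits = find_experiments("Mus musculus", "adult")
untreated = [name for name, _, compound in hits if compound is None]
treated = {name: compound for name, _, compound in hits if compound is not None}
```

The same filter would be unreliable against free-text annotations, where one species or stage can be spelled many ways.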
MO and Structured Queries
RAD: RNA Abundance Database http://www.cbil.upenn.edu/RAD
RAD is part of GUS (Genomics Unified Schema)
  The GUS platform maximizes the utility of stored data by warehousing them in a schema that integrates the genome, transcriptome, gene regulation and networks, ontologies and controlled vocabularies, gene expression
Relational schema (implemented in Oracle)
Stores data from gene expression arrays and SAGE
Comes with a suite of web-annotation forms (Study-Annotator)
MAGE-RAD Translator (MR_T) generates MAGE-ML files for exports
Manduchi et al. 2004 Bioinformatics 20:452-459.
GUS (Genomics Unified Schema) http://www.gusdb.org
Namespace   Domain                    Features
SRes        Shared Resources          Ontologies
RAD         Gene Expression           MIAME/MAGE-OM
TESS        Gene regulation           Grammars
Core        Data Provenance           Documentation
DoTS        Sequence and annotation   Central dogma
RAD Schema
About 65 tables and 30 views
  Assay to Quantification tables
  Study Design tables
  BioMaterials tables
  Platform tables
  Quantification Result tables
  Processing tables
  Analysis Result tables
  Misc tables: Protocol, Contact*, Ontologies*
  Meta tables*: data privacy and for history tracking
  Integrity Checks tables
* These are used by RAD, but belong to common GUS components
Tables populated by the Study-Annotator
RAD Study-Annotator
Covers all relevant parts of the MIAME checklist
Exploits the MGED Ontology
Allows entering of very specific details of an experiment
Web-based forms:
  Modular structure
  Written in PHP
  Front-end data integrity checks using JavaScript
Manages Data Privacy based on Project/Group selections present in GUS schema
Available at http://www.cbil.upenn.edu/RAD/RAD-installation.htm
RAD Study-Annotator: Logical Flow
[Flow diagram labels: Login; New User Registration; Data Preferences (Project, Group); Module I, Module II, Module III; Study; Study Design; BioMaterials (samples, treatments); From Assay to Quantification; Misc]
Experiment Annotation: Study Design
BioMaterial Annotation: Conceptual View
RAD Study Annotator: BioMaterial Module
RAD Study Annotator: BioSource Form
RAD Study Annotator: Treatment Form
Using the Ontologies
[Diagram labels: OntologyEntry; ExternalDatabases; RAD Study-Annotator; MGED Ontology; Anatomy; DevelopmentalStage; Disease; Lineage; PATOAttribute; Phenotype; Taxon; SRES; RAD; MGED Ontology]
Ontology instances propagated to annotation web forms
New terms can be proposed
Sources of New Terms in OntologyEntry
MGED Ontology
  Continued development of new classes and terms
Shared Resources (SRes)
  Contains controlled vocabularies and ontologies
External Database Sources
Annotated term provided by user
Adding New Terms
1 Add term from SRes
2 Add term from External Database
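
The term sources listed above can be thought of as an ordered lookup with a fallback to a user-proposed term. The sketch below is a guess at one reasonable policy, not the Study-Annotator's actual logic: the lookup order, the function name `resolve_term`, and the vocabulary contents are all assumptions.

```python
from typing import List, Set, Tuple

# Placeholder vocabularies; real contents would come from MO, SRes tables,
# and external databases respectively.
SOURCES: List[Tuple[str, Set[str]]] = [
    ("MGED Ontology", {"female", "male"}),
    ("SRes", {"liver", "kidney"}),
    ("External Database", {"Mus musculus"}),
]


def resolve_term(value: str, sources: List[Tuple[str, Set[str]]]) -> Tuple[str, str]:
    """Return (source name, term) for the first vocabulary containing the term.

    If no vocabulary contains it, the term is flagged as user-proposed so it
    can be reviewed before being added anywhere.
    """
    for name, terms in sources:
        if value in terms:
            return name, value
    return "user-proposed", value


print(resolve_term("liver", SOURCES))
print(resolve_term("spleen", SOURCES))
```

Recording where each term came from keeps user-proposed terms distinguishable from curated ones.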
Future Issues
Burning Issues
  Developing MO in synch with related efforts (MAGE-OM v.2.0)
  Use/presentation in annotation forms
  Coverage of other technologies and biological domains
Flame retardant structure
  ExtendedOntology
    Space to add new classes, terms and their relationship to one another
A Functional Genomics View
A. Jones et al. submitted
A Functional Genomics Object Model (FGE-OM)
Separate out common components from technology-specific ones
Allow new domains to be added as new modules to the model
Incorporate ideas from SysBio-OM (Xirasgur et al. Bioinformatics in press)
Jones et al. Bioinformatics in press
Proposed Development of FGE-OM
[Diagram labels: Proteomics Standards; Functional Genomics Standards; Microarray Standards; MIAME, MAGE-OM, MGED Ontology; MIAPE, Pedro; Pedro, MIAPE-OM, FGE-OM; MIAME, MIAME-Tox; MIAPE; FGE-OM, MGED Ontology; Informal specification; Formal specification; Immutable type system; Strong type system; Use Cases]
Acknowledgements
MGED Ontology Working Group
  Chris Stoeckert, Trish Whetzel (Penn)
  Helen Parkinson (EBI)
  Joe White (TIGR)
  Gilberto Fragoso, Liju Fan, Mervi Heiskanen (NCI)
  Many others!