
Page 1: The MGED Ontology Workshop MGED 7 September 8, 2004 Chris Stoeckert Center for Bioinformatics & Dept. of Genetics University of Pennsylvania

The MGED Ontology Workshop

MGED 7

September 8, 2004
Chris Stoeckert

Center for Bioinformatics & Dept. of Genetics

University of Pennsylvania

Page 2

MGED Ontology Workshop Agenda

What is the MGED Ontology (MO)?
Building MO: the process
Using MO
Future development of MO
Joe White (TIGR): MO applications from MAGE Jamboree

Page 3

MGED Standardization Efforts

MIAME
  The formulation of the minimum information about a microarray experiment required to interpret and verify the results. (Brazma et al. Nature Genetics 2001)
MAGE-OM
  The establishment of a data exchange format and object model for microarray experiments. (Spellman et al. Genome Biol. 2002)
MGED Ontology
  The development of an ontology for microarray experiment description and biological material (biomaterial) annotation in particular. (Stoeckert & Parkinson, Comp. Funct. Genom. 2003)
Transformations
  The development of recommendations regarding microarray data transformations and normalization methods.
RSBI
  Reporting Structure for Biological Investigations (toxicogenomics, environmental genomics, metabol/nomics)

Page 4

MGED Ontology (MO)

Purpose
  Provide standard terms for the annotation of microarray experiments
  Not to model biology but to provide descriptors for experiment components
Benefits
  Unambiguous description of how the experiment was performed
  Structured queries can be generated
Ontology concepts derived from the MIAME guidelines/MAGE-OM
  Also incorporating concepts from Transformations and RSBI

Page 5

Relationship of MO to MAGE-OM

MO class hierarchy follows that of MAGE-OM
  Association to OntologyEntry
MO provides terms for these associations by:
  Instances internal to MO
  Instances from external ontologies
    Take advantage of existing ontologies
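
As a rough illustration of the two sources of terms listed above, the Python sketch below models an annotation as a category/value pair that also records where the term comes from. The class layout, the field names `source` and `accession`, and the example terms are assumptions made for illustration; this is not the MAGE-OM definition of OntologyEntry.

```python
from dataclasses import dataclass
from typing import Optional

MO_SOURCE = "MGED Ontology"  # label used here for terms defined inside MO


@dataclass(frozen=True)
class OntologyEntry:
    """Simplified, hypothetical stand-in for an ontology-backed annotation."""
    category: str                    # the class the term is selected from
    value: str                       # the selected term
    source: str = MO_SOURCE          # ontology that defines the term
    accession: Optional[str] = None  # identifier in an external resource, if any

    def is_external(self) -> bool:
        # A term is external when it is defined outside MO itself.
        return self.source != MO_SOURCE


# Term supplied as an instance internal to MO (illustrative values).
internal = OntologyEntry(category="Sex", value="female")

# Term taken from an external ontology instead of being redefined in MO.
external = OntologyEntry(
    category="Organism",
    value="Mus musculus",
    source="NCBI Taxonomy",
    accession="10090",
)
```

Keeping the source next to the value lets one annotation slot accept both kinds of term, which is one way to take advantage of existing ontologies without copying their content.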

Page 6

MGED Ontology Class Hierarchy

MGEDCoreOntology
  Coordinated development with MAGE-OM
  Ease of locating appropriate class to select terms from
MGEDExtendedOntology
  Classes for additional terms as the usage of genomics technologies expands

Page 7

MAGE and MO

Page 8

MAGE and MO

Page 9

Main focus of MGED Ontology

Structured and rich description of BioMaterials

BioMaterial

OntologyEntry

+characteristics

+associations

Page 10

Page 11

Page 12

MO and References to External Ontologies

Page 13

MO and references to External Ontologies

Page 14

http://www.sofg.org

Page 15

Standards and Ontologies for Functional Genomics 2

October 23-26, 2004
held at the University of Pennsylvania Medical School
www.jax.org/courses/events

Funded in part by NHGRI, NCRR, NERC, GSK

Co-Hosted by
The Jackson Laboratory
University of Pennsylvania
European Bioinformatics Institute

Student Scholarships Available

Photo by R. Kennedy, B Trist, R. Tarver, for GPTMC

Page 16

http://mged.sourceforge.net/ontologies/index.php

Page 17

Use MGED Ontology for Structured Descriptions (MAGE-ML)

Page 18

MGED Ontology development

http://mged.sourceforge.net/ontologies/MGEDontology.php

OILed
File formats
DAML file
HTML file
NCI DTS Browser
Changes
Notes
Term Tracker

Page 19

MGED Ontology Working Group

Virtual Ontology Workshops
  Chris Stoeckert, Trish Whetzel (Penn)
  Helen Parkinson, Susanna Sansone (EBI)
  Joe White (TIGR)
  Gilberto Fragoso, Liju Fan, Mervi Heiskanen (NCI)
  Helen Causton, Laurence Game (ICL)
  Chris Taylor (PSI, EBI)
Mged-ontologies mailing list

Page 20

Desirable Microarray Queries

Return all experiments with species X examined at developmental stage Y
  Sort by platform type
  Which are untreated? Treated?
    Treated with what compound?
  How comparable are these?
What can these experiments tell me?
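
The queries listed above become straightforward once every experiment is annotated with controlled terms rather than free text. The Python sketch below shows the idea on a small in-memory list. The `Experiment` record, its field names, and all of the sample data are hypothetical; a real repository would run the equivalent query against its own schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Experiment:
    """Hypothetical experiment record annotated with controlled terms."""
    name: str
    organism: str
    developmental_stage: str
    platform_type: str
    compound: Optional[str] = None  # None is used here to mean "untreated"


def find_experiments(experiments: List[Experiment],
                     organism: str, stage: str) -> List[Experiment]:
    """Return experiments for species X at stage Y, sorted by platform type."""
    hits = [e for e in experiments
            if e.organism == organism and e.developmental_stage == stage]
    return sorted(hits, key=lambda e: e.platform_type)


def split_by_treatment(
        experiments: List[Experiment]) -> Tuple[List[Experiment], List[Experiment]]:
    """Separate untreated experiments from treated ones."""
    untreated = [e for e in experiments if e.compound is None]
    treated = [e for e in experiments if e.compound is not None]
    return untreated, treated


# Made-up annotations, for illustration only.
EXPERIMENTS = [
    Experiment("study_a", "Mus musculus", "embryo", "spotted_array"),
    Experiment("study_b", "Mus musculus", "embryo", "in_situ_oligo_array",
               compound="compound_1"),
    Experiment("study_c", "Mus musculus", "adult", "spotted_array"),
    Experiment("study_d", "Homo sapiens", "embryo", "spotted_array"),
]

hits = find_experiments(EXPERIMENTS, "Mus musculus", "embryo")
untreated, treated = split_by_treatment(hits)
```

Exact string matching only works because each field holds a term from a shared vocabulary; with free-text annotation, "mouse" and "Mus musculus" would not match.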

Page 21

MO and Structured Queries

Page 22

RAD: RNA Abundance Database http://www.cbil.upenn.edu/RAD

RAD is part of GUS (Genomics Unified Schema)
  The GUS platform maximizes the utility of stored data by warehousing them in a schema that integrates the genome, transcriptome, gene regulation and networks, ontologies and controlled vocabularies, gene expression
Relational schema (implemented in Oracle)
Stores data from gene expression arrays and SAGE
Comes with a suite of web-annotation forms (Study-Annotator)
MAGE-RAD Translator (MR_T) generates MAGE-ML files for exports
Manduchi et al. 2004 Bioinformatics 20:452-459.

Page 23

RAD Study-Annotator

Covers all relevant parts of the MIAME checklist
Exploits the MGED Ontology
Allows entering of very specific details of an experiment
Web-based forms:
  Modular structure
  Written in PHP
  Front-end data integrity checks using JavaScript

Manages Data Privacy based on Project/Group selections present in GUS schema

Available at http://www.cbil.upenn.edu/RAD/RAD-installation.htm

Page 24

BioMaterial Annotation: Conceptual View

Page 25

RAD Study Annotator: BioMaterial Module

Page 26

RAD Study Annotator: BioSource Form

Page 27

Other Sites Using MO

See posters for more details on these!

Page 28

Future Development of MO

Areas of Development
  Ongoing maintenance
  Ontology language
  Non-array technologies
  Biological domain extensions
  MO v2 development

Page 29

Proposed methods for MO development

Ongoing maintenance
  Addition of new instance terms to existing classes
  Fixing typographical errors
  Adding missing associations
  These represent minor changes that should largely not affect software applications that are based on the MO

Page 30

Proposed methods for MO development

Ontology language
  Planned changes in the primary language format (from DAML to OWL).
  Planned changes in the primary ontology editing tool (from OILed to Protégé).
  These should represent fairly minor differences as far as applications based on the MO are concerned.
    Some minor name changes will be needed to adjust for differences in allowed characters.
    New functionalities such as the availability of synonyms may be used to enrich the MO further.

Page 31

Proposed methods for MO development

Non-array technologies
  Standards efforts for proteomics (PSI) and metabol/nomics (SMRS) would like to add terms for their specific needs.
  Classes that are needed for new technologies can be placed under the MGEDExtendedOntology and linked to MGEDCoreOntology classes through properties (i.e., MGEDExtendedOntologyClass has_property MGEDCoreOntologyClass).
    Such development would not impact the MGEDCoreOntology and would therefore allow addition of non-array technology classes
  Instances that are needed for new technologies may be most appropriate for existing classes in the MGEDCoreOntology
    The policy for adding and defining instances regarding technology-related terms is to provide a generic name and definition but to supply technology-specific examples (in the definition).
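
The proposal above, where new classes live under the MGEDExtendedOntology and point at MGEDCoreOntology classes through properties, can be sketched in Python as follows. The `Ontology` class, its methods, and the class and property names `HypotheticalProteomicsSample`, `has_biomaterial`, and `BioMaterial` are placeholders invented for this sketch; they are not taken from the MO itself.

```python
CORE = "MGEDCoreOntology"
EXTENDED = "MGEDExtendedOntology"


class Ontology:
    """Toy two-branch ontology: class names plus outgoing properties."""

    def __init__(self):
        self.branch_of = {}    # class name -> CORE or EXTENDED
        self.properties = {}   # class name -> {property name: target class}

    def add_class(self, name, branch):
        if branch not in (CORE, EXTENDED):
            raise ValueError(f"unknown branch: {branch}")
        self.branch_of[name] = branch
        self.properties[name] = {}

    def link(self, extended_class, prop, core_class):
        """Record 'extended_class prop core_class' on the extended class only."""
        if self.branch_of.get(extended_class) != EXTENDED:
            raise ValueError(f"{extended_class} is not an extended class")
        if self.branch_of.get(core_class) != CORE:
            raise ValueError(f"{core_class} is not a core class")
        self.properties[extended_class][prop] = core_class

    def core_snapshot(self):
        """Copy of every core class and its properties, for comparison."""
        return {name: dict(props) for name, props in self.properties.items()
                if self.branch_of[name] == CORE}


mo = Ontology()
mo.add_class("BioMaterial", CORE)  # placeholder core class
before = mo.core_snapshot()

# Add a class for a non-array technology and link it to the core.
mo.add_class("HypotheticalProteomicsSample", EXTENDED)
mo.link("HypotheticalProteomicsSample", "has_biomaterial", "BioMaterial")
after = mo.core_snapshot()
```

Because the link is stored on the extended class, the core snapshot is identical before and after the addition, which mirrors the claim that such development would not impact the MGEDCoreOntology.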

Page 32

A Functional Genomics View

Courtesy of Andy Jones

Page 33

Proposed methods for MO development

Biological domain extensions
  Areas (e.g., toxicogenomics) where the current specification of Experiment and Biomaterial is not sufficient to fully capture descriptions of experiments
    Extensions should fit within the MAGE-OM v1.1 and so ultimately could go into the MGEDCoreOntology.
    However, as the new classes, subclasses, properties, and instances are under development (and therefore not stable), they should be placed in the MGEDExtendedOntology until mature enough to be migrated over to the MGEDCoreOntology.
  The MGED Reporting Structure for Biological Investigations (RSBI) Working Group representing biological domain extensions in toxicogenomics, environmental genomics, and nutrigenomics will take this approach.
    Hear more about this from Jennifer Fostel next!

Page 34

Proposed methods for MO development

MO v2 development
  Reflect the reorganization planned for the MAGE-OM and its new major version (v2).
    MAGE v2 will have major structural changes from MAGE v1.1 and is likely to require major changes in the MO.
    With a MO v2 developed in parallel this should not conflict with the stated plans of the MO to be consistent with MAGE as it will be tied to the new version.

Page 35

A Functional Genomics Object Model (FGE-OM)

Separate out common components from technology-specific ones
Allow new domains to be added as new modules to the model
Incorporate ideas from SysBio-OM (Xirasagar et al. Bioinformatics in press)

Jones et al. Bioinformatics 2004

Page 36

Proposed Development of MGED Ontology

MO 1.x

MO 2.x

Sept. 2004 Jan. 2005 March 2005 Sept. 2005

Move to OWL/Protégé

Proteomics in ExtendedOntology

RSBI in ExtendedOntology
RSBI in CoreOntology