The MGED Ontology: Providing Descriptors for Microarray Data Trish Whetzel Department of Genetics Center for Bioinformatics University of Pennsylvania

Post on 28-Mar-2015

The MGED Ontology: Providing Descriptors for Microarray Data

Trish Whetzel
Department of Genetics
Center for Bioinformatics
University of Pennsylvania

Acknowledgements

• CBIL
  – Chris Stoeckert
  – Angel Pizarro
  – Elisabetta Manduchi
• EBI
  – Helen Parkinson
  – Susanna Sansone
• TIGR
  – Joe White
• Stanford
  – Cathy Ball
• NCICB
  – Gilberto Fragoso
  – Liju Fan
  – Mervi Heiskanen
• Others
  – Paul Spellman
  – John Matese
  – Helen Causton
• Ontology Mailing List

MGED Society

• International organization
• Comprised of biologists, computer scientists, and data analysts
• Aims to facilitate the sharing of functional genomics data generated by microarray and proteomics experiments
  – Establish standards for microarray data annotation
  – Create microarray databases
  – Promote sharing of high quality, well-annotated data

www.mged.org

MGED Standardization Efforts

• MIAME
  – The formulation of the minimum information required about a microarray experiment in order to interpret and verify the results.
• MAGE
  – The establishment of a data exchange format (MAGE-ML) and an object model (MAGE-OM) for microarray experiments.
• Ontology Working Group
  – The development of an ontology to describe microarray experiments and in particular the biological material (biomaterial) used in these experiments.
• Transformations
  – The development of recommendations regarding microarray data transformations and normalization methods.

Microarray Information to be Shared

Figure from: David J. Duggan et al. (1999) Expression profiling using cDNA microarrays. Nature Genetics 21: 10-14

MGED Ontology (MO)

• Purpose
  – Provide standard terms for the annotation of microarray experiments
• Benefits
  – Unambiguous description of how the experiment was performed
  – Structured queries can be generated
• MGED Ontology concepts derived from the MIAME guidelines/MAGE-OM

MGED Ontology development

http://mged.sourceforge.net/ontologies/MGEDontology.php

• OilEd
• File formats
  – HTML file
  – DAML file
  – NCI DTS Browser

MGED Ontology Class Hierarchy

• MGED CoreOntology
  – In synch with MAGE v.1
  – Stable class structure
• MGED ExtendedOntology
  – Classes for additional terms as the usage of MO expands for genomics technologies

Relationship of MO to MAGE-OM

• MO class hierarchy follows that of MAGE-OM
  – Association to OntologyEntry
• MO provides terms for these associations by:
  – Instances internal to MO
  – Instances from external ontologies
• Take advantage of existing ontologies
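
The two ways of supplying a term can be illustrated with a minimal Python sketch. This is not the MAGE-OM definition of OntologyEntry: the class below is a simplified stand-in, the `source` and `accession` fields are hypothetical names chosen for illustration, and the example terms are illustrative values rather than quotations from MO.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyEntry:
    """Simplified stand-in for an ontology annotation (not the real MAGE-OM class)."""
    category: str                    # the class the term belongs to
    value: str                       # the term itself
    source: str = "MO"               # hypothetical field: ontology that supplies the term
    accession: Optional[str] = None  # hypothetical field: identifier in an external source

    def is_external(self) -> bool:
        # A term is external when it is supplied by an ontology other than MO.
        return self.source != "MO"


# Case 1: the term is an instance held inside MO itself (illustrative value).
sex = OntologyEntry(category="Sex", value="female")

# Case 2: the term is taken from an existing external resource (illustrative value).
organism = OntologyEntry(
    category="Organism",
    value="Mus musculus",
    source="NCBI Taxonomy",
    accession="10090",
)

for entry in (sex, organism):
    origin = "external" if entry.is_external() else "internal to MO"
    print(f"{entry.category} = {entry.value} ({origin})")
```

Recording the supplying ontology next to each term is one way to reuse an existing vocabulary instead of copying its terms into MO.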

Relationship of MO and MAGE-OM

MO and References to External Ontologies

MO and References to External Ontologies

Desirable Microarray Queries

• Return all experiments with species X examined at developmental stage Y
  – Sort by platform type
  – Which are untreated? Treated?
    • Treated with what compound?
    • How comparable are these results?
• These questions can be asked of all experiments annotated using the MGED Ontology.
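
As a rough sketch of why consistent annotation makes such queries possible, the Python example below filters a handful of made-up experiment records. The record layout, field names, and values are all assumptions for illustration; they are not taken from MO, MAGE-OM, or any real database.

```python
# Hypothetical, hand-written annotation records; every value is invented.
experiments = [
    {"id": "E1", "organism": "Mus musculus", "developmental_stage": "adult",
     "platform": "spotted cDNA", "compound": None},
    {"id": "E2", "organism": "Mus musculus", "developmental_stage": "adult",
     "platform": "oligonucleotide", "compound": "dexamethasone"},
    {"id": "E3", "organism": "Mus musculus", "developmental_stage": "embryo",
     "platform": "spotted cDNA", "compound": None},
    {"id": "E4", "organism": "Homo sapiens", "developmental_stage": "adult",
     "platform": "oligonucleotide", "compound": None},
]


def find_experiments(records, organism, stage):
    """Return records for species X at developmental stage Y, sorted by platform."""
    hits = [r for r in records
            if r["organism"] == organism and r["developmental_stage"] == stage]
    return sorted(hits, key=lambda r: r["platform"])


def split_by_treatment(records):
    """Group experiment ids into treated and untreated sets."""
    groups = {"treated": [], "untreated": []}
    for r in records:
        key = "treated" if r["compound"] else "untreated"
        groups[key].append(r["id"])
    return groups


hits = find_experiments(experiments, "Mus musculus", "adult")
groups = split_by_treatment(hits)
compounds = {r["id"]: r["compound"] for r in hits if r["compound"]}
print([r["id"] for r in hits], groups, compounds)
```

The exact-match filters only work because every record uses the same term for the same concept, which is the role the slide assigns to a shared ontology.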

MO and Structured Queries

Future Work

• Convert to OWL
  – W3C standard ontology language
  – Expressivity
• Add terms to describe
  – Data transformation and normalization methods
  – Protocol types used by the Protein Data Bank

Future Work cont.

• Expand the MGED Extended Ontology by adding classes and terms to describe new domains and technologies
  – Toxicogenomics, ecotoxicogenomics and pharmacogenomics …
    • A public forum for developing internationally compatible and public infrastructure for reporting array-based toxicogenomics.
  – Proteomics Standards Initiative
    • Defines community standards for data representation in proteomics to facilitate data comparison, exchange and verification.

Links

• mged.org
• http://mged.sourceforge.net/ontologies/MGEDontology.php

The Computational View of Microarray Information

Need an ontology to unambiguously represent this information.

Issues to Discuss

• Burning Issues
  – Developing MO in synch with related efforts (MAGE-OM v.2.0)
  – Use/presentation in annotation forms
  – Coverage of other technologies and biological domains
• Flame retardant structure
  – ExtendedOntology
    • Space to add new classes, terms and their relationship to one another

Relationship of MO and MAGE-OM

Microarray Information to be Shared

Microarray Information to be Shared

[Figure labels: Experiment, Sample, RNA Extract, Labeled nucleic acid, Protocols, Hybridizations, Genes, Array Design, Microarray, Gene expression data matrix, normalization, integration]