mapping phenotype ontologies for obesity and diabetes

Post on 11-May-2015


Mapping Phenotype Ontologies

Chris Mungall

Monarch Initiative, http://monarchinitiative.org

Lawrence Berkeley National Laboratory

PhenoBridges Workshop 2013

Outline

• Problem: multiple ontologies of relevance to the obesity/diabetes domain
  – By species
  – By category
  – How can we bring these together?
• Bridging ontologies using OWL axioms
  – Enables cross-domain semantic queries
• Integrated ontology-data views in Monarch
• Challenges
  – Modeling strategies
  – Tools

Ontologies for phenotype and disease

Tools:
• OWLSim
• BOQA
• PhenoDigm / MouseFinder
• Phenomizer
• Phenomenet

We want to bridge species

Washington, N. L., Haendel, M. A., Mungall, C. J., Ashburner, M., Westerfield, M., & Lewis, S. E. (2009). Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation. PLoS Biol, 7(11). doi:10.1371/journal.pbio.1000247

Bridging across species requires bridging across ontologies

HP:0012093 Abnormality of endocrine pancreas physiology

??

MP:0009165 abnormal endocrine pancreas morphology

Mammalian Phenotype Ontology

Smith, C. L., Goldsmith, C.-A. W., & Eppig, J. T. (2005). The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biol, 6(1). doi:10.1186/gb-2004-6-1-r7

Used to annotate and query, in mice:
• Genotypes
• Alleles
• Genes

Human Phenotype Ontology

Robinson, P. N., Köhler, S., Bauer, S., Seelow, D., Horn, D., & Mundlos, S. (2008). The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. American Journal of Human Genetics, 83(5), 610-615. doi:10.1016/j.ajhg.2008.09.017

Used to annotate, in human:
• Patients
• Disorders
• Genotypes
• Genes
• Sequence variants

Issue: bridging across categories/perspectives

MP:0005217 abnormal pancreatic beta-cell morphology

?

GO:0003309 pancreatic B cell differentiation

MP: abnormal phenotypes

GO: ‘normal’ molecular, cellular and physiological processes

Phenotypes require more than “phenotype ontologies”

• GO: glucose metabolism (GO:0006006)
  – Gene/protein function data
• CHEBI: glucose (CHEBI:17234), pyruvate (CHEBI:15361)
  – Metabolomics, toxicogenomics data
• DISEASE: type II diabetes mellitus (DOID:9352)
  – Disease & phenotype data
• CL: pancreatic beta cell (CL:0000169)
  – Transcriptomic data

Bridging via lexical methods

• Approach
  – Create pairwise mappings between ontologies
  – Use lexical methods and/or curation
• Advantages:
  – Large body of tools and willing text miners
• Disadvantages:
  – Semantics-free
  – Machine doesn’t understand text
  – Inexact matches

MP / HP
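The "semantics-free" and "inexact matches" disadvantages can be illustrated with a minimal sketch of a lexical matcher. The token-overlap (Jaccard) score, the stop-word list, and the crude stemming rule are all assumptions chosen for illustration; they are not the method of any tool named in this talk.

```python
import re

# Words dropped before comparison; this stop list is an illustrative assumption.
STOP_WORDS = {"of", "the", "a", "an"}


def tokens(label):
    """Lowercase a label, split it into words, and map 'abnormal*' variants to one token."""
    words = re.findall(r"[a-z]+", label.lower())
    stemmed = {"abnormal" if w.startswith("abnormal") else w for w in words}
    return stemmed - STOP_WORDS


def jaccard(a, b):
    """Token-set overlap between two labels, from 0.0 (disjoint) to 1.0 (identical)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


hp_label = "Abnormality of endocrine pancreas physiology"  # HP:0012093
mp_label = "abnormal endocrine pancreas morphology"        # MP:0009165

score = jaccard(hp_label, mp_label)
print(f"lexical similarity: {score:.2f}")
```

Three of the five distinct tokens are shared, so the two labels look similar to a lexical matcher, yet one term is about physiology and the other about morphology. The score alone cannot say whether the terms are equivalent, related, or merely close in wording.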

Enhance lexical approach with OWL bridging axioms

• Key idea:
  – Describe the phenotype in a machine-interpretable way
    • Break it down into digestible chunks!
    • Logical definition
  – The machine will then be able to help you
    • Match phenotypes
    • Automate ontology checking and addition of new terms
• Approach:
  – Use Web Ontology Language (OWL), a description logic, to describe phenotypes
  – Use OWL reasoning to find connections

Mungall, C. J., Gkoutos, G., Washington, N., & Lewis, S. (2007). Representing Phenotypes in OWL. In C. Golbreich, A. Kalyanpur, & B. Parsia (Eds.), Proceedings of the OWLED 2007 Workshop on OWL: Experience and Directions. Innsbruck, Austria. http://www.webont.org/owled/2007/PapersPDF/paper_40.pdf

MP
• Uberon (anatomy)
• CL (cell types)
• PATO (qualities)

Class: ‘abnormal pancreatic beta cell mass’
  EquivalentTo: ‘abnormal phenotype’
    and has_entity some ‘type B pancreatic cell’
    and has_quality some mass

MP / HP

‘abnormal phenotype’ and has_entity some ‘type B pancreatic cell’ and has_quality some amount

‘abnormal phenotype’ and has_entity some ‘type B pancreatic cell’ and has_quality some ‘reduced amount’
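How a reasoner can connect two such expressions is sketched below in a deliberately simplified form. Each logical definition is reduced to an (entity, quality) pair, and subsumption is decided by walking a toy is_a hierarchy over qualities. The hierarchy is an assumption made for this example (it is not copied from PATO), and a real OWL reasoner such as Elk handles far more than this.

```python
# Toy is_a hierarchy over qualities (child -> parent). The labels follow the
# slide; the hierarchy itself is an assumption made for this example.
QUALITY_IS_A = {
    "reduced amount": "amount",
    "increased amount": "amount",
    "amount": "quality",
    "mass": "quality",
}


def ancestors(quality):
    """Return the quality followed by all of its is_a ancestors."""
    chain = [quality]
    while quality in QUALITY_IS_A:
        quality = QUALITY_IS_A[quality]
        chain.append(quality)
    return chain


def subsumes(general, specific):
    """True if the `general` (entity, quality) pair subsumes `specific`.

    Entities must match exactly here; a real reasoner would also walk
    the entity hierarchy (for example CL or Uberon).
    """
    g_entity, g_quality = general
    s_entity, s_quality = specific
    return g_entity == s_entity and g_quality in ancestors(s_quality)


any_amount = ("type B pancreatic cell", "amount")
reduced_amount = ("type B pancreatic cell", "reduced amount")
any_mass = ("type B pancreatic cell", "mass")

print(subsumes(any_amount, reduced_amount))  # the 'amount' expression is the more general one
print(subsumes(reduced_amount, any_amount))
print(subsumes(any_amount, any_mass))
```

Under these assumptions the expression with ‘reduced amount’ is inferred to be a subclass of the expression with amount, while the mass-based definition stays unrelated to both. The connection follows from the definitions rather than from the wording of the labels.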

Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2

Köhler, S., Doelken, S. C., Ruef, B. J., Bauer, S., Washington, N., Westerfield, M., Gkoutos, G., et al. (2013). Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Research, 1–12. doi:10.3410/f1000research.2-30.v1

7181 / 9022 MP terms are described

Phenotypes to metabolites

glucose homeostasis (GO:0042593) ≡ homeostasis (GO:0042592) ⊓ ∃has_participant.glucose (CHEBI:17234)

http://wiki.geneontology.org/index.php/Ontology_extensions

abnormal glucose homeostasis (MP:0002078)
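Once such axioms exist, a phenotype term can be linked to a metabolite by following them. The sketch below treats the axioms as edges in a small graph and searches breadth-first. The two GO edges restate the definition on this slide; the edge from MP:0002078 to GO:0042593 and its relation name `affects_process` are hypothetical, since the transcript does not show how the MP term is defined.

```python
from collections import deque

# Edges as (subject, relation, object). The two GO edges restate the slide;
# the MP -> GO edge and its relation name are assumptions for illustration.
EDGES = [
    ("MP:0002078", "affects_process", "GO:0042593"),  # hypothetical relation name
    ("GO:0042593", "is_a", "GO:0042592"),
    ("GO:0042593", "has_participant", "CHEBI:17234"),
]


def reachable(start, prefix):
    """Breadth-first search from `start`; return reachable IDs that begin with `prefix`."""
    seen = {start}
    queue = deque([start])
    hits = []
    while queue:
        node = queue.popleft()
        for subj, _relation, obj in EDGES:
            if subj == node and obj not in seen:
                seen.add(obj)
                queue.append(obj)
                if obj.startswith(prefix):
                    hits.append(obj)
    return hits


chemicals = reachable("MP:0002078", "CHEBI:")
print(chemicals)
```

With these three edges the search reaches glucose (CHEBI:17234) from abnormal glucose homeostasis, which is the kind of cross-domain query the bridging axioms are meant to enable.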

Linking cell types to proteins via GO

insulin secretion (GO:0030073) ≡ secretion (GO:0046903) ⊓ ∃has_output.insulin (PR:000009054)

pancreatic beta cell (CL:0000169) ⊑ ∃capable_of.insulin secretion (GO:0030073)

insulin (PR:000009054): INS_HUMAN - P01308

Meehan, T., Masci, A. M., Abdulla, A., Cowell, L., Blake, J., Mungall, C. J., & Diehl, A. (2011). Logical Development of the Cell Ontology. BMC Bioinformatics, 12(1), 6. doi:10.1186/1471-2105-12-6

Uberon bridges single species anatomy ontologies

Mungall, C. J., Torniai, C., Gkoutos, G. V., Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5

Lexical methods
• Obol: grammar approach
• Entity matching

Curation
• Edit bridge files
• Edit source ontologies

OWL Reasoning
• Elk
• GULO
• Jenkins

Iterative development and deployment

Köhler, S., Bauer, S., Mungall, C. J., Carletti, G., Smith, C. L., Schofield, P., Gkoutos, G. V., et al. (2011). Improving ontologies by automatic reasoning and evaluation of logical definitions. BMC Bioinformatics, 12(1), 418. doi:10.1186/1471-2105-12-418

Mungall, C. J., Dietze, H., Carbon, S. J., Ireland, A., Bauer, S., & Lewis, S. (2012). Continuous Integration of Open Biological Ontology Libraries. Bio-Ontologies 2012. http://bio-ontologies.knowledgeblog.org/405

Integrated views in Monarch

http://monarchinitiative.org

Linking model systems to human diseases

• GO: glucose metabolism (GO:0006006)
  – Gene/protein function data
• CHEBI: glucose (CHEBI:17234), pyruvate (CHEBI:15361)
  – Metabolomics, toxicogenomics data
• DISEASE/PHENOTYPE: type II diabetes mellitus (DOID:9352)
  – Disease & phenotype data
• CL: pancreatic beta cell (CL:0000169)
  – Transcriptomic data

Roadblocks and pitfalls

• Lack of tool support for ontology development
• Many tools for ‘mapping after the fact’
  – Mapping should not be retrospective
  – Must be integrated into ontology development lifecycle
• OWL modeling pitfalls
  – Over-modeling
  – Under-modeling

Overcoming ontology development bottlenecks with TermGenie

• Developed for GO
  – Instant compositional terms for curators
  – OWL axioms are added at time of term creation
• We are rolling out pheno-ontology instances
  – Trial run on FYPO and plant traits

http://termgenie.org
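The idea of adding OWL axioms at the time of term creation can be sketched as a simple template. This is not TermGenie's API or implementation; `compose_term` and its parameters are hypothetical names, and the template simply follows the entity/quality pattern of the ‘abnormal pancreatic beta cell mass’ definition earlier in the talk.

```python
def compose_term(entity_name, entity_class, quality):
    """Build a label and a Manchester-style logical definition from one template.

    entity_name  -- wording used in the human-readable label
    entity_class -- label of the ontology class used in the axiom
    quality      -- label of the quality class
    """
    label = f"abnormal {entity_name} {quality}"
    equivalent_to = (
        "'abnormal phenotype'"
        f" and has_entity some '{entity_class}'"
        f" and has_quality some '{quality}'"
    )
    return {"label": label, "equivalent_to": equivalent_to}


term = compose_term(
    entity_name="pancreatic beta cell",
    entity_class="type B pancreatic cell",
    quality="mass",
)
print(term["label"])
print(term["equivalent_to"])
```

Because the label and the axiom come from the same inputs, every new term arrives with its logical definition already in place, so no mapping step is needed afterwards.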

Modeling confusion and analysis paralysis

• absent pancreatic beta cells (MP:0009174)
  – Tempting to use OWL cardinality
  – Does this represent the biology?
• decreased pancreatic beta cell number (MP:0003339)
  – Can’t do this with OWL cardinality!
• Lesson: don’t over-model in OWL
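One way to read the contrast between the two terms: an OWL cardinality restriction states a fixed non-negative integer bound, so "zero beta cells" can be written down, whereas "decreased number" is relative to a reference and has no fixed integer. The sketch below illustrates only that reading; the function names and the cell counts are made up for the example.

```python
def is_absent(count):
    """Absolute statement: compares against the fixed bound zero."""
    return count == 0


def is_decreased(count, reference_count):
    """Relative statement: the outcome depends on the reference, not on a fixed integer."""
    return count < reference_count


observed = 500  # made-up beta cell count

print(is_absent(observed))
print(is_decreased(observed, reference_count=1000))
print(is_decreased(observed, reference_count=400))
```

The same observed count is "decreased" against one reference and not against another, so no single cardinality bound captures the phenotype. This is one reason to prefer a quality-based description over a cardinality restriction.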

Modeling temporal progression

• How did there come to be absence of beta cells in the pancreas?

• What are the downstream effects?
• Changes with age
  – Hyperglycemic → hypoglycemic
• Existing phenotype ontologies steer clear of causality
  – Next frontier

What I haven’t talked about

• Quantitative phenotypes
• Assay vs phenotype
• Behavioral phenotypes
• Environments
• Mining disease phenotypes from the literature
• Clinical vocabularies (see Nathalie’s talk)
• Modeling other model systems
• The data!
• Making use of the data and OWL axioms for analysis (see Damian’s talk)
• …a lot more

Questions/Summary

• Approaches to mapping
  – OWL bridging axioms
• Roadblocks and pitfalls
  – OWL modeling analysis paralysis
  – Lack of tool support
  – Need to push upstream in ontology engineering lifecycle
  – Modeling complex phenomena
    • From observation to temporal progression and models of causality
• Tools
  – CrossSpeciesPheno
  – Available:
    • GULO, TermGenie, OBO-Edit, Protégé 4, OWL Reasoners, Onto-Jenkins
  – Required: integration upstream

Acknowledgments

• Charité
  – Sebastian Köhler
  – Sandra Doelken
  – Sebastian Bauer
  – Peter Robinson
• U of Oregon
  – Barbara Ruef
  – Monte Westerfield
• OHSU
  – Carlo Torniai
  – Nicole Vasilevsky
  – Shahim Essaid
  – Matt Brush
  – Melissa Haendel
• Sanger
  – Anika Oellrich
  – Damian Smedley
• University of Cambridge
  – George Gkoutos
  – Rob Hoehndorf
  – Paul Schofield
• Lawrence Berkeley
  – Nicole Washington
  – Suzanna Lewis
• UCSD
  – Amarnath Gupta
  – Jeff Grethe
  – Anita Bandrowski
  – Maryann Martone
• U of Pitt
  – Chuck Borromeo
  – Harry Hochheiser
• JAX
  – Terry Meehan
  – Cynthia Smith
