EA 3888 – Conceptual Modeling of Biomedical Knowledge Faculty of Medicine - University of Rennes 1 http://www.ea3888.univ-rennes1.fr Integrating and querying disease and pathway ontologies: building an OWL model and using RDFS queries Julie Chabalier, Olivier Dameron, Anita Burgun

Post on 28-Mar-2015


Page 1: EA 3888 – Conceptual Modeling of Biomedical Knowledge Faculty of Medicine - University of Rennes 1  Integrating and querying

EA 3888 – Conceptual Modeling of Biomedical Knowledge
Faculty of Medicine - University of Rennes 1

http://www.ea3888.univ-rennes1.fr

Integrating and querying disease and pathway ontologies:

building an OWL model and using RDFS queries

Julie Chabalier, Olivier Dameron, Anita Burgun

Page 2

EA 3888 – University of Rennes 1

Introduction

Disease description in current medical ontologies

• Clinical features

• Etiology

• Location

• Morphology

Example: SNOMED Clinical Terms® (SNOMED CT®)

[Figure: a SNOMED CT Disease concept linked to its attributes: definitional manifestation, causative agent, finding site, associated morphology]

http://www.snomed.org/

Page 3

Introduction

Characterization of diseases: biological knowledge required

• Genes: a gene mutation may result in a disease

• Metabolic pathways: a pathway may be shared by different phenotypes

• Biological processes: different processes may explain different grades of a disease

Biological knowledge is absent from medical ontologies

Page 4

Objectives

Integration of disease and pathway ontologies

• Ontology integration
- Identify candidate ontologies
- Get candidate ontologies in an adequate formalism
- Integrate formalized ontologies

• Querying the resulting ontology
- Consistency checking
- Exploiting biomedical knowledge

Page 5

Candidate ontologies

KEGG Orthology (KO) hierarchy

• Organization of metabolic pathway and disease maps in the KEGG knowledge base

• DAG of four levels

Page 6

Candidate ontologies

Gene Ontology (GO)

~ 20000 terms organized according to 3 hierarchies:

- Molecular Function

- Cellular Component

- Biological Process

Used to enrich the KO pathway definitions

Page 7

Candidate ontologies

SNOMED-CT: clinical description of diseases

[Figure: SNOMED CT excerpt showing the disease concepts Alzheimer's disease, Dementia, Organic mental disorder, Intracranial glioma, Neoplasm of brain and Disorder of brain, with findingSite relations to the anatomical concepts Brain structure and Cerebral structure]

Used to enrich the KO disease definitions

Page 8

Formalism

OWL as a common formalism

• Unambiguous combination of several ontologies (URI, namespaces)

• Defined semantics

• Expressiveness (e.g. disjointness)

Getting candidate ontologies in OWL-DL

• KO: conversion of the 3 upper levels (available in text)

• GO: extraction of Biological Process hierarchy (available in OWL)

• SNOMED: extraction and conversion of the relevant concepts and relations (from UMLS)
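
As a rough illustration of the KO conversion step, the sketch below turns an indented text hierarchy into OWL class declarations in Turtle syntax. The input layout (two spaces per level), the `ko:` prefix with its `example.org` namespace, and the function names are assumptions made for this example; the slides show neither the actual KO text format nor the converter that was used. The sketch also treats the hierarchy as a tree, which is a simplification of a DAG.

```python
import re

# Hypothetical namespace, chosen for this example only.
KO_NS = "http://example.org/ko#"

def local_name(label):
    """Make a label usable as a local name: 'Carbohydrate metabolism' -> 'Carbohydrate_metabolism'."""
    return re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_")

def hierarchy_to_turtle(text, max_depth=3):
    """Convert an indented hierarchy (two spaces per level) to Turtle, keeping the top max_depth levels."""
    out = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        f"@prefix ko: <{KO_NS}> .",
        "",
    ]
    path = []  # names of the classes on the current branch, root first
    for raw in text.splitlines():
        if not raw.strip():
            continue
        depth = (len(raw) - len(raw.lstrip(" "))) // 2
        if depth >= max_depth:
            continue  # deeper levels are not converted
        label = raw.strip()
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        del path[depth:]  # drop classes that are not ancestors of this line
        out.append(f"ko:{local_name(label)} a owl:Class ;")
        if path:
            out.append(f'    rdfs:label "{escaped}" ;')
            out.append(f"    rdfs:subClassOf ko:{path[-1]} .")
        else:
            out.append(f'    rdfs:label "{escaped}" .')
        path.append(local_name(label))
    return "\n".join(out)

# Toy input: the first three labels appear in the slides, the last line is a placeholder.
SAMPLE = """Metabolism
  Carbohydrate metabolism
    Fructose and mannose metabolism
      (fourth-level entry, not converted)
"""
TURTLE = hierarchy_to_turtle(SAMPLE)
```

Printing `TURTLE` gives one `owl:Class` declaration per converted line, each with an `rdfs:label` and, below the root, an `rdfs:subClassOf` link to its parent.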

Page 9

Ontology integration

Setting up relationships between ontologies

• Aligning: defining relationships between terms (is-a, part-of, etc.)

• Mapping: defining equivalence relationships between terms

Page 10

Integration framework

[Figure: the BioMed Ontology combines three sources: GO Biological Processes (pathway descriptions), KO Pathways and Diseases (disease and pathway descriptions), and SNOMED Diseases (disease descriptions)]

Page 11

Mapping GO processes – KO pathways

MetaMap program*: lexical mapping (labels and synonyms)

[Figure: KO terms (Metabolism, Carbohydrate metabolism, Fructose and mannose metabolism) shown beside GO terms (Metabolism, Macromolecule metabolism, Carbohydrate metabolism)]

*Aronson, A.R. (2001) Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program, Proceedings of the AMIA Symp., 17-21
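
To make the idea of lexical mapping concrete, here is a minimal sketch that pairs KO and GO terms whose labels or synonyms are identical after normalization. It is a deliberately simple stand-in, not MetaMap: MetaMap handles far more lexical variation. The identifiers and the extra synonym are assumptions for the example; only the labels come from the slide.

```python
def normalize(term):
    """Lowercase and collapse whitespace so that labels compare loosely."""
    return " ".join(term.lower().split())

def lexical_map(ko_terms, go_terms):
    """Return sorted (ko_id, go_id) pairs whose labels or synonyms match.

    Both arguments map an identifier to a list of names
    (label first, then synonyms).
    """
    index = {}
    for go_id, names in go_terms.items():
        for name in names:
            index.setdefault(normalize(name), set()).add(go_id)
    pairs = set()
    for ko_id, names in ko_terms.items():
        for name in names:
            for go_id in index.get(normalize(name), ()):
                pairs.add((ko_id, go_id))
    return sorted(pairs)

# Toy data: identifiers are invented, the synonym is added for illustration.
KO_TERMS = {
    "ko:metabolism": ["Metabolism"],
    "ko:carbohydrate": ["Carbohydrate metabolism"],
    "ko:fructose_mannose": ["Fructose and mannose metabolism"],
}
GO_TERMS = {
    "go:metabolism": ["Metabolism", "metabolic process"],
    "go:macromolecule": ["Macromolecule metabolism"],
    "go:carbohydrate": ["Carbohydrate metabolism"],
}
MAPPINGS = lexical_map(KO_TERMS, GO_TERMS)
```

With this toy data the composite KO term "Fructose and mannose metabolism" stays unmapped, because no GO label matches it as a whole.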

Page 12

Aligning GO processes – KO pathways

[Figure: GO terms (Carbohydrate metabolism, Cellular carbohydrate metabolism, Monosaccharide metabolism, Hexose metabolism, Fructose metabolism, Mannose metabolism) shown beside KO terms (Carbohydrate metabolism, Fructose and mannose metabolism)]

GO: atomic concepts

KO: composite concepts

Patterns to segment and recompose KO terms before the mapping

KO: Fructose and mannose metabolism → Fructose metabolism, mannose metabolism
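
The segment-and-recompose step can be sketched with a single pattern. This is a minimal example assuming that the shared head is the last word of the term and that the modifiers are joined by commas and "and"; the patterns actually used are not given in the slides, and `segment_term` is a name invented for this example.

```python
import re

def segment_term(term):
    """Split 'A and B head' (or 'A, B and C head') into ['A head', 'B head', ...].

    The shared head is assumed to be the last word. Terms that do not match
    the pattern are returned unchanged in a one-element list.
    """
    match = re.fullmatch(r"(.+?)\s+and\s+(\S+)\s+(\S+)", term)
    if not match:
        return [term]
    left, last, head = match.groups()
    modifiers = [m.strip() for m in left.split(",") if m.strip()] + [last]
    parts = [f"{m} {head}" for m in modifiers]
    # Capitalize the first letter only, leaving the rest of each term untouched.
    return [p[0].upper() + p[1:] for p in parts]

ATOMIC = segment_term("Fructose and mannose metabolism")
```

Each recomposed term can then be compared with the atomic GO terms, for example "Fructose metabolism" and "Mannose metabolism".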

Page 13

Mapping & aligning GO processes – KO pathways

[Figure: the GO terms Carbohydrate metabolism, Cellular carbohydrate metabolism, Monosaccharide metabolism, Hexose metabolism, Fructose metabolism and Mannose metabolism, linked to the KO terms Carbohydrate metabolism and Fructose and mannose metabolism]

Page 14

Mapping of KO diseases and SNOMED diseases

MetaMap program

[Figure: KO hierarchy (Human diseases > Neurodegenerative disorders > Alzheimer's disease) shown beside SNOMED (SN) concepts (Disorder of brain, Organic mental disorder, Dementia, Alzheimer's disease)]

Page 15

Alignment of pathways and diseases

• Alignment: inferring relationships between:

1 - GO processes and KO diseases

2 - KO pathways and KO diseases

• Condition of alignment: if at least one gene is involved in both a disease D and a pathway P, then D hasPathway P
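
The alignment condition amounts to a set intersection on gene lists. The sketch below is one way to write it; the function name and the second pathway are assumptions for the example, and the gene names are the placeholders used in the slides.

```python
def infer_has_pathway(disease_genes, pathway_genes):
    """Return {disease: sorted pathways sharing at least one gene with it}."""
    relations = {}
    for disease, d_genes in disease_genes.items():
        shared = [
            pathway
            for pathway, p_genes in pathway_genes.items()
            if set(d_genes) & set(p_genes)  # non-empty intersection
        ]
        if shared:
            relations[disease] = sorted(shared)
    return relations

DISEASE_GENES = {"Alzheimer's disease": ["gene1", "gene2"]}
PATHWAY_GENES = {
    "Glycolysis/Gluconeogenesis": ["gene1", "gene3"],
    "Unrelated pathway": ["gene4"],  # invented, shares no gene with the disease
}
HAS_PATHWAY = infer_has_pathway(DISEASE_GENES, PATHWAY_GENES)
```

Here `HAS_PATHWAY` links the disease to "Glycolysis/Gluconeogenesis" only, because gene1 is the single gene the two annotations have in common.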

Page 16

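Alignment of GO processes and KO diseases

[Figure: the genes of a KO disease are converted to Uniprot ids with the KEGG mapping (KEGG geneId - Uniprot id), and the Uniprot ids to GO ids with GOA; the disease is linked to the resulting GO biological processes by hasPathway (relationship 1)]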
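
The chain from disease genes to GO processes can be pictured as two successive lookups. This is a minimal sketch with in-memory dictionaries standing in for the KEGG gene-to-Uniprot mapping and for the GOA annotations; all identifiers are placeholders, and the way the real files were parsed is not described in the slides.

```python
def go_processes_for_disease(disease_genes, kegg_to_uniprot, goa_annotations):
    """Follow KEGG gene id -> Uniprot id -> GO id and collect the GO ids reached."""
    go_ids = set()
    for gene in disease_genes:
        for uniprot_id in kegg_to_uniprot.get(gene, ()):
            go_ids.update(goa_annotations.get(uniprot_id, ()))
    return sorted(go_ids)

# Placeholder identifiers, invented for the example.
KEGG_TO_UNIPROT = {"kegg:gene1": ["uniprot:X1"], "kegg:gene2": ["uniprot:X2"]}
GOA = {
    "uniprot:X1": ["GO:process_a", "GO:process_b"],
    "uniprot:X2": ["GO:process_b"],
}
GO_FOR_DISEASE = go_processes_for_disease(["kegg:gene1", "kegg:gene2"], KEGG_TO_UNIPROT, GOA)
```

Each GO id in the result is a candidate target of a hasPathway relationship from the disease. Genes without a Uniprot mapping, or proteins without annotations, simply contribute nothing.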

Page 17

Alignment of KO pathways and KO diseases

[Figure: in KO, the pathway Glycolysis/Gluconeogenesis (under Metabolism > Carbohydrate metabolism) is associated with gene1 and gene3, and the disease Alzheimer's disease (under Human diseases > Neurodegenerative disorders) with gene1 and gene2; a hasPathway relationship links the disease to the pathway]

Page 18

Integration result

BioMed Ontology

13982 classes:

• 13555 classes from GO

• 281 classes from KO
- 252 pathway classes
- 19 disease classes

• 146 classes from SNOMED

Page 19

Integration results

• 144 KO pathways associated with GO processes (57%)

• 15 KO diseases associated with SNOMED Diseases (94%)

• 15 KO diseases associated with 836 distinct pathways (GO & KO)

3144 hasPathway relationships

BioMed Ontology

Page 20

Querying the BioMed Ontology

Exploiting knowledge and checking consistency

• Taking into account the explicit relationships

• RDFS is sufficient

RDF query language: SeRQL

• Implementation of SeRQL in Sesame is able to exploit RDFS semantics

• Exploitation of explicit relationships

Page 21

SeRQL queries

Example of an exploitation query

Which pathways are shared by 2 neurological disorders: glioma & Alzheimer’s disease?

SELECT DISTINCT Pathway, label(PathwayName)
FROM {kpath:ko05010} rdfs:subClassOf {SuperClass},
     {SuperClass} rdf:type {owl:Restriction},
     {SuperClass} owl:onProperty {ea3888hp:hasPathway},
     {SuperClass} owl:someValuesFrom {Pathway},
     {Pathway} rdfs:label {PathwayName}
INTERSECT
SELECT DISTINCT Pathway, label(PathwayName)
FROM {kpath:ko05214} rdfs:subClassOf {SuperClass},
     {SuperClass} rdf:type {owl:Restriction},
     {SuperClass} owl:onProperty {ea3888hp:hasPathway},
     {SuperClass} owl:someValuesFrom {Pathway},
     {Pathway} rdfs:label {PathwayName}
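
To show what the path expressions of this query match, the sketch below re-creates the same pattern over a small in-memory set of triples and intersects the two result sets. It is plain Python rather than SeRQL or Sesame, and the blank-node names, the `ex:` identifiers and the assignment of pathways to diseases are assumptions for the example.

```python
SUBCLASS, TYPE, LABEL = "rdfs:subClassOf", "rdf:type", "rdfs:label"
ON_PROPERTY, SOME_VALUES = "owl:onProperty", "owl:someValuesFrom"
HAS_PATHWAY = "ea3888hp:hasPathway"

def objects(graph, subject, predicate):
    """All o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}

def pathways_of(graph, disease):
    """(pathway, label) pairs reached through someValuesFrom restrictions on hasPathway."""
    found = set()
    for superclass in objects(graph, disease, SUBCLASS):
        if "owl:Restriction" not in objects(graph, superclass, TYPE):
            continue
        if HAS_PATHWAY not in objects(graph, superclass, ON_PROPERTY):
            continue
        for pathway in objects(graph, superclass, SOME_VALUES):
            for label in objects(graph, pathway, LABEL):
                found.add((pathway, label))
    return found

def restriction(node, disease, pathway):
    """Triples stating: disease subClassOf (hasPathway some pathway)."""
    return {
        (disease, SUBCLASS, node),
        (node, TYPE, "owl:Restriction"),
        (node, ON_PROPERTY, HAS_PATHWAY),
        (node, SOME_VALUES, pathway),
    }

# Toy graph: node names, ex: identifiers and pathway assignments are invented.
GRAPH = (
    restriction("_:r1", "kpath:ko05010", "ex:mapk")
    | restriction("_:r2", "kpath:ko05010", "ex:other")
    | restriction("_:r3", "kpath:ko05214", "ex:mapk")
    | {
        ("ex:mapk", LABEL, "MAPK signaling pathway"),
        ("ex:other", LABEL, "Pathway attached to one disease only"),
    }
)
SHARED = pathways_of(GRAPH, "kpath:ko05010") & pathways_of(GRAPH, "kpath:ko05214")
```

The set intersection plays the role of `INTERSECT`: only pathways reachable from both disease classes remain in `SHARED`.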

Page 22

Query results

Which pathways are shared by 2 neurological disorders: glioma & Alzheimer’s disease?

37 pathways:

• MAPK signaling pathway
• Focal adhesion
• Insulin signaling pathway
• Melanogenesis
• B cell receptor signaling pathway
• heart development
• central nervous system development
• axon guidance
• peptidyl-serine phosphorylation
• protein amino acid phosphorylation
• cell cycle
• cell-cell signaling
• cell cycle arrest
• lipid catabolic process
• lipid metabolic process
• ubiquitin cycle
• transport
• ErbB signaling pathway
• Wnt signaling pathway
• protein tetramerization
• intracellular signaling cascade
• protein modification process
• glycogen metabolic process
• anagen
• induction of apoptosis
• negative regulation of apoptosis
• apoptosis
• anti-apoptosis
• Natural killer cell mediated cytotoxicity
• cell proliferation
• DNA replication
• chromosome organization and biogenesis
• calcium ion homeostasis
• signal transduction
• response to UV
• negative regulation of cell growth
• cytoskeleton organization and biogenesis

Page 23

Query results

By leveraging the pathway hierarchy: 66 pathways (37 + 29)

[Figure: Alzheimer’s disease and Glioma, each linked by hasPathway to one of two hierarchically related processes: Intracellular protein transport and Protein transport into nucleus, translocation]
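
One plausible reading of "leveraging the pathway hierarchy" is that each disease also inherits the superclasses of its pathways, so that a general pathway attached to one disease can match a more specific one attached to the other. The sketch below follows that reading; which disease carries which pathway, and the direct subclass link between the two processes, are assumptions for the example.

```python
def ancestors(term, parents):
    """All superclasses of term, following the parents map transitively."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen

def with_ancestors(pathways, parents):
    """The given pathways plus every superclass of each of them."""
    expanded = set(pathways)
    for pathway in pathways:
        expanded |= ancestors(pathway, parents)
    return expanded

def shared_pathways(first, second, parents):
    """Pathways common to both diseases once the hierarchy is taken into account."""
    return with_ancestors(first, parents) & with_ancestors(second, parents)

# Assumed hierarchy and assignments, for illustration only.
PARENTS = {"Protein transport into nucleus, translocation": ["Intracellular protein transport"]}
ALZHEIMER = {"Intracellular protein transport"}
GLIOMA = {"Protein transport into nucleus, translocation"}

DIRECT = ALZHEIMER & GLIOMA
VIA_HIERARCHY = shared_pathways(ALZHEIMER, GLIOMA, PARENTS)
```

In this toy case the direct intersection is empty, while the hierarchy-aware comparison reports "Intracellular protein transport" as shared.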

Page 24

Query results

Example of a consistency query:

• Detect whether a specific pathway and a more general one are associated with the same disease

[Figure: Disease1 linked by hasPathway to both Pathway1 and Pathway2, one of which is more general than the other]

Removal of redundant relationships
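
The consistency check can be sketched as a search for pairs of pathways of the same disease where one is a superclass of the other. The example assumes that the more general relationship is the redundant one, since it follows from the specific one through the hierarchy; the slides do not say which of the two is removed. Function names and the toy data are assumptions.

```python
def ancestors(term, parents):
    """All superclasses of term, following the parents map transitively."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def redundant_relations(has_pathway, parents):
    """Sorted (disease, general, specific) triples where both pathways are attached to the disease."""
    found = []
    for disease, pathways in has_pathway.items():
        attached = set(pathways)
        for specific in pathways:
            for general in ancestors(specific, parents) & attached:
                found.append((disease, general, specific))
    return sorted(found)

def remove_redundant(has_pathway, parents):
    """Drop each general pathway that is implied by a more specific one."""
    redundant = {(d, g) for d, g, _ in redundant_relations(has_pathway, parents)}
    return {
        disease: sorted(p for p in pathways if (disease, p) not in redundant)
        for disease, pathways in has_pathway.items()
    }

# Toy hierarchy built from GO labels shown earlier; the disease data is invented.
PARENTS = {
    "Fructose metabolism": ["Hexose metabolism"],
    "Hexose metabolism": ["Monosaccharide metabolism"],
}
HAS_PATHWAY = {"Disease1": ["Fructose metabolism", "Monosaccharide metabolism", "cell cycle"]}

REDUNDANT = redundant_relations(HAS_PATHWAY, PARENTS)
CLEANED = remove_redundant(HAS_PATHWAY, PARENTS)
```

`REDUNDANT` reports the general/specific pair, and `CLEANED` keeps only the most specific pathways for each disease.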

Page 25

Conclusion

BioMed Ontology project

Integration

• Automatic method for integrating biomedical ontologies
- Deals with the huge quantity of biomedical data
- Takes into account the frequent updates of biomedical sources

• BioMed ontology
- Integrates 3 biomedical ontologies (KO, GO, SNOMED)
- Takes into account the formal evolution of the biomedical ontologies (OWL)

Querying

• RDFS queries are enough:
- to detect some basic inconsistencies of the BioMed ontology
- to exploit the BioMed ontology

Page 26

Perspectives

• Biological evaluation: study of glioma

• Increase the number of integrated biomedical sources (e.g. OMIM, BioPax)

• Improve the mapping/alignment techniques by taking into account the semantics in the patterns

• Associate a degree of confidence to the Disease/Pathway relationships (based for example on the GO evidence code)

Page 27

BioMed ontology project :

http://www.ea3888.univ-rennes1.fr/biomed_ontology/