OntoQuest: Exploring Ontological Data Made Easy
Authors: Li Chen, Maryann Martone, Amarnath Gupta, Lisa Fong, Mona Wong-Barnum
Background
Many application domains in the natural sciences are rapidly building ontologies
• To attempt to standardize the vocabulary of their domains
• To record known relationships that have been established from years of scientific research in the discipline
• To use the ontology as the common framework to exchange, assimilate and compare information
  • Experimental data collected by research groups
  • Curated data compiled from the literature
• To establish relationships with data and ontologies from other domains to achieve interoperability and information integration
The Problem / Requirement
Need a system
• To explore the ontology itself
• To relate the terms and relationships in an ontology to data sources
• To explore multiple data sources as part of the ontology exploration process
• To update the databases through the ontology exploration tool
• To update the ontology and propagate the effects of the update to the mappings between data sources and the ontology
The Problem / Requirement
Need a system
• To explore the ontology itself (OWL)
• To relate the terms and relationships in an ontology to data sources (RDBMS, RDF, XML)
• To explore multiple data sources as part of the ontology exploration process (instance inference)
• To update the databases through the ontology exploration tool (instance inference triggered by update)
• To update the ontology and propagate the effects of the update to the mappings between data sources and the ontology (mapping change triggered by update)
OntoQuest
• Ingests any OWL-expressed ontology
  • Uses IBM’s IODT tool (modified) to shred the OWL ontology to a schema
  • Instances of ontology classes may reside locally or be accessed from remote sources
• Provides the ability for ontology exploration
  • By traversal of any transitive relationship
  • By SPARQL queries
• Allows data exploration through ontology classes
• Allows single instance updates
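Exploration "by traversal of any transitive relationship" can be pictured as a reachability walk over the edges of that relationship. The following is only a minimal sketch of that idea, not OntoQuest's implementation: the edge table, the function name `ancestors`, and the specific subclass links are illustrative assumptions (the class names are borrowed from the demo ontology, but the links between them are made up for the example).

```python
from collections import deque

# Hypothetical toy edges (term -> direct parents) for one transitive
# relationship, here "subclass of". The links are illustrative only and
# are not taken from the actual ontology.
SUBCLASS_OF = {
    "Microglia": ["Glia"],
    "Macroglia": ["Glia"],
    "Glia": ["Nerve Cell"],
    "Neuron": ["Nerve Cell"],
}


def ancestors(term, edges):
    """Return every node reachable from `term` by following `edges` transitively."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:  # guard against revisiting nodes
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(ancestors("Microglia", SUBCLASS_OF)))  # ['Glia', 'Nerve Cell']
```

The same walk applies to any other transitive relationship (for example has-a) by passing a different edge table.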
OntoQuest Builds on IODT
• Our system is developed on top of an IBM integrated ontology toolkit, which
  • Implements a high-performance ontology repository built on a relational database
  • Supports a subset of W3C’s OWL and the SPARQL query language
  • Uses a description logic reasoner for class-level inference and a set of logic rules translated from DLP for instance-level inference
    • Hence, inference completeness and soundness on DLP can be guaranteed
  • Has a back-end database schema designed for efficient querying and inference, with performance superior to Jena, Sesame, etc.
[Architecture diagram. Labels: Biologist-Friendly GUI; SKIL APIs; Query Mediator; Cache; Updater; Reasoner; IBM ToolKit; SQL]
System Development Facts
• OntoQuest has a domain-user-friendly GUI and a library of customized APIs
  • Updater: enables inserting classes and instances incrementally into the ontology repository
  • Query Mediator: forms the user’s request as a query against the global view; decomposes it into sub-queries in the form of SQL and SPARQL and sends them to CCDB and CKB; reassembles the results and renders an appropriate view (e.g., graphic) for the user
  • Reasoner: executes rules to compute indirect class memberships and properties
  • Cache: further enhances system efficiency by caching or prefetching frequent query results
• The system is still under development; some of the functionality is incomplete or needs to be improved, e.g., propagation of ontology updates
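The Query Mediator's decompose, dispatch, and reassemble cycle can be illustrated with two stand-in sources. This is a hedged sketch only: the table names (`instance`, `image`), the column names, the sample rows, and the function names `make_sources` and `mediate` are all assumptions, both sub-queries are written in SQL for simplicity (the slides mention SQL and SPARQL), and in-memory SQLite databases merely play the roles of CKB and CCDB.

```python
import sqlite3


def make_sources():
    """Build two in-memory databases standing in for CKB and CCDB (hypothetical schemas)."""
    ckb = sqlite3.connect(":memory:")
    ckb.execute("CREATE TABLE instance (id TEXT PRIMARY KEY, cls TEXT)")
    ckb.executemany(
        "INSERT INTO instance VALUES (?, ?)",
        [("Dendritic_Tree_1", "Dendrite"), ("Axon_1", "Axon")],
    )
    ccdb = sqlite3.connect(":memory:")
    ccdb.execute("CREATE TABLE image (instance_id TEXT, image_ref TEXT)")
    ccdb.executemany(
        "INSERT INTO image VALUES (?, ?)",
        [("Dendritic_Tree_1", "img-001")],
    )
    return ckb, ccdb


def mediate(cls, ckb, ccdb):
    """Answer 'instances of cls with their images' by querying each source separately."""
    # Sub-query 1: ask the curated store which instances belong to the class.
    ids = [row[0] for row in ckb.execute(
        "SELECT id FROM instance WHERE cls = ? ORDER BY id", (cls,))]
    # Sub-query 2: ask the experimental store for images of each instance,
    # then reassemble both answers into one result keyed by instance id.
    merged = {}
    for instance_id in ids:
        merged[instance_id] = [row[0] for row in ccdb.execute(
            "SELECT image_ref FROM image WHERE instance_id = ? ORDER BY image_ref",
            (instance_id,))]
    return merged


ckb, ccdb = make_sources()
print(mediate("Dendrite", ckb, ccdb))
```

A real mediator would also need to translate the global-view query into each source's own vocabulary; that step is omitted here.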
Data Integration with OntoQuest
For every class,
• the ids of the instances of the class are tracked from the respective data stores and maintained locally
• a mapping is used to fetch instances of the class from the relevant store to a local instance store on demand
• only the properties that are associated with the ontology classes are retrieved in a GAV fashion
• all other properties are obtained (for now) only by allowing the user to query the data source directly
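The per-class bookkeeping described above (locally tracked ids, a mapping that exposes only ontology-relevant properties, and fetching on demand) might look roughly like the sketch below. Everything named here is hypothetical: `REMOTE_STORE`, `CLASS_MAPPING`, `TRACKED_IDS`, the class `LocalInstanceStore`, and the sample property names and values are invented for illustration and do not describe OntoQuest's actual data structures.

```python
# Hypothetical remote source: rows keyed by instance id (values are made up).
REMOTE_STORE = {
    "d1": {"diameter_um": 3.0, "length_um": 140.0, "lab_notes": "internal"},
    "d2": {"diameter_um": 3.4, "length_um": 160.0, "lab_notes": "internal"},
}

# GAV-style mapping: each ontology class is defined as a view over the source,
# listing only the source columns that correspond to ontology properties.
CLASS_MAPPING = {"Dendrite": ["diameter_um", "length_um"]}

# Instance ids tracked locally for each class.
TRACKED_IDS = {"Dendrite": ["d1", "d2"]}


class LocalInstanceStore:
    """Fetch instances of a class from the remote source the first time they are requested."""

    def __init__(self, remote, mapping, tracked_ids):
        self.remote = remote
        self.mapping = mapping
        self.tracked_ids = tracked_ids
        self.local = {}        # class name -> {instance id -> mapped properties}
        self.fetch_count = 0   # number of times the remote source was consulted

    def instances_of(self, cls):
        if cls not in self.local:
            self.fetch_count += 1
            columns = self.mapping[cls]
            # Copy only the mapped properties; everything else stays remote.
            self.local[cls] = {
                iid: {col: self.remote[iid][col] for col in columns}
                for iid in self.tracked_ids[cls]
            }
        return self.local[cls]


store = LocalInstanceStore(REMOTE_STORE, CLASS_MAPPING, TRACKED_IDS)
print(store.instances_of("Dendrite"))
```

Unmapped properties such as `lab_notes` never reach the local store in this sketch, which mirrors the statement that other properties are available only by querying the data source directly.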
The Application Setting for this Demo
• The Ontology
  • Developed by the neuroscientists in our group
  • Describes the subcellular anatomy of the nervous system, including cell types and their subcellular properties and multicellular domains
  • The knowledge base was constructed as a directed graph using the open source tool Protégé (http://protege.stanford.edu), a freely available knowledge management tool written in Java.
  • The ontology is expressed in OWL-DL
    • Since OWL-DL supports description logic, inferences are made from the property rules
    • e.g., protein Kv3.2 is located in the plasma membrane; if an instance of axon terminal expresses Kv3.2, then it must have a plasma membrane.
• Data Sources
  • A Derby data store for literature-curated instances of subcellular anatomy (CKB)
  • A relational (MySQL) source containing experimental data from CCDB
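The Kv3.2 example can be read as a single rule: if x expresses m and m is located in c, then x has c. The sketch below applies that rule by naive forward chaining over triples until nothing new is derived. It is an illustration under assumptions, not the reasoner OntoQuest uses: the property names `expresses`, `located_in`, and `has`, the instance name `axon_terminal_1`, and the function `infer` are all hypothetical.

```python
def infer(triples):
    """Apply the rule  expresses(x, m) and located_in(m, c)  =>  has(x, c)  to a fixed point."""
    facts = set(triples)
    while True:
        derived = {
            (x, "has", c)
            for (x, p1, m) in facts if p1 == "expresses"
            for (m2, p2, c) in facts if p2 == "located_in" and m2 == m
        }
        new = derived - facts
        if not new:          # nothing new was derived, so we are done
            return facts
        facts |= new


asserted = {
    ("Kv3.2", "located_in", "plasma_membrane"),
    ("axon_terminal_1", "expresses", "Kv3.2"),
}
for fact in sorted(infer(asserted) - asserted):
    print(fact)  # ('axon_terminal_1', 'has', 'plasma_membrane')
```

With only one rule the loop settles after a single pass; the fixed-point structure matters once several rules can feed each other.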
Subcellular Ontology
[Figure: ontology graph. Legend: subclass, has-a. Node labels: Intercellular Junction; Multi-cellular Domain; Pinceau; Node of Ranvier; Extracellular Space; Glomerulus; Neuropil; Synaptic Cleft; Subcellular Space; Nerve Cell; Neuron; Glia; Microglia; Macroglia; Compartment; Dendrite; Axon; Cell body; Spine; Dendritic Spine; Shaft; Component; Post synaptic Component; PSD; SER; Actin Filament; Ribosome; Cytoplasm; Organelle; Cytoskeleton; Cilium; Specialization; Inclusion; Plasma Membrane; Molecule; Property; Orientation; Distribution; Morphometrics; Shape]
Demo Scenarios
Step 0: startup screen
Step 1: click to show subclass hierarchy by default
Step 1: other options for expanding different types of hierarchies e.g., the compartment types for Neuroepithelial_Cell and those for Neuron
Step 2: get the detailed info (instances and properties) of the subclass Dendrite of Neuron_Compartment
Step 2a: accessing the property values for the selected class
Step 2b: the CCDB image page corresponding to the selected instance Dendritic_Tree_1 is shown here
Step 2’: some concepts (like Cellular_Dependent_Continuant here) have properties but no instances in CKB
Step 3: right-clicking a concept in the hierarchy pops up a list of view functions to choose from
Step 4: aggregate the has_Component values of all Dendrite instances; the last row shows a statistics summary
You may also have noticed that instances of Dendrite include those of its subclasses (such as Dendrite_Tree)
Step 5: drill down to view instances of Dendrite_Tree and aggregate on several numeric property values
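Steps 4 and 5 rely on two ideas: a class's instances include those of its subclasses, and numeric property values can be summarized across them. A minimal sketch of both follows; the instance ids, the `length` property, its values, and the helper names `instances_of` and `summarize` are made up for illustration and are not data from CKB or CCDB.

```python
from statistics import mean

# Hypothetical class hierarchy and instance data (values are invented).
SUBCLASSES = {"Dendrite": ["Dendrite_Tree"], "Dendrite_Tree": []}
INSTANCES = {
    "Dendrite": [{"id": "Dendrite_A", "length": 100.0}],
    "Dendrite_Tree": [{"id": "Dendritic_Tree_1", "length": 200.0}],
}


def instances_of(cls):
    """Collect direct instances of `cls` plus instances of all its subclasses."""
    found = list(INSTANCES.get(cls, []))
    for sub in SUBCLASSES.get(cls, []):
        found.extend(instances_of(sub))
    return found


def summarize(cls, prop):
    """Return count, min, max and mean of a numeric property over all instances of `cls`."""
    values = [inst[prop] for inst in instances_of(cls) if prop in inst]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


print(summarize("Dendrite", "length"))
print(summarize("Dendrite_Tree", "length"))
```

Drilling down from Dendrite to Dendrite_Tree simply narrows the set of instances before the same summary is computed.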
SPARQL Query
Add an Instance
Edit Instance Properties
Ontology Store Properties
• What are the cellular components of a dendrite?
29 instances of dendrite
1. Microtubules
2. Mitochondria
3. Hypolemmal cisternae
4. Plasma membrane
5. Smooth endoplasmic reticulum
6. Rough endoplasmic reticulum
7. Polyribosomes
8. Neurofilaments
Average diameter = 3.2 µm
Average length = 150 µm
• How many dendrites does a Purkinje cell have?
3 instances of Purkinje cell dendritic tree
1. Avg branch order = 22
2. Number of primary dendrites = 1.3
3. Avg number of branches = 760
**Computes aggregate properties from instances
“Rules” for cellular assembly