AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

AlzPharm: an RDF Use Case for Semantic Web in Neuroscience Yale Center for Medical Informatics (YCMI)

Upload: petra

Post on 19-Jan-2016


DESCRIPTION

Yale Center for Medical Informatics (YCMI). AlzPharm: an RDF Use Case for Semantic Web in Neuroscience. SeS2006 Workshop, Beijing, China (Sept. 3, 2006). Authors: Hugo Y.K. Lam (CB&B Ph.D. Program), Kei Cheung (YCMI), Luis Marenco (YCMI), Perry Miller (YCMI), Nian Liu (YCMI)

TRANSCRIPT

Page 1: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Yale Center for Medical Informatics (YCMI)

Page 2: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

SeS2006 Workshop, Beijing, China (Sept. 3, 2006)

• Authors

– Hugo Y.K. Lam (CB&B Ph.D. Program)

– Kei Cheung (YCMI)

– Luis Marenco (YCMI)

– Perry Miller (YCMI)

– Nian Liu (YCMI)

– Chiquito Crasto (YCMI)

– Tim Clark (Mass. General Hospital, Harvard University)

– Yong Gao (Partners)

– June Kinoshita (AlzForum)

– Elizabeth Wu (AlzForum)

– Gwen Wong (AlzForum)

– Gordon Shepherd (Yale Neurobiology)

– Tom Morse (Yale Neurobiology)

– Susie Stephens (Oracle)

Page 3: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Overview

• Most of the neuroscience databases are neither integrated nor interoperating

• A domain ontology is insufficient for integrating neuroscience data spanning multiple domains

• We present a Semantic Web approach to building an e-Neuroscience data integration framework, which involves using RDF as a standard data model to facilitate representation and integration of data

Page 4: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

e-Neuroscience

• Involves developing tools, technologies, and infrastructure to support multidisciplinary and collaborative science enabled by the Internet

• Aims to address the data integration problem in neuroscience

• Fits the informatics-oriented goal of the Human Brain Project initiated by NIH

• Provides a better understanding of brain function by integrating different levels of brain data.

Page 5: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Current Issues

• Registry

– The keyword-based search approach suffers from problems of specificity and sensitivity

– A centralized approach to registering resources (e.g., NDG) may not be scalable

Page 6: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Current Issues (cont’d)

• No links between related databases

Page 7: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Semantic Web Approach

Representing and Integrating Data

Page 8: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Semantic Web

• Exposes the semantics of Web-accessible data in a standard machine-readable way so that the data can be more easily interpreted and integrated by computer programs (or Web agents)

• Components of the Semantic Web technologies:

– Ontology

– Ontological Languages

– Semantic-Web-aware Tools (e.g., databases)

Page 9: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Data Modeling

• Uses the Oracle RDF Data Model (which is installed on a Linux server) to build a Semantic Web data warehouse for integrating datasets extracted from two independently developed neuroscience databases:

– BrainPharm (a subdatabase of SenseLab)

– SWAN (Semantic Web Applications in Neuromedicine)
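
One reason RDF suits this kind of warehouse is that two datasets expressed as triples can be combined by a plain set union, with shared URIs acting as the join points. The sketch below illustrates only that idea. The `example.org` namespace, the property names and the records are hypothetical and are not taken from BrainPharm or SWAN.

```python
# Hypothetical namespace and property names, for illustration only.
EX = "http://example.org/"

# A triple is a (subject, predicate, object) tuple of strings.
brainpharm_triples = {
    (EX + "drug/D1", EX + "hasName", "Donepezil"),
    (EX + "drug/D1", EX + "treats", EX + "disease/AD"),
}
swan_triples = {
    (EX + "pub/P1", EX + "mentions", EX + "drug/D1"),
    (EX + "pub/P1", EX + "hasTitle", "A hypothetical title"),
}


def merge(*datasets):
    """Integrate datasets by taking the union of their triples."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged


def linked_to(triples, obj):
    """Return subjects that point at obj, whichever dataset they came from."""
    return sorted(s for s, p, o in triples if o == obj)


warehouse = merge(brainpharm_triples, swan_triples)
publications = linked_to(warehouse, EX + "drug/D1")
```

Because both datasets refer to the drug by the same URI, the publication from the second dataset is reachable from the drug record of the first without any schema mapping step.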

Page 10: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Data Modeling

• BrainPharm

– A database under development to support research on drugs for the treatment of different neurological disorders (http://senselab.med.yale.edu/senselab/BrainPharm/alzData.asp)

– Contains pharmacological agents that act on neuronal receptors and signal transduction pathways in the normal brain and in nervous disorders such as Alzheimer’s Disease (AD)

– Enables searches for drug actions at the level of key molecular constituents, cell compartments and individual cells

Page 11: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Data Modeling

• SWAN (http://swan.mindinformatics.org/)

– A project to develop knowledge management tools and resources for Alzheimer’s Disease (AD) researchers, based on an ecosystem model of scientific discourse

– Uses an upper ontology, including the following components: scientists, experiments, publications, bibliographic databases, research collaborations, scientific web communities, etc.

– Implemented using Semantic Web technology

– Represents data in RDF format

– Currently stores a subset of data obtained from the Alzheimer Research Forum (http://www.alzforum.org)

Page 12: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

Data Conversion & Loading

• The drug-related (chemical) information is extracted from BrainPharm

• The SWAN hypotheses and publications are extracted from Alzforum

• SWAN data are already available in RDF format

• BrainPharm exports data in its own XML format called EDSP (Electronic DataSet Protocol)

• Convert the EDSP/XML format into the corresponding RDF/XML format using XSL Transformation (XSLT)

• Load both the SWAN and BrainPharm RDF datasets into the Oracle RDF Data Model using its data loader program, which takes RDF data in N-Triples format
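
The conversion step can be sketched as follows. This is not the authors' XSLT stylesheet: it uses Python's `xml.etree.ElementTree` in place of XSLT and writes N-Triples lines directly, skipping the intermediate RDF/XML. The EDSP schema is not shown in the slides, so the element names (`dataset`, `drug`, `name`, `target`), the `id` attribute and the namespace are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the real EDSP element names are not given in the slides.
SAMPLE_XML = """
<dataset>
  <drug id="D1">
    <name>Donepezil</name>
    <target>acetylcholinesterase</target>
  </drug>
</dataset>
"""

BASE = "http://example.org/brainpharm/"  # hypothetical namespace


def escape_literal(text):
    """Escape the characters that must be escaped inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def xml_to_ntriples(xml_text):
    """Turn each child element of a <drug> into one N-Triples line."""
    root = ET.fromstring(xml_text.strip())
    lines = []
    for drug in root.findall("drug"):
        subject = f"<{BASE}drug/{drug.get('id')}>"
        for field in drug:
            predicate = f"<{BASE}{field.tag}>"
            value = escape_literal((field.text or "").strip())
            lines.append(f'{subject} {predicate} "{value}" .')
    return lines


ntriples = xml_to_ntriples(SAMPLE_XML)
```

Each resulting line is a complete statement (subject, predicate, object, terminating dot), which is why a line-oriented format such as N-Triples is convenient input for a bulk loader.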

Page 13: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Oracle has extended SQL to provide support for an RDF query language, which allows users to perform queries against multiple RDF datasets

• The following two examples illustrate how such queries can be made to retrieve and integrate data from BrainPharm and SWAN

Page 14: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example One

– Target

• Query BrainPharm to group and count AD drugs based on their molecular targets

– Result

• There are 2 groups of drugs for AD.

• The first one contains 5 drugs that act as acetylcholinesterase inhibitors.

• The second one contains only 1 drug that is an N-methyl-D-aspartic acid (NMDA) receptor antagonist.

Page 15: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example One

– Query
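
The query text for this slide is not preserved in the transcript. As a stand-in, the grouping and counting operation that Example One describes can be sketched in plain Python over a list of triples. This is not Oracle's RDF query syntax. The namespace, property names and drug identifiers are hypothetical placeholders, arranged so that the group sizes match the result reported for Example One.

```python
from collections import Counter

EX = "http://example.org/"  # hypothetical namespace

# Placeholder drug identifiers (not real BrainPharm records), arranged so
# that the group sizes match the result reported on the slide.
triples = [
    (EX + f"drug/{i}", EX + "hasTarget", "acetylcholinesterase inhibitor")
    for i in range(1, 6)
] + [
    (EX + "drug/6", EX + "hasTarget", "NMDA receptor antagonist"),
]
triples += [(EX + f"drug/{i}", EX + "treats", "AD") for i in range(1, 7)]


def count_by_target(triples, disease):
    """Group drugs that treat `disease` by target and count each group."""
    treating = {s for s, p, o in triples if p == EX + "treats" and o == disease}
    counts = Counter(
        o for s, p, o in triples
        if p == EX + "hasTarget" and s in treating
    )
    return dict(counts)


groups = count_by_target(triples, "AD")
```

The `Counter` plays the role that `COUNT` with `GROUP BY` plays in SQL: one bucket per distinct target, one increment per matching drug.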

Page 16: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example One

– Remarks

• Current implementations of RDF query languages (RQL) by specialized RDF stores do not support aggregate functions (e.g., COUNT, SUM and AVERAGE) via “GROUP BY”

• The Oracle RDF query language supports such functions, as it is a hybrid between RQL and SQL

Page 17: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example Two

– Target

• Retrieves the information (stored in BrainPharm) about the AD drug “Donepezil” and publications (stored in SWAN) whose titles or abstracts contain the term “Donepezil” (case-insensitive)

• Demonstrates the use of RDF inferencing based on the parent-child (is-a) relationship between the Publication class (e.g., original articles retrieved from PubMed) and the ARFPublication class (e.g., PubMed articles that have been commented on by researchers/curators associated with Alzforum) as defined in the SWAN RDF Schema

Page 18: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example Two

– Query (Partial)
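
The query text for this slide is not preserved in the transcript either. The retrieval part of Example Two (a drug record from one dataset plus a case-insensitive text match over publication titles and abstracts from the other) can be sketched in plain Python. The namespace, property names and all records below are hypothetical stand-ins, not BrainPharm or SWAN data, and the code is not Oracle's RDF query syntax.

```python
EX = "http://example.org/"  # hypothetical namespace

# Hypothetical records standing in for BrainPharm and SWAN triples.
triples = [
    (EX + "drug/D1", EX + "name", "Donepezil"),
    (EX + "pub/P1", EX + "title", "Effects of DONEPEZIL in a hypothetical cohort"),
    (EX + "pub/P1", EX + "abstract", "Placeholder abstract."),
    (EX + "pub/P2", EX + "title", "An unrelated placeholder title"),
    (EX + "pub/P2", EX + "abstract", "Mentions donepezil only in the abstract."),
    (EX + "pub/P3", EX + "title", "Another unrelated placeholder title"),
]

TEXT_PROPERTIES = {EX + "title", EX + "abstract"}


def publications_mentioning(triples, term):
    """Publications whose title or abstract contains term, ignoring case."""
    needle = term.casefold()
    return sorted({
        s for s, p, o in triples
        if p in TEXT_PROPERTIES and needle in o.casefold()
    })


def drug_info(triples, name):
    """All triples about the drug(s) whose name matches, ignoring case."""
    subjects = {
        s for s, p, o in triples
        if p == EX + "name" and o.casefold() == name.casefold()
    }
    return sorted(t for t in triples if t[0] in subjects)


hits = publications_mentioning(triples, "Donepezil")
info = drug_info(triples, "donepezil")
```

Collecting matches into a set before sorting ensures a publication is reported once even when the term appears in both its title and its abstract.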

Page 19: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example Two

– Result

• With the is-a inference rule incorporated into the query, it finds a total of 19 publications that are linked to claims and/or hypotheses concerning the effect of Donepezil on AD treatment

• One of these 19 publications belongs to the ARFPublication class (i.e., it is ARF-commented)

• Given the ID (PubMed ID) of the commented publication, the user can retrieve the detailed comments through the Alzforum Web site

Page 20: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

RDF Based Queries

• Example Two

– Remarks

• In the SWAN dataset, publications (e.g., those retrieved from PubMed) are treated as instances of the Publication class

• We define publications that have been commented on in Alzforum as instances of the ARFPublication class

• The Oracle RDF Data Model allows us to create rules for hierarchical relationships from the RDFS for the data, which lets us retrieve all instances of the Publication class and of its subclasses (e.g., ARFPublication)
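
The is-a rule described here corresponds to the standard RDFS subclass inference: if a resource is typed as class C and C is a subclass of D, the resource is also an instance of D. A minimal Python sketch of that rule follows. The class names come from the slide. The instance identifiers are hypothetical, and this is not how the Oracle rulebase is implemented.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Class names follow the slide; the instance identifiers are hypothetical.
triples = {
    ("ARFPublication", SUBCLASS_OF, "Publication"),
    ("pub:1", RDF_TYPE, "Publication"),
    ("pub:2", RDF_TYPE, "ARFPublication"),
}


def superclasses(triples, cls):
    """Return cls together with all of its direct and indirect superclasses."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def instances_of(triples, cls):
    """Instances typed as cls or as any subclass of cls (the is-a rule)."""
    return sorted(
        s for s, p, o in triples
        if p == RDF_TYPE and cls in superclasses(triples, o)
    )


all_pubs = instances_of(triples, "Publication")     # pub:2 is found by inference
arf_only = instances_of(triples, "ARFPublication")  # only the commented one
```

Without the inference step, a query for Publication instances would miss `pub:2`, which is typed only as ARFPublication.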

Page 21: AlzPharm: an RDF Use Case for Semantic Web in Neuroscience

[Diagram labels: AlzPharm; Oracle/RDF; SWAN; BrainPharm; AlzForum]