Ontology for Clinical Investigations (OCI): Representation of clinical research data in the
framework of a formal biomedical investigation ontology
Richard H. Scheuermann
U.T. Southwestern Medical Center
Outline
• Motivation - CTSA
• Ontologies and OBO Foundry
• Ontology for Biomedical Investigations (OBI)
• Ontology for Clinical Investigations (OCI)
– Approach
– Current status
– Future direction
Clinical and Translational Science Award (CTSA)
Implementing biomedical discoveries made in the last 10 years demands an evolution of clinical science.
New prevention strategies and treatments must be developed, tested, and brought into medical practice more rapidly.
CTSA awards will lower barriers between disciplines, and encourage creative, innovative approaches to solve complex medical problems.
These clinical and translational science awards will catalyze change -- breaking silos, breaking barriers, and breaking conventions.
[Figure: the CTSA HOME and its components: Trial Design; Advanced Degree-Granting Programs; Participant & Community Involvement; Regulatory Support; Biostatistics; Clinical Resources; Biomedical Informatics; Clinical Research Ethics. External partners: NIH & other government agencies; Healthcare organizations; Industry.]
Each academic health center will create a home for clinical and translational science
Building a National CTSA Consortium
• Data management - to develop a comprehensive controlled information system infrastructure to capture and manage clinical and translational research data
• Data integration - to integrate clinical and translational research data with data and knowledge from external public database resources
• Data analysis - to support clinical and translational research data analysis by providing state-of-the-art software analytical tools
• Support - to provide training and support for CRIS use
Clinical Research Information System - utCRIS
High Level Design Vision
[Figure: utCRIS architecture diagram. Labels shown:
• Users: UTSW Researchers; External Collaborators; Virtual Web Community
• External Bioinformatics Data Sources: Entrez Gene; Uniprot; dbSNP; GEO/Array
• Interfaces: XML Feeds; Web Forms; HL7 IE; ETL
• Functions: Proposal Development & Tracking; Trial Recruiting; Protocol Management; Clinical Trials Management; CRF Development; Biostatistics; Data Mining; Reporting
• Data: Clinical Research Data Warehouse (utCR-DW); CTMS Data; Clinical Data; Reference Data; Experiment Data
• Sources behind Security boundaries: Patient Registries; Tissue Data Banks; Private Clinical Data; PACS; Other Clinical Data]
Requirements
• Accurate Representation
– therapeutic drug as a design variable vs. medical history
– DNA as a therapeutic agent vs. analysis specimen
• Interoperability
– unambiguous data exchange between research sites
– effective data exchange between software applications
• Customization
– support of study-specific details
• Dynamics
– Role changes throughout and between studies
• Inference
– Semantic queries (e.g. patients with autoimmune disease)
• Meta-analysis
– Studies with common features (e.g. all studies where flu vaccine was evaluated as a conditional variable)
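The accurate-representation, dynamics, and meta-analysis requirements above can be illustrated with a minimal Python sketch in which a role is recorded per study rather than fixed on the entity itself. This is only one possible reading of those requirements: the names `RoleAssignment`, `studies_where`, and `roles_of`, the study identifiers, and the example records are all hypothetical and are not taken from OCI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    """One entity playing one role within one study (hypothetical model)."""
    study: str
    entity: str
    role: str


# Illustrative records only; the studies are invented for this sketch.
ASSIGNMENTS = [
    RoleAssignment("study_A", "flu vaccine", "conditional variable"),
    RoleAssignment("study_B", "flu vaccine", "medical history"),
    RoleAssignment("study_C", "flu vaccine", "conditional variable"),
    RoleAssignment("study_C", "DNA", "analysis specimen"),
    RoleAssignment("study_D", "DNA", "therapeutic agent"),
]


def studies_where(entity, role, assignments=ASSIGNMENTS):
    """Return the sorted study identifiers in which `entity` plays `role`."""
    return sorted({a.study for a in assignments
                   if a.entity == entity and a.role == role})


def roles_of(entity, assignments=ASSIGNMENTS):
    """Return every distinct role the entity plays across all studies."""
    return sorted({a.role for a in assignments if a.entity == entity})


if __name__ == "__main__":
    # Meta-analysis style question: where was flu vaccine a conditional variable?
    print(studies_where("flu vaccine", "conditional variable"))
    # The same kind of entity can carry different roles in different studies.
    print(roles_of("DNA"))
```

Keeping the role on the assignment, not on the entity, is what lets the same drug or specimen type appear as a design variable in one study and as medical history in another without contradiction.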
Constraints
• Essential to build upon and extend, or map to, existing and emerging data standards (e.g. HL7, CDISC, ICD, UMLS, Epoch, RCT Schema, NCI Thesaurus, SNOMED-CT, etc.)
• Recognize the difference between Health IT and Research IT
• Support wide variety of different clinical and translational study types - reduce complexity by modeling commonalities
• Support needs of multiple stakeholders - different uses of same data
• Standards should be easy to implement and use
• Standards need to be easily and logically extensible
• Support clinical research data use cases
Need for standard representations
• Minimum information sets
• Standard vocabularies/ontologies
• Standard data models
Definition of “Ontology”
Philosophical
• “The study of that which exists” (ISMB 2005)
• “The science of what is: of the kinds and structures of the objects, and their properties and relations in every area of reality” (ISMB 2005)
Information/computer scientists
• “A shared, common, backbone taxonomy of relevant entities, and the relationships between them, within an application domain” (ISMB 2005)
• “A computable representation of biological reality” (ISMB 2005)
• “A structured vocabulary”
• “A formal way of representing knowledge in which concepts are described both by their meaning and their relationship to each other” (Bard 2004)
• “A data model that represents a domain and is used to reason about the objects in that domain and the relations between them” (Wikipedia)
Ontology Goals
• Provide clear thinking about how to structure information
• Support data integration, modeling, query processing, user interface development, data exchange/export
• Enforce data correctness
• Map to database management systems
• Enable a computer to reason over the data
• Provide the capability to infer relationships that have not been explicitly defined
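The goal of inferring relationships that have not been explicitly defined can be sketched in Python by following is_a links transitively. This is an assumption-laden illustration, not a description of how any Foundry ontology or reasoner is implemented: the small hierarchy, the patient records, and the function names `ancestors` and `patients_with` are hypothetical.

```python
# Hypothetical is_a edges (child -> parents); a toy hierarchy for illustration.
IS_A = {
    "rheumatoid arthritis": ["autoimmune disease"],
    "type 1 diabetes mellitus": ["autoimmune disease"],
    "autoimmune disease": ["immune system disease"],
    "immune system disease": ["disease"],
    "influenza": ["infectious disease"],
    "infectious disease": ["disease"],
}

# Invented patient records, each annotated with its most specific diagnosis.
PATIENT_DIAGNOSES = {
    "patient_1": "rheumatoid arthritis",
    "patient_2": "influenza",
    "patient_3": "type 1 diabetes mellitus",
}


def ancestors(term, is_a=IS_A):
    """Return every class reachable from `term` by following is_a transitively."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def patients_with(disease_class, diagnoses=PATIENT_DIAGNOSES):
    """Return patients whose diagnosis is `disease_class` or any subclass of it."""
    return sorted(p for p, dx in diagnoses.items()
                  if dx == disease_class or disease_class in ancestors(dx))


if __name__ == "__main__":
    # No record says "autoimmune disease" directly; the match is inferred.
    print(patients_with("autoimmune disease"))
```

No patient record mentions "autoimmune disease" by name, yet the query still finds the relevant patients because the relationship is derived from the hierarchy, which is the kind of semantic query that a flat vocabulary cannot answer.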
Problems with existing ontologies
• Overlapping domains
• Development within a vacuum
• Interoperability – ontologies should be able to work together and be used by other ontologies
• Current ontologies do not deal well with time and space
• Lack of well-defined relationships
• Lack of widespread use and acceptance
• Built based on varying principles
Defining ontology principles: The OBO Foundry - 2006
The OBO Foundry is a set of interoperable ontologies that adhere to a growing set of principles for best practices in ontology development.
The OBO Foundry
a voluntary initiative of developers of consensus biomedical ontologies
designed to be interoperable, logically coherent, biologically accurate and subject to update in light of scientific advance
The OBO Foundry
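Initial OBO Foundry Ontologies, building out from the original GO
[Table: ontologies arranged by GRANULARITY (rows) and RELATION TO TIME (columns: CONTINUANT, split into INDEPENDENT and DEPENDENT, and OCCURRENT)]
ORGAN AND ORGANISM
– Independent continuant: Organism (NCBI Taxonomy / placeholder); Anatomical Entity (FMA, CARO)
– Dependent continuant: Organ Function (placeholder); Phenotypic Quality (PaTO)
– Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
– Independent continuant: Cell (CL); Cellular Component (FMA, GO)
– Dependent continuant: Cellular Function (GO)
MOLECULE
– Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– Dependent continuant: Molecular Function (GO)
– Occurrent: Molecular Process (GO)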
Mature OBO Foundry ontologies (now undergoing reform)
• Cell Ontology (CL)
• Chemical Entities of Biological Interest (ChEBI)
• Foundational Model of Anatomy (FMA)
• Gene Ontology (GO)
• Phenotypic Quality Ontology (PaTO)
• Relation Ontology (RO)
• Sequence Ontology (SO)
Ontologies being built to satisfy Foundry principles ab initio
• Common Anatomy Reference Ontology (CARO)
• Environment Ontology (EnvO / GEO)
• Ontology for Biomedical Investigations (OBI)
• Ontology for Clinical Investigations (OCI, part of OBI)
• Protein Ontology (PRO)
• RNA Ontology (RnaO)
Foundry ontologies all work in the same way
– we have data
– we need to make this data available for semantic search and algorithmic processing
– we create a consensus-based ontology for annotating the data
– and ensure that it can interoperate with Foundry ontologies for neighboring domains
OBO Foundry provides a suite of basic science Reference Ontologies designed to serve as modules for re-use in Application Ontologies such as:
• Infectious Disease Ontology
• Immunology Ontology
• Multiple Sclerosis Ontology
• Mammalian Adult Neurogenesis Ontology
Ontology for BioMedical Investigations (OBI)
(previously FuGO)
On behalf of the
OBI Coordination Committee
OBI - Overview
International collaboration (since 2006)
• Communities developing ontologies/terminologies
- Unambiguous description of how the investigation was performed
- Consistent annotation, powerful queries and data integration
Describe the laboratory workflow
• Set of universal terms
- Investigation (organization, intent, design etc)
- Material (biological and chemical, manipulation and transformation)
- Protocols and instrumentations
- Data generated and types of analysis performed on it
• Set of biological and technological domain-specific terms
- To meet the annotation requirements of any given community
Part of the Open Biomedical Ontology (OBO) Foundry
• Orthogonality and x-referencing with existing bio-ontologies
• 'Interoperable by construction' with those under the Foundry
- Including Unit, Quality (PATO), Environment and Chemical (ChEBI) ontologies