Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell, M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel ICBO 2011

Page 1: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned

C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell, M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel

ICBO 2011

Page 2: Outline

eagle-i project
– Aims
– Ontology role

eagle-i ontology
– Requirements
– Implementation

Implementation choices

Challenges

Page 3: eagle-i

NIH-funded pilot project working to make scientific resources more visible via a federated network of nine institutional repositories

Index invisible resources
– reagents, protocols, techniques, instruments, expertise, organisms, software, training, human studies, biological specimens, etc.

Ontology-driven approach to research resource annotation and discovery

Facilitate development of shared semantic entities that can be referenced in publications, databases, experiments, etc.

Page 4: Ontology development drivers

1) Represent collected resource information

2) Use the set of ontologies to control the user interface (UI) and logic of the data collection and search applications

3) Build a set of ontologies that are reusable and interoperable with other ontologies and existing efforts for representing biomedical entities

Page 5: Ontology role in eagle-i architecture

[Architecture diagram. Labeled components: eagle-i ontologies; Federated Network; Repositories (RDF); Search Application; Data Collection Application; Resource information collection; external sources NIF, PubMed, Entrez Gene]

Page 6: Implementation

Ontology/Method | Scope/Purpose
Basic Formal Ontology (BFO) | Upper ontology
Information Artifact Ontology (IAO) | Ontology metadata
Relation Ontology (RO) | Common properties
Minimum Information to Represent External Ontology Terms (MIREOT) | Reuse classes and properties from external ontologies
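
As a rough illustration of what MIREOT asks for, the sketch below records the minimum information for one externally sourced class: the source ontology URI, the source term URI, and the URI of the direct superclass it is placed under in the importing ontology. The `MireotTerm` class and its `to_triples` helper are hypothetical names, and the placement of 'organization' under BFO 'material entity' is an assumption made for the example, not something taken from the eagle-i files.

```python
from dataclasses import dataclass

OBO = "http://purl.obolibrary.org/obo/"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# IAO 'imported from' annotation property, used here to record the source ontology.
IAO_IMPORTED_FROM = OBO + "IAO_0000412"


@dataclass(frozen=True)
class MireotTerm:
    """Minimum information for one externally sourced class (hypothetical record type)."""
    source_ontology: str  # URI of the ontology the term comes from
    source_term: str      # URI of the term itself
    target_parent: str    # URI of the direct superclass in the importing ontology
    label: str = ""       # optional extra, not part of the minimum set

    def to_triples(self):
        """Return (subject, predicate, object) tuples describing the imported term."""
        triples = [
            (self.source_term, RDFS_SUBCLASS_OF, self.target_parent),
            (self.source_term, IAO_IMPORTED_FROM, self.source_ontology),
        ]
        if self.label:
            triples.append((self.source_term, RDFS_LABEL, self.label))
        return triples


organization = MireotTerm(
    source_ontology=OBO + "obi.owl",
    source_term=OBO + "OBI_0000245",
    target_parent=OBO + "BFO_0000040",  # assumed placement under 'material entity'
    label="organization",
)

for triple in organization.to_triples():
    print(triple)
```

Keeping only these few facts per term is what lets an application ontology reference an external class without importing the whole source ontology.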

Page 7: Ontology layers

Goal: to decouple the representation of research resources from the information used for application appearance and behavior

Application-specific module
– Classes, annotation properties and individuals required to drive the UIs

eagle-i core ontology
– Classes and properties used to represent information about biomedical research resources

MIREOT files
– Externally sourced classes and properties

Page 8: eagle-i core and MIREOTed sources

eagle-i core ontology: 1283 classes, 56 object properties, and 61 data properties.

External ontology | Purpose/subsets | Classes
Ontology for Biomedical Investigations (OBI) | research material entities, processes, devices, roles | 509
NCBI Taxonomy | organism taxa | 192
VIVO ontology | people, organizations, publications | 20
Ontology of Clinical Research (OCRe) | human study designs and facets | 19
Biomedical Resource Ontology (BRO) | instruments | 13

Page 9: Application-specific module

Contains properties and classes required to drive the UIs of the data collection and search applications

UI Annotation Definition file
– Definition of UI annotation properties and sets of values for these properties

UI Annotations file
– Holds annotations made on eagle-i core and MIREOTed classes and properties

Page 10: Examples of annotation values and use

Label | Description | Example
Primary Resource Type | Denotes classes for which instances are collected | ‘instrument’, ‘biospecimen’, ‘protocol’
Data Model Exclude | Denotes classes or properties that are not included in the model used for the data tool or the search tool UIs | BFO classes such as ‘continuant’ or ‘occurrent’, or RO relations such as ‘precedes’
Embedded Class | Denotes a class for which instances can only be created in the context of an embedding class | ‘antibody immunogen’ created within ‘antibody’, ‘construct insert’ created within ‘plasmid’
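
The three annotation values above could drive application logic roughly as in the following sketch. It is a toy model, not the eagle-i implementation: the dictionary layout and the function names `primary_resource_types`, `visible_classes` and `can_create` are assumptions, and only the class labels and annotation names come from the table.

```python
PRIMARY = "Primary Resource Type"
EXCLUDE = "Data Model Exclude"
EMBEDDED = "Embedded Class"

# Hypothetical UI annotations, keyed by class label for readability.
UI_ANNOTATIONS = {
    "instrument": {PRIMARY},
    "biospecimen": {PRIMARY},
    "protocol": {PRIMARY},
    "continuant": {EXCLUDE},
    "occurrent": {EXCLUDE},
    "antibody": set(),
    "antibody immunogen": {EMBEDDED},
    "plasmid": set(),
    "construct insert": {EMBEDDED},
}

# Which class an embedded class must be created within.
EMBEDDING_CLASS = {
    "antibody immunogen": "antibody",
    "construct insert": "plasmid",
}


def primary_resource_types(annotations):
    """Classes whose instances are collected directly."""
    return sorted(c for c, tags in annotations.items() if PRIMARY in tags)


def visible_classes(annotations):
    """Classes that remain in the model shown by the UIs."""
    return sorted(c for c, tags in annotations.items() if EXCLUDE not in tags)


def can_create(cls, context=None, annotations=UI_ANNOTATIONS):
    """Excluded classes are never created; embedded ones need their embedding class."""
    tags = annotations.get(cls, set())
    if EXCLUDE in tags:
        return False
    if EMBEDDED in tags:
        return context is not None and EMBEDDING_CLASS.get(cls) == context
    return True


print(primary_resource_types(UI_ANNOTATIONS))
print(can_create("construct insert", context="plasmid"))
```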

Page 11: Additional application-specific properties

Property Label | Description | Example | Property Type
eagle-i domain constraint | Used to specify the domain of an imported property. Each annotation will contain the URI of one class | Value set to “OBI_0000245” (‘organization’) for RO property ‘location_of’ | Data Property
eagle-i range constraint | Used to specify the range of an imported property. Each annotation will contain the URI of one class | Value set to “ERO_0000004” (‘instrument’) for RO property ‘located_in’ | Data Property
eagle-i preferred label | Defines the value of the preferred label to display in the data collection tool and search UIs | Capitalized ‘Organization’ for OBI_0000245 (‘organization’) | Annotation Property
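
One way to read these properties is as overrides that the application consults before falling back to what the imported term declares. The sketch below assumes that reading; the `Term` record and the three helper functions are hypothetical, and the RO properties are identified by their labels only, because the slides do not give their URIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

DOMAIN_CONSTRAINT = "eagle-i domain constraint"
RANGE_CONSTRAINT = "eagle-i range constraint"
PREFERRED_LABEL = "eagle-i preferred label"


@dataclass
class Term:
    """An imported class or property plus its application-specific annotations."""
    identifier: str
    label: str
    declared_domain: Optional[str] = None
    declared_range: Optional[str] = None
    annotations: Dict[str, str] = field(default_factory=dict)


def effective_domain(prop):
    """Prefer the application-level constraint over the declared domain."""
    return prop.annotations.get(DOMAIN_CONSTRAINT, prop.declared_domain)


def effective_range(prop):
    """Prefer the application-level constraint over the declared range."""
    return prop.annotations.get(RANGE_CONSTRAINT, prop.declared_range)


def display_label(term):
    """Use the preferred label when one is annotated, else the ontology label."""
    return term.annotations.get(PREFERRED_LABEL, term.label)


# Values taken from the examples in the table; RO itself leaves these open.
location_of = Term("location_of", "location_of",
                   annotations={DOMAIN_CONSTRAINT: "OBI_0000245"})
located_in = Term("located_in", "located_in",
                  annotations={RANGE_CONSTRAINT: "ERO_0000004"})
organization = Term("OBI_0000245", "organization",
                    annotations={PREFERRED_LABEL: "Organization"})

print(effective_domain(location_of), effective_range(located_in),
      display_label(organization))
```

Because the constraints live in annotations rather than in `rdfs:domain` or `rdfs:range` axioms, the imported properties themselves stay unchanged for other users of RO.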

Page 12: Data Collection Application

[Screenshot of the data collection application, with callouts:]

– Classes annotated with ‘primary resource type’
– Construct insert is an example of a resource annotated as an ‘embedded class’
– ‘eagle-i preferred definition’ is used for tooltips
– ‘eagle-i preferred label’ is used for the display name
– Property annotated as ‘primary property’
– Technique is annotated as ‘referenced taxonomy’

Page 13: Challenges and benefits

Reuse of existing ontologies

Ontology layers

Application-specific module

Community coordination and alignment

Best practices and tools

Page 14: Reuse of existing ontologies

BFO and the Relation Ontology (RO)

OBO Foundry orthogonality principle

Advantages
– Integration with other ontologies
– Eases the design process
– Data integration and publication (Linked Open Data)

Challenges
– Need to exclude some classes (continuant, occurrent) from UI visualization after the inferred module has been computed
– Domain and range in RO are not specified, or not specific enough for an application
– Not all relevant ontologies are built using BFO and RO
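
The first challenge, hiding upper-level classes once the inferred hierarchy exists, can be sketched as a pruning step that reattaches each remaining class to its nearest visible ancestor. The hierarchy below is a toy one (the placement of ‘instrument’ and ‘technique’ is assumed for illustration) and `prune` is a hypothetical helper, not part of the eagle-i code.

```python
# Toy inferred hierarchy: child -> parent (None marks the root).
PARENT_OF = {
    "entity": None,
    "continuant": "entity",
    "occurrent": "entity",
    "material entity": "continuant",
    "process": "occurrent",
    "instrument": "material entity",  # assumed placement
    "technique": "process",           # assumed placement
}

EXCLUDED = {"continuant", "occurrent"}


def prune(parent_of, excluded):
    """Drop excluded classes and reattach children to the nearest visible ancestor."""
    def visible_ancestor(cls):
        parent = parent_of[cls]
        while parent is not None and parent in excluded:
            parent = parent_of[parent]
        return parent

    return {cls: visible_ancestor(cls) for cls in parent_of if cls not in excluded}


for cls, parent in prune(PARENT_OF, EXCLUDED).items():
    print(f"{cls} -> {parent}")
```

Pruning after inference, rather than before, keeps the reasoning over the full BFO-based hierarchy intact while the UI sees only the classes meant for users.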

Page 15: Ontology layers

Advantages
– Effective means to drive an application UI while maintaining interoperability with external ontologies and data sources
– Facilitates parallel, concurrent development

Challenges
– Keeping the annotations current with the core module
– Risk of excessive proliferation of annotation properties as a quick way to simplify application development
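
Keeping the annotation layer in step with the core module is essentially a consistency check between two sets of identifiers. A minimal sketch of such a check follows; the function name `audit_annotations` is hypothetical, and `ERO_9999999` is a made-up identifier standing in for a class that was removed from the core.

```python
def audit_annotations(core_ids, annotated_ids):
    """Compare the core module with the annotation file.

    'stale' lists annotated identifiers that no longer exist in the core;
    'unannotated' lists core identifiers that carry no UI annotation yet.
    """
    core = set(core_ids)
    annotated = set(annotated_ids)
    return {
        "stale": sorted(annotated - core),
        "unannotated": sorted(core - annotated),
    }


core_module = ["ERO_0000004", "OBI_0000245"]
ui_annotations = ["ERO_0000004", "ERO_9999999"]  # second entry is made up

print(audit_annotations(core_module, ui_annotations))
```

A check like this could run whenever either layer changes, which is one way to support the parallel development listed among the advantages.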

Page 16: Application-specific module

Requirements for bridging the gap between an application and domain-specific ontologies
– Application-specific labels and definitions
– Exclusion of sets of classes and properties from the model used by the application
– Restriction of domain and range for some imported properties
– Definition of display order of object and data properties at class level
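
The last requirement, a display order defined per class, could be met with a rank stored alongside each class and property pair. The sketch below assumes that design; the property names, the ranks and the `ordered_properties` function are all invented for the example.

```python
# Hypothetical display ranks, stored per class (lower rank is shown first).
DISPLAY_ORDER = {
    "antibody": {"target antigen": 1, "host organism": 2, "manufacturer": 3},
}


def ordered_properties(cls, properties, display_order=DISPLAY_ORDER):
    """Sort by annotated rank; unranked properties follow in alphabetical order."""
    ranks = display_order.get(cls, {})
    return sorted(properties, key=lambda p: (ranks.get(p, float("inf")), p))


fields = ["manufacturer", "comment", "host organism", "target antigen", "catalog number"]
print(ordered_properties("antibody", fields))
```

Storing the rank at class level means the same property can appear in a different position on the form for another class.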

Page 17: Community coordination

Commitment to collaboration with similar efforts aimed at resource modeling
– Aligned high-level models with NIF, RDS, VIVO
– Service, instrument (device) implemented in OBI and reused by NIF and eagle-i
– Coordinated representation of reagents, biospecimens, and genotype information (in progress)

Challenges
– The process is time-consuming and requires extra implementation effort
  • Implement in, and import back from, reference ontologies
– Application ontologies have peculiar requirements
  • Example: the service hierarchy in eagle-i is based on the type of process rather than on the input and output of the process (OBI)

Page 18: Best practices and tools

Reusing/referencing existing ontologies
– Ontofox, OWL module extractor, NCBO extractor service

Have tools integrated in ontology editors (Protégé)
– Effective methods for managing and syncing MIREOTed terms

Have several “community views” or “slims” that could be directly imported, with different levels of complexity

Page 19: Conclusion

Developing an ontology-driven application has been an important benchmark for usage of biomedical ontologies

We have designed a layered set of ontologies, consisting of a broadly applicable core ontology and an application-specific module
– Requirements and principles to inform a general design pattern

Future steps
– Refining, documenting and sharing requirements and lessons learned
– Engaging in efforts addressing the issues we have experienced

Page 20: Thank you

eagle-i core module: http://code.google.com/p/eagle-i/

eagle-i search: http://eagle-i.net

username: ohsu-guest password: eagle-i-ohsu

Carlo [email protected]

Acknowledgments: Ted Bashor, Rob Frost, Larry Stone and Daniela Bourges

Project funded through NIH/NCRR ARRA award #U24RR029825