Post on 20-Sep-2020

www.eagle-i.org

eagle-i: a national network of biomedical research resources

Cambridge Semantic Web Meetup, June 2011

Daniela Bourges-Waldegg eagle-i system architect, on behalf of the eagle-i Consortium

Outline

Introduction and motivation

•  The eagle-i consortium and network

•  Why eagle-i?

The eagle-i architecture and software stack

•  Layered ontology model

•  Ontology-driven development

Challenges of producing and consuming linked data

Concluding remarks

eagle-i Consortium: a national network

9 institutions diverse in geography, culture and resources

Why eagle-i? the problem

Researcher A: Starting new project

needs:

1.  Expertise (technical skill set) ✔

2.  Knowledge (understanding of domain) ✔

3.  Material Resources (plasmids, antibodies, organisms, equipment, services…)

•  Create
   •  start now
   •  control quality
   •  time
   •  money

•  Purchase
   •  fast and easy
   •  costly
   •  may not be available

•  Borrow/Collaborate
   •  free
   •  faster than remaking
   •  collaborative
   •  uncertainty

Why eagle-i? the problem

Researcher B: Finishing a project

Has produced:

1.  Expertise

2.  Knowledge

3.  Material Resources

Next Project, Publications

1.  Deep Freeze
   •  always have it
   •  never know where to find it

2.  Toss
   •  reduce clutter
   •  save on space and energy
   •  gone for good; may need it again

3.  Organize
   •  always have it
   •  always find it
   •  easily share/collaborate
   •  save time and money in long run
   •  takes time in the short run

Why eagle-i? the problem

1.  Deep Freeze
   •  always have it
   •  never know where to find it

2.  Toss
   •  reduce clutter
   •  save on space and energy
   •  gone for good; may need it again

3.  Organize
   •  always have it
   •  always find it
   •  easily share/collaborate
   •  save time and money in long run
   •  takes time in the short run

1.  Create
   •  start now
   •  control quality
   •  time
   •  money

2.  Purchase
   •  fast and easy
   •  costly
   •  may not be available

3.  Borrow/Collaborate
   •  free
   •  faster than remaking
   •  collaborative
   •  uncertainty

The goal of eagle-i

Provide a mechanism that connects researchers who need resources with researchers who have them.

Reduce redundancy in resource development.

Connect researchers with resources they do not yet know they need.

The eagle-i architecture

[Diagram components: eagle-i ontology; Search Application; Federated Network (SPIN); Repository (RDF); Data Tools; JSU Data Center; NIF, PubMed, Entrez Gene, etc.]

eagle-i design principles

Ontology-centric architecture

•  Data collection and search user interfaces are driven by the ontology

•  Repository performs certain types of ontology-based reasoning

•  ETL components transform data into ontology-conformant instances

Why?

•  Applications can seamlessly adapt to ontology evolution without code changes

Data is stored as RDF and follows Linked Open Data principles

•  Query any eagle-i repository via a SPARQL endpoint

•  All eagle-i resource instances are linkable (an instance is simply a URI)

Why?

•  Storage model best adapted to ontology-conformant data

•  Flexibility, extensibility
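As an illustration of querying a repository through a SPARQL endpoint, here is a minimal Python sketch that builds a SELECT query for instances of one ontology class and encodes it as a GET request with a `query` parameter, as the SPARQL Protocol allows. The endpoint URL and the class URI are placeholders, not real eagle-i addresses, and the helper names are made up for this example.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; each real repository publishes its own address.
ENDPOINT = "https://repository.example.org/sparql"


def build_query(type_uri: str, limit: int = 10) -> str:
    """Return a SPARQL SELECT listing labelled instances of one class."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?instance ?label WHERE {\n"
        f"  ?instance a <{type_uri}> ;\n"
        "            rdfs:label ?label .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request using the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Hypothetical class URI, used purely for illustration.
    q = build_query("http://example.org/ontology/Antibody", limit=5)
    print(build_request_url(ENDPOINT, q))
```

Because every instance is a URI, each `?instance` value returned by such a query can itself be dereferenced or linked to from other datasets.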

The eagle-i software stack

[Diagram components: Data collection clients; Data tools (Data collection webapp (GWT), Data management webapp (GWT), ETL); Search Application (Lucene, Search UI (GWT)); REST API; Sesame RDF store]

[Diagram components, eagle-i ontology: Application-specific Ontologies (eagle-i-app-dataTools.owl, eagle-i-app.owl); Ontology Memory Model (EIOntModel API, Jena/Pellet); Domain Ontologies (ero.owl, mesh-diseases.owl, ro.owl, iao.owl, Bfo.owl, etc…)]

eagle-i data collection tool

•  Type browser: allows navigation of an ontology branch

•  eagle-i primary types

•  Object property: ontology term

•  Object property: instance list

•  Embedded instance

•  Required property

•  Datatype property
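The field kinds called out above suggest how a form can be derived from ontology metadata rather than hard-coded. The Python sketch below is an assumption-laden illustration, not the eagle-i implementation: `PropertyDef`, `FormField`, the widget names and the sample property labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PropertyDef:
    """Hypothetical summary of one ontology property as seen by the UI."""
    label: str
    kind: str                 # "datatype" or "object"
    range_kind: str = "term"  # object properties: "term", "instance", "embedded"
    required: bool = False


@dataclass(frozen=True)
class FormField:
    label: str
    widget: str
    required: bool


def field_for(prop: PropertyDef) -> FormField:
    """Pick a widget from the property's kind; widget names are made up."""
    if prop.kind == "datatype":
        widget = "text-input"
    elif prop.range_kind == "term":
        widget = "ontology-term-picker"
    elif prop.range_kind == "instance":
        widget = "instance-list"
    elif prop.range_kind == "embedded":
        widget = "embedded-form"
    else:
        raise ValueError(f"unknown range kind: {prop.range_kind!r}")
    return FormField(prop.label, widget, prop.required)


def build_form(props: Iterable[PropertyDef]) -> List[FormField]:
    """Derive the whole form from property definitions, in order."""
    return [field_for(p) for p in props]
```

With this shape, adding a property to the ontology adds a field to the form without touching the form code, which is the point of driving the UI from the ontology.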

eagle-i data collection tool

Workflow support

eagle-i search

Faceted search

Autocomplete from instances and ontology

eagle-i search

Instance pages with materialized properties

Layered ontology model

Modeling dichotomy

•  eagle-i ontology is a domain model aimed at capturing biological knowledge

•  Application needs a model from which to derive behavior

Complexity

•  eagle-i ontology is interoperable; it builds on an upper ontology and imports numerous terms

•  Not all ontology constructs translate into user-level constructs

Layered ontology model

•  Application ontologies annotate domain ontologies with application-specific information and restrictions

Example

[Diagram: class hierarchy with the nodes Thing, Entity, Occurrent, Processual entity, Planned process, Research Project, Human Study, Epidemiological study, Qualitative human study, Quantitative human study and GWAS, annotated with Property 1 and Property 2]
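One way to picture the layering is a domain hierarchy that is never edited, plus a separate table of application annotations keyed by class. The Python sketch below assumes that annotations are inherited down the hierarchy and that more specific classes override more general ones; that rule, the annotation names (`hidden_in_ui`, `primary_type`) and the placement of Human Study and GWAS are assumptions for illustration only.

```python
from typing import Dict, Optional

# Domain layer: class -> parent class (tiny excerpt; the placement of
# "Human Study" and "GWAS" here is assumed, not taken from the ontology).
DOMAIN_PARENTS: Dict[str, str] = {
    "Entity": "Thing",
    "Occurrent": "Entity",
    "Processual entity": "Occurrent",
    "Planned process": "Processual entity",
    "Human Study": "Planned process",
    "GWAS": "Human Study",
}

# Application layer: annotations attached to domain classes from outside.
# Both annotation names are hypothetical.
APP_ANNOTATIONS: Dict[str, Dict[str, bool]] = {
    "Occurrent": {"hidden_in_ui": True},
    "Human Study": {"hidden_in_ui": False, "primary_type": True},
}


def effective_annotations(cls: str) -> Dict[str, bool]:
    """Merge annotations from the root down, so specific classes win."""
    chain = []
    current: Optional[str] = cls
    while current is not None:
        chain.append(current)
        current = DOMAIN_PARENTS.get(current)
    merged: Dict[str, bool] = {}
    for name in reversed(chain):
        merged.update(APP_ANNOTATIONS.get(name, {}))
    return merged
```

Here upper-ontology classes stay hidden from users while Human Study and its subclasses are surfaced, without any change to the domain layer itself.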

Ontology-driven development: process observations

Developing ontology-driven applications requires close collaboration between software developers and ontologists

•  Separation of concerns principle

•  Process for owning, editing and annotating ontology files

•  Annotations with a pure UI goal that require domain knowledge can be problematic

The applications provide ontology developers with a mechanism to rapidly test and refine their models for different usage scenarios

•  Data collection

•  Data retrieval

Challenges of producing and consuming linked data

Producing Linked Data

•  Need to enforce ontology constraints

•  ETL: in addition to producing ontology-conforming class instances, ETL processes need to inter-link them
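The two producer-side points can be sketched together in Python: reject rows that violate a constraint, then emit triples that link each resource to a shared instance URI. Everything named here is hypothetical (the `example.org` namespaces, the `located_in` property, the required field list and the row layout); only the `rdf:type` and `rdfs:label` URIs are standard.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

BASE = "http://example.org/i/"  # hypothetical instance namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
LOCATED_IN = "http://example.org/ontology/located_in"  # hypothetical property
REQUIRED = ("name", "type", "lab")  # hypothetical required fields


def uri_for(kind: str, name: str) -> str:
    """Mint a stable URI so the same lab name always yields the same node."""
    return f"{BASE}{kind}/{name.strip().lower().replace(' ', '-')}"


def rows_to_triples(rows: Iterable[Dict[str, str]]) -> List[Triple]:
    """Validate each source row, then emit interlinked triples for it."""
    triples: List[Triple] = []
    for row in rows:
        missing = [f for f in REQUIRED if not row.get(f)]
        if missing:
            raise ValueError(f"row {row!r} is missing required fields: {missing}")
        resource = uri_for("resource", row["name"])
        lab = uri_for("lab", row["lab"])
        triples.append((resource, RDF_TYPE, row["type"]))
        triples.append((resource, RDFS_LABEL, row["name"]))
        triples.append((resource, LOCATED_IN, lab))  # the inter-link
    return triples
```

Two rows that name the same lab end up pointing at one lab URI, which is what turns a set of isolated records into linked data.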

Consuming Linked Data

•  Need to view the data through an ontology lens

•  Filter out administrative and non-conforming triples
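A consumer-side filter of this kind might look like the Python sketch below: keep a triple only if its predicate is known to the ontology and does not belong to an administrative namespace. The administrative namespace and the sample predicates are placeholders, and a real consumer would take the set of known predicates from the loaded ontology rather than from a literal.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace for bookkeeping triples (workflow state, owners...).
ADMIN_PREFIXES: Tuple[str, ...] = ("http://example.org/admin/",)


def ontology_view(
    triples: Iterable[Triple],
    known_predicates: Set[str],
    admin_prefixes: Tuple[str, ...] = ADMIN_PREFIXES,
) -> List[Triple]:
    """Drop administrative triples and triples the ontology does not define."""
    kept = []
    for subject, predicate, obj in triples:
        if predicate.startswith(admin_prefixes):
            continue  # administrative metadata, not domain data
        if predicate not in known_predicates:
            continue  # non-conforming: predicate unknown to the ontology
        kept.append((subject, predicate, obj))
    return kept
```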

Concluding remarks

eagle-i is a proof-of-concept system

•  A software suite

•  A network of institutions

•  An operational system with curated data

The eagle-i software and know-how are applicable to other problem spaces and domains

•  Ontology-driven framework goal: instantiate software stack for any ontology

•  No code changes to core framework

•  Annotate new domain ontology with eagle-i application ontology

eagle-i coming soon to open.med.harvard.edu

Demo scenarios

Overview

o  Scenario description

o  Entry of data into the Web Tool

o  Curation and publishing of data

o  Searching on data in the repository

o  How ontology integration makes resources visible

Scenario

Primary Scenario: Relapsing Fever – Host-Pathogen Interactions & Human Exposure

Dr. Olivier Lucas studies mechanisms of and ecological risk for infection with Borrelia hermsii, the tick-borne Relapsing Fever agent. He believes he has identified a role for IL-17 in disease resolution in a mouse model and would like to examine contributing immune cell populations. He’s also hoping to begin a study assessing B. hermsii exposure/seroconversion within rural populations in Montana. Lastly, he has received some departmental funds to support a work-study position in his lab.

Dr. Lucas wants to…

1. Advertise his vacant work-study research opportunity.

2. Obtain an IL-17 receptor antibody for his mouse work.

3. Locate a source of human biospecimens from MT for his seroconversion study.

Supporting Scenarios

Supporting Scenario A: Mucosal Immunity and Th17 Populations

Dr. David Pascual studies mucosal immunity and contributing T cell populations. He has developed a monoclonal antibody for the IL-17 receptor and, now that this work has been published, would like to share his antibody.

Dr. Pascual wants to…

1. Advertise his IL-17 receptor mAb to potential collaborators.

Supporting Scenario B: Lipid Profiles and Cardiovascular Disease Risk

Dr. Donna Williams is a human health researcher studying cardiovascular disease risk factors in rural, geographically isolated Montana communities. Some time ago she completed a study in which blood draws were obtained to assess total lipid profiles. Sera from these individuals were collected and frozen for a potential analysis of inflammatory mediators, but she has since shifted her research focus.

Dr. Williams wants to…

1. Put these frozen sera to good use.
