SONet (Scientific Observations Network) and OBOE (Extensible Observation Ontology):

Mark Schildhauer, Director of Computing, National Center for Ecological Analysis and Synthesis, Univ. Calif, Santa Barbara

TDWG 2008, Fremantle AU Oct. 19-25

Facilitating data interoperability within the environmental and ecological sciences through advanced semantic approaches

Motivation

An oncoming deluge of ecological data…

Motivation

And locating desired information is already quite difficult…

…Why is this?

Motivation

Ecological data are highly heterogeneous…

• Variable syntax (csv, xls)

• Structures (tables, rasters, hierarchical)

• Semantics (terminology, units, methods)

• Derived from many disciplines: genomic, cellular, physiology, morphology, biodiversity, populations, communities, ecosystems

• Need for abiotic data too: hydrology, geospatial, climatology

Our Semantic Approach

Climbing the semantic ladder:

Ontologies

Semantic Annotations

Metadata

Data

Our Semantic Approach

Method for linking elements of data objects (e.g., columns in a table) to consistent and potentially rich sets of concepts

Semantic Annotations link EML attributes to concepts defined in a Formal Ontology

Store and retrieve annotations and ontologies in Metacat
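The annotation idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the EML or Metacat format: the dataset names, column names, and `example:` concept identifiers below are all hypothetical. The point is that once columns in different files are annotated with the same concept, they can be found together even though their local names differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Links one attribute (column) of a dataset to an ontology concept."""
    dataset: str    # hypothetical dataset identifier
    attribute: str  # column name as it appears in the metadata
    concept: str    # hypothetical ontology concept identifier


# Hypothetical annotations for two tables that name their columns differently.
ANNOTATIONS = [
    Annotation("plots_2007.csv", "ht", "example:TreeHeight"),
    Annotation("plots_2007.csv", "spp", "example:TaxonName"),
    Annotation("survey_b.xls", "height_m", "example:TreeHeight"),
]


def attributes_for(concept, annotations):
    """Return (dataset, attribute) pairs annotated with the given concept."""
    return [(a.dataset, a.attribute) for a in annotations if a.concept == concept]


if __name__ == "__main__":
    print(attributes_for("example:TreeHeight", ANNOTATIONS))
```

In a real system the annotations would be stored alongside the metadata and resolved against a formal ontology rather than compared as plain strings.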

Document Relationships (semantic annotation)

OBOE Quick Overview

Extensible Observation Ontology (OBOE)

Based on the assumption that much of scientific data consists of observations

OBOE provides a high-level abstraction of scientific observations and measurements

Enables data (or metadata) structures to be linked to domain-specific ontology concepts

OBOE: Extensible Observation Ontology

Slide from Josh Madin

Observation Based Structured Query

• Both datasets contain “tree lengths”

• Annotation search for “tree length” would return both datasets

• Structured search allows the search to be limited by the observed entity (e.g. a tree or a tree branch)

• Increase precision and recall
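The difference between the two kinds of search can be sketched as follows. This is a minimal illustration under stated assumptions: the dataset names, the `entity` and `characteristic` fields, and both search functions are hypothetical and are not the OBOE or Metacat query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurementAnnotation:
    """Records what was observed and which characteristic was measured."""
    dataset: str         # hypothetical dataset identifier
    attribute: str       # column name in the dataset
    entity: str          # the observed entity, e.g. a tree or a tree branch
    characteristic: str  # the measured characteristic, e.g. length


# Two hypothetical datasets that both contain "tree lengths".
CATALOG = [
    MeasurementAnnotation("dataset_A", "len", "Tree", "Length"),
    MeasurementAnnotation("dataset_B", "len", "TreeBranch", "Length"),
]


def keyword_search(term, catalog):
    """Unstructured search: every word of the term must appear in the annotation text."""
    words = term.lower().split()
    hits = []
    for m in catalog:
        text = f"{m.entity} {m.characteristic}".lower()
        if all(word in text for word in words):
            hits.append(m.dataset)
    return hits


def structured_search(entity, characteristic, catalog):
    """Structured search: the observed entity and the characteristic must both match."""
    return [
        m.dataset
        for m in catalog
        if m.entity == entity and m.characteristic == characteristic
    ]


if __name__ == "__main__":
    print(keyword_search("tree length", CATALOG))        # both datasets match
    print(structured_search("Tree", "Length", CATALOG))  # only the whole-tree dataset
```

Restricting the match to the observed entity is what removes the tree-branch dataset from the result, which is the precision gain the slide refers to.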

Emerging Observational Data Models

SONet: A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of Environmental and Ecological Data

Project Organizers

Mark Schildhauer (1), Shawn Bowers (2), Corina Gries (3), Deborah McGuinness (4), Philip Dibner (5), Josh Madin (6), Matt Jones (1), Luis Bermudez (7), John Graybeal (7)

(1) NCEAS, UC Santa Barbara; (2) UC Davis Genome Center; (3) CAP/LTER and Univ. of Arizona; (4) McGuinness Associates; (5) OGC Interoperability Institute; (6) Macquarie University; (7) Monterey Bay Aquarium Research Institute

Motivation

MANY different “semantic” efforts underway in earth/biodiversity/environmental sciences, all converging on use of OBSERVATIONAL data construct

SPECIALIZED needs and concerns of different domains may drive semantic technology solutions to be diverse and incompatible

OPPORTUNITY exists for communicating and coordinating among different domains to achieve greater interoperability of emerging semantic technology solutions

BENEFIT is providing cross-disciplinary scientists with more seamless and powerful access to a broad range of relevant data and information

Objectives of SONet

Broad Objectives

Address semantic interoperability issues in environmental and ecological data [sharing, discovery, integration]

Build a network of practitioners (SONet), including domain scientists, computer scientists, and information managers

Build generic, cross-disciplinary data interoperability solutions

Immediate Goals to Develop

An extensible and open observations data model to unify existing domain-specific approaches

A semantic (ontology) framework for scientific terminology, and corresponding domain extensions

Demonstration prototypes using these to address current interoperability issues

Working Groups

Subgroup 1: Core Data Model for Observations

Subgroup 2: Catalog of Common Field Observations

Subgroup 3: Scientist-Oriented Term Organization

Subgroup 4: Demonstration Projects

Subgroup 1

• Collect interoperability requirements

• Define common, unified data model

• Engage tool & data providers, data consumers

Subgroup 2

• Identify and catalog common observation types (semantics)

• Engage data providers and information managers

Subgroup 3

• Define general extension ontologies of scientific terms

• Focus work on outputs of group 2

• Engage range of domain scientists

Subgroup 4

• Define and prototype demonstration projects

• Ensure compatibility of subgroups

• Each group consists of two team leads

• Postdoc funded to work on demonstration projects & help ensure compatibility across subgroups

Core SONet Team

Workshops & Outreach

Community workshops

… to bring together project members, data managers, domain scientists, computer scientists, and members of the larger environmental informatics community

Workshop 1: Collect detailed requirements and use cases for each SONet subgroup

Workshop 2: Refine and extend use cases; Discuss and evaluate proposed data models and representations

Workshop 3: Present and discuss refined data models and representations; early evaluation and feedback

Workshop 4: Training; discuss and plan SONet sustainability

… continue from prior NSF workshop on observation data models

… approximately 20-25 participants at each workshop

Initial Project Timeline

Workshops and meetings:

Year 1: first community workshop, project meeting

Year 2: second community workshop, project meeting

Year 3: last two community workshops, including training

Project has just recently officially started

[Timeline figure spanning Year 1, Year 2, Year 3]

Meetings shown:

• Project Leaders Meeting (1): orientation & planning

• Project Leaders Meeting (2): evaluation & planning

• Community Workshop (1): requirements & use cases

• Community Workshop (2): use cases & modeling

• Community Workshop (3): modeling & refinement

• Community Workshop (4): training, sustainability

Activities shown between meetings:

• Set up project mgmt. infrastructure, postdoc hiring

• Finalize community participants, meeting preparation

• Document results, begin implementation & interoperability tests, set up network website

• Document results, continue impl. & interop. tests

• Continue impl. & interop. tests, meeting preparation

• Finalize impl. & interop. tests, sustainability planning

• Document results, execute plan for sustainability

Observation standards for review

Opportunity for Collaboration

TDWG community interests and SONet?

Observations and Specimen Records Interest Group

Observations Task Group

Contact Steve Kelling (OSR); Matt Jones (Observations Task Group)

Biological Descriptions Interest Group (SDD)