SONet (Scientific Observations Network) and OBOE (Extensible Observation Ontology):
Mark Schildhauer, Director of Computing, National Center for Ecological Analysis and Synthesis, Univ. Calif., Santa Barbara
TDWG 2008, Fremantle AU Oct. 19-25
Facilitating data interoperability within the environmental and ecological sciences through
advanced semantic approaches
Motivation
Ecological data are highly heterogeneous…
• Variable syntax (csv, xls)
• Structures (tables, rasters, hierarchical)
• Semantics (terminology, units, methods)
• Derived from many disciplines: genomic, cellular, physiology, morphology, biodiversity, populations, communities, ecosystems
• Need for abiotic data too: hydrology, geospatial, climatology
Our Semantic Approach
Method for linking elements of data objects (e.g., columns in a table) to consistent and potentially rich sets of concepts
Semantic Annotations link EML attributes to concepts defined in a Formal Ontology
Store and retrieve annotations and ontologies in Metacat
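The idea of linking a column to a concept can be illustrated with a minimal sketch. Everything named here is hypothetical: the dataset names, the column names, the `example:` concept identifiers, and the `attributes_for` helper are made up for illustration and do not reflect the actual EML annotation syntax or the Metacat API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    dataset: str    # hypothetical dataset identifier
    attribute: str  # column name as it appears in the dataset's metadata
    concept: str    # hypothetical ontology concept identifier


# Two datasets name the same kind of column differently,
# but both columns are annotated with the same concept.
annotations = [
    Annotation("plots_2007", "ht_cm", "example:Height"),
    Annotation("plots_2007", "sp", "example:TaxonName"),
    Annotation("survey_a", "height", "example:Height"),
]


def attributes_for(concept, annotations):
    """Return (dataset, attribute) pairs annotated with the given concept."""
    return [(a.dataset, a.attribute) for a in annotations if a.concept == concept]


print(attributes_for("example:Height", annotations))
```

Because the lookup goes through the concept rather than the column name, differently named columns can be found together.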
OBOE Quick Overview
Extensible Observation Ontology (OBOE)
Based on the assumption that much of scientific data consists of observations
OBOE provides a high-level abstraction of scientific observations and measurements
Enables data (or metadata) structures to be linked to domain-specific ontology concepts
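As a rough illustration of an observation-centered abstraction, the sketch below models an observation of some entity together with the measurements taken on it. The class and field names (`Observation`, `Measurement`, `characteristic`, `unit`) are assumptions chosen for this example; the slides do not spell out the ontology's classes, so this should not be read as the OBOE definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Measurement:
    characteristic: str  # what was measured, e.g. "Length" (illustrative)
    value: float
    unit: str


@dataclass
class Observation:
    entity: str  # the thing observed, e.g. "Tree" (illustrative)
    measurements: List[Measurement] = field(default_factory=list)


def describe(observation):
    """Render each measurement of an observation as a readable string."""
    return [
        f"{observation.entity} {m.characteristic} = {m.value} {m.unit}"
        for m in observation.measurements
    ]


# One row of a data table could be read as one observation.
obs = Observation("Tree", [Measurement("Length", 12.5, "meter")])
print(describe(obs))
```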
Observation Based Structured Query
• Both datasets contain “tree lengths”
• Annotation search for “tree length” would return both datasets
• Structured search allows the search to be limited by the observed entity (e.g. a tree or a tree branch)
• Increase precision and recall
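The difference between the two kinds of search can be sketched with a toy catalog. The dataset names, the `ColumnAnnotation` type, and both search functions are hypothetical and only mirror the tree-length example; they are not an existing query interface.

```python
from typing import NamedTuple


class ColumnAnnotation(NamedTuple):
    label: str           # free-text annotation, e.g. "tree length"
    entity: str          # observed entity (illustrative names)
    characteristic: str  # measured characteristic


catalog = {
    "dataset_A": [ColumnAnnotation("tree length", "Tree", "Length")],
    "dataset_B": [ColumnAnnotation("tree length", "TreeBranch", "Length")],
}


def annotation_search(catalog, label):
    """Match on the annotation label only."""
    return sorted(
        name for name, cols in catalog.items()
        if any(c.label == label for c in cols)
    )


def structured_search(catalog, entity, characteristic):
    """Match on the observed entity as well as the characteristic."""
    return sorted(
        name for name, cols in catalog.items()
        if any(c.entity == entity and c.characteristic == characteristic for c in cols)
    )


print(annotation_search(catalog, "tree length"))
print(structured_search(catalog, "Tree", "Length"))
```

In this toy setup the label search returns both datasets, while restricting the observed entity to a whole tree returns only the first.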
SONet: A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of
Environmental and Ecological Data
Project Organizers
Mark Schildhauer1, Shawn Bowers2, Corina Gries3, Deborah McGuinness4, Philip Dibner5, Josh Madin6,
Matt Jones1, Luis Bermudez7, John Graybeal7
1NCEAS UC Santa Barbara, 2UC Davis Genome Center, 3CAP/LTER and Univ. of Arizona, 4McGuinness Associates,
5OGC Interoperability Institute, 6Macquarie University, 7Monterey Bay Aquarium Research Institute
Motivation
MANY different “semantic” efforts underway in earth/biodiversity/environmental sciences, all converging on use of OBSERVATIONAL data construct
SPECIALIZED needs and concerns of different domains may drive semantic technology solutions to be diverse and incompatible
OPPORTUNITY exists for communicating and coordinating among different domains to achieve greater interoperability of emerging semantic technology solutions
BENEFIT is providing cross-disciplinary scientists with more seamless and powerful access to a broad range of relevant data and information
Objectives of SONet
Broad Objectives
Address semantic interoperability issues in environmental and ecological data [sharing, discovery, integration]
Build a network of practitioners (SONet), including domain scientists, computer scientists, and information managers
Build generic, cross-disciplinary data interoperability solutions
Immediate Goals to Develop
An extensible and open observations data model to unify existing domain-specific approaches
A semantic (ontology) framework for scientific terminology, and corresponding domain extensions
Demonstration prototypes using these to address current interoperability issues
Working Groups
Subgroup 1: Core Data Model for Observations
Subgroup 2: Catalog of Common Field Observations
Subgroup 3: Scientist-Oriented Term Organization
Subgroup 4: Demonstration Projects
Subgroup 1: Collect interoperability requirements; define common, unified data model; engage tool & data providers, data consumers
Subgroup 2: Identify and catalog common observation types (semantics); engage data providers and information managers
Subgroup 3: Define general extension ontologies of scientific terms; focus work on outputs of group 2; engage range of domain scientists
Subgroup 4: Define and prototype demonstration projects; ensure compatibility of subgroups
• Each group consists of two team leads
• Postdoc funded to work on demonstration projects & help ensure compatibility across subgroups
Core SONet Team
Workshops & Outreach
Community workshops
… to bring together project members, data managers, domain scientists, computer scientists, and members of the larger environmental informatics community
Workshop 1: Collect detailed requirements and use cases for each SONet subgroup
Workshop 2: Refine and extend use cases; Discuss and evaluate proposed data models and representations
Workshop 3: Present and discuss refined data models and representations; early evaluation and feedback
Workshop 4: Training; discuss and plan SONet sustainability
… continue from prior NSF workshop on observation data models
… approximately 20-25 participants at each workshop
Initial Project Timeline
Workshops and meetings:
Year 1: first community workshop, project meeting
Year 2: second community workshop, project meeting
Year 3: last two community workshops, including training
Project has just recently officially started
[Figure: project timeline across Year 1, Year 2, Year 3]
Milestones: Project Leaders Meeting 1 (orientation & planning); Project Leaders Meeting 2 (evaluation & planning); Community Workshop 1 (requirements & use cases); Community Workshop 2 (use cases & modeling); Community Workshop 3 (modeling & refinement); Community Workshop 4 (training, sustainability)
Activities: set up project management infrastructure, postdoc hiring; finalize community participants, meeting preparation; document results, begin implementation & interoperability tests, set up network website; document results, continue implementation & interoperability tests; continue implementation & interoperability tests, meeting preparation; finalize implementation & interoperability tests, sustainability planning; document results, execute plan for sustainability