Page 1: The Semantic Web for Spatial Data Search Femke Reitsma University of Maryland – College Park femke@geog.umd.edu

The Semantic Web for Spatial Data Search

Femke Reitsma

University of Maryland – College Park femke@geog.umd.edu


Why Ontologies: Could the Semantic Web Meet Discovery Challenges?

“The semantic web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” (Tim Berners-Lee et al. 2001)

The semantic web makes web pages machine “understandable” rather than just human understandable.


Semantic web components:

1. Basic components:

ontology + semantically marked-up web page = semantic web

2. Moving up the semantic web layers:

(Figure: Semantic Web layers presented by Tim Berners-Lee, annotated "We are here")


Semantic Web Languages

• Primary languages:

– RDF (Resource Description Framework)

– RDFS (Resource Description Framework Schema)

– OWL (Web Ontology Language)

• Historical development:

– XML provides the basic syntax

– RDF and RDFS add some tags to XML

– DAML+OIL adds some tags to RDF

– OWL extends and (almost) replaces DAML+OIL


What is an ontology?

Big “O” Ontology vs little “o” ontology:

Ontology = metaphysics, the essence of being, reality

ontology = “a logical theory which gives an explicit, partial account of a conceptualization” (Guarino and Giaretta, 1995)


What does an ontology look like?


What does a semantic web page look like?


Ontology ↔ Semantic Content

Ontology: Dublin Core ontology

Semantic Web page: http://owl.mindswap.org
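
To make the pairing of an ontology with semantic content concrete, here is a minimal Python sketch of how a program might read Dublin Core statements out of a page's RDF/XML markup. Only the page URL comes from the slide. The markup itself, including the title and creator values, is a hypothetical placeholder, and `extract_triples` is an illustrative helper rather than part of any library.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical markup: only the page URL is taken from the slide; the
# title and creator values are placeholders for illustration.
PAGE_MARKUP = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://owl.mindswap.org">
    <dc:title>Example page title</dc:title>
    <dc:creator>Example author</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) triples for literal-valued properties."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # ElementTree writes qualified tags as "{namespace}local";
            # dropping the braces yields the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


triples = extract_triples(PAGE_MARKUP)
for triple in triples:
    print(triple)
```

Each printed triple states one fact about the page in terms defined by the ontology (here, the Dublin Core `title` and `creator` properties), which is what makes the content machine-parsable.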


ontology + semantic web =

• Computer parsable

• Inference ability

State code > city code > address code

A computer agent could deduce that a Cornell University address, being in Ithaca, must be in New York State, which is in the U.S., and therefore must be formatted to U.S. standards.
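
The deduction in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the reasoning engine of any particular tool: the containment facts come from the slide, while the `LOCATED_IN` table and the function names are made up for the sketch.

```python
# Containment facts from the slide's example. The dictionary stands in for
# "locatedIn" statements that an ontology would hold (hypothetical encoding).
LOCATED_IN = {
    "Cornell University": "Ithaca",
    "Ithaca": "New York State",
    "New York State": "U.S.",
}


def containing_regions(place, located_in):
    """Follow locatedIn links upward and return every enclosing region, nearest first."""
    regions = []
    seen = {place}
    while place in located_in:
        place = located_in[place]
        if place in seen:  # guard against cyclic data
            break
        seen.add(place)
        regions.append(place)
    return regions


def address_standard(place, located_in):
    """Choose an address format from the inferred country (only the U.S. is handled)."""
    if "U.S." in containing_regions(place, located_in):
        return "U.S. standard"
    return "unknown"


regions = containing_regions("Cornell University", LOCATED_IN)
print(regions)
print(address_standard("Cornell University", LOCATED_IN))
```

Nothing in the data says directly that Cornell University is in the U.S.; that fact is derived by chaining the three stated relationships, which is the kind of inference the slide describes.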


Objective:

Explore the potential of the Semantic Web for distributing spatial data


Current GCMD Search

North America?

2950 records matched your query


Future GCMD Search

North America?

North America [2950]

Limit search by:

- Spatial resolution

- Temporal resolution

- GCMD keywords

Explore results by:

- Canada [1348]

- USA [1602]

- GCMD keywords

- ……

Key = ability to determine relationships between keywords without explicitly encoding them
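
As a rough illustration of the "explore results by" idea, the Python sketch below rolls record counts up a location hierarchy. The counts are the ones shown on the slide. The `PART_OF` table and both function names are hypothetical, and the sketch assumes each record is tagged with exactly one leaf location so that rolled-up counts do not double count.

```python
from collections import defaultdict

# partOf relationships that a location ontology might hold (hypothetical encoding).
PART_OF = {"Canada": "North America", "USA": "North America"}

# Records tagged directly with each location keyword (counts from the slide).
DIRECT_COUNTS = {"Canada": 1348, "USA": 1602}


def rolled_up_counts(direct_counts, part_of):
    """Credit each keyword's records to the keyword itself and to every ancestor."""
    totals = defaultdict(int)
    for keyword, count in direct_counts.items():
        node = keyword
        visited = set()
        while node is not None and node not in visited:
            visited.add(node)
            totals[node] += count
            node = part_of.get(node)  # None once the top of the hierarchy is reached
    return dict(totals)


def explore(keyword, part_of, totals):
    """List the direct children of a keyword with their rolled-up counts."""
    return {
        child: totals.get(child, 0)
        for child, parent in part_of.items()
        if parent == keyword
    }


totals = rolled_up_counts(DIRECT_COUNTS, PART_OF)
print(totals["North America"])
print(explore("North America", PART_OF, totals))
```

A query for "North America" matches records tagged only "Canada" or "USA" because the relationship lives in the ontology, not in each record's keyword list.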


Progressing Towards Level 1:

(Diagram labels: North America?, GCMD Database, Sesame Ontology, Java Application)

Sesame = Open Source RDF Schema-based Repository and Querying facility


Keywords → Ontologies

Keywords: Projects, Sensors, Sources, Locations, IDN Nodes, Data Centers, Science Keywords, Services Keywords, URL Content Types, Chronostratigraphic Units

• Careful specification of relationships is important for the ontology.

CATEGORY > TOPIC > TERM > VARIABLE

e.g. Hydrosphere > Ground Water > Saltwater Intrusion

• For the purposes of the Semantic Web, the keyword structure may need modification.

e.g. the Variable Fetch is a measurable property of the Term Ocean Waves; however, the Variable Fisheries is a sub-topic of the Term Agricultural Aquatic Sciences.
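
The difference between an untyped keyword path and explicitly typed relationships can be sketched as follows. This is only an illustration: the relation names `measurablePropertyOf` and `subTopicOf` and both helper functions are hypothetical, chosen to mirror the wording on the slide.

```python
def path_to_pairs(path):
    """Turn 'A > B > C' into untyped (child, parent) pairs."""
    parts = [part.strip() for part in path.split(">")]
    return [(child, parent) for parent, child in zip(parts, parts[1:])]


# The ">" separator says only that one keyword sits below another.
pairs = path_to_pairs("Hydrosphere > Ground Water > Saltwater Intrusion")
print(pairs)

# In an ontology the relationship itself is named. Relation names here are
# hypothetical labels for the two cases described on the slide.
TRIPLES = [
    ("Fetch", "measurablePropertyOf", "Ocean Waves"),
    ("Fisheries", "subTopicOf", "Agricultural Aquatic Sciences"),
]


def objects_of(subject, relation, triples):
    """Return every object linked to the subject by the given relation."""
    return [o for s, r, o in triples if s == subject and r == relation]


print(objects_of("Fetch", "measurablePropertyOf", TRIPLES))
print(objects_of("Fisheries", "measurablePropertyOf", TRIPLES))
```

With typed triples, a query for measurable properties returns Fetch but not Fisheries, a distinction the flat Term > Variable path cannot express.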


DIF Schema

• XSLT style sheet to create DIF schema in Semantic Web language

• Mapping terms to ontology

• Avoiding a monolithic ontology by mapping terms to other ontologies

– e.g. Dublin Core


DIFs

• XSLT style sheet to convert DIFs to Semantic Web language

• Mapping terms to ontology and DIF Schema

• Recording keywords of finest granularity

• Avoiding a monolithic ontology by mapping terms to other ontologies

– e.g. Dublin Core, Cyc, DAML-time
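
The conversion step can be illustrated with a short Python sketch. It is a stand-in for the XSLT style sheet mentioned above, not a reproduction of it: Python's standard library has no XSLT processor, so the mapping is written directly with `xml.etree.ElementTree`. The DIF record is a simplified, hypothetical example, the element names and the `http://example.org/dif/` base URI are assumptions, and the choice of Dublin Core properties is one possible mapping.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Simplified, hypothetical DIF record; element names and values are
# assumptions for illustration, not the full DIF schema.
DIF_RECORD = """<DIF>
  <Entry_ID>EXAMPLE_ENTRY_01</Entry_ID>
  <Entry_Title>Example data set title</Entry_Title>
  <Parameters>
    <Topic>Hydrosphere</Topic>
    <Term>Ground Water</Term>
    <Variable>Saltwater Intrusion</Variable>
  </Parameters>
</DIF>"""


def dif_to_rdf(dif_xml, base_uri):
    """Map a DIF-like record onto Dublin Core properties and return RDF/XML text."""
    dif = ET.fromstring(dif_xml)
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)

    root = ET.Element(f"{{{RDF}}}RDF")
    entry_id = dif.findtext("Entry_ID", default="").strip()
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": base_uri + entry_id}
    )
    ET.SubElement(description, f"{{{DC}}}identifier").text = entry_id
    ET.SubElement(description, f"{{{DC}}}title").text = dif.findtext(
        "Entry_Title", default=""
    ).strip()

    # Record only the keyword of finest granularity; broader keywords can be
    # recovered from the ontology instead of being repeated in every record.
    for parameters in dif.findall("Parameters"):
        levels = [parameters.findtext(tag) for tag in ("Topic", "Term", "Variable")]
        present = [level.strip() for level in levels if level and level.strip()]
        if present:
            ET.SubElement(description, f"{{{DC}}}subject").text = present[-1]

    return ET.tostring(root, encoding="unicode")


rdf_xml = dif_to_rdf(DIF_RECORD, "http://example.org/dif/")
print(rdf_xml)
```

Reusing Dublin Core for the identifier, title, and subject is one way to avoid a monolithic ontology: only terms with no counterpart elsewhere would need to be defined in the DIF Schema itself.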


Sesame:

• Middleware

• GUI or API

• Database: PostgreSQL or Oracle

(Figure: Sesame architecture. Labels, top to bottom: Client 1, Client 2, Client 3, connected via HTTP, HTTP, and SOAP; inside Sesame: HTTP Protocol Handler and SOAP Protocol Handler; Request Router; Admin Module, Query Module, Export Module; Repository Abstraction Layer; GCMD Repository containing RDF DIF files, Ontologies, and DIF Schema.)


Advantages for the GCMD

• Semantic Web presents database structure in a machine parsable format

• Ability to search for the semantic relationships among any DIF terms within the ontology

• Do not need to change the database structure when new classes and relationships are added

• Real advantages = when ontology is enriched


UWG Assistance

• Gene Major

– How do we handle the scalability issue with regard to populating the DIFs? We now have over 15,000 entries; updates require much work to (a) determine whether the data is still viable and (b) make revisions. Database revisions such as phone numbers are easy; content revisions are more labor intensive.

– How do we handle the same data sets being delivered from multiple systems (not data centers), such as OPeNDAP, NOMADS, and THREDDS? All may deliver the same data set, but how do we point to all those catalogs? How do we index the DIFs to do that? We could use Related_URL, but is that the right solution?

– How can we get interaction between data sets and publishers? In other words, what mechanisms can we use to link data sets with the current literature?

– How do you feel about potential privacy concerns over contact information within GCMD DIFs/SERFs?


• Stephanie Leicester

– Suggestions about how to encourage DIF Authors to review and update their records regularly

– Direction and guidance on developing a metadata standard for archived samples

• Heather Weir

– Suggestions about how to increase the number of SERFs

– Direction and guidance with the Learning Center and Astronomy keywords

• Scott Ritz

– To spread the word in their community (to data users and producers).

– Encourage data holders they meet to submit metadata.

– Closer interaction with Science Coordinators: new data notifications, contacts.


• Monica Holland

– Continue to spread the word about GCMD.

• Cheryl Solomon

– Suggest University sources for ecological datasets

– Suggest international sources for metadata

• Tyler Stevens

– What direction we should take with GIS in the GCMD.

– More GIS contacts to work with to increase GIS within the GCMD