Ewa Deelman, www.isi.edu/~deelman Virtual Metadata Catalogs: Augmenting Existing Metadata Catalogs with Semantic Representations Yolanda Gil, Varun Ratnakar, and Ewa Deelman USC Information Sciences Institute


Page 1

Virtual Metadata Catalogs: Augmenting Existing Metadata Catalogs with Semantic Representations

Yolanda Gil, Varun Ratnakar, and Ewa Deelman

USC Information Sciences Institute

Pages 2-5

[Figure, repeated on each of these four pages: diagram of scientific analysis. Labels: Data and/or Analysis Discovery; Analysis Definition and Mapping; Execution; Data, Metadata and Provenance Management; Scientific Analysis; Analysis Description; Raw Data; Derived Data; Metadata; Provenance.]

Page 6

[Figure: one group of labels shown four times: Raw Data, Metadata, Derived Data, Metadata, Provenance, Analysis Description.]

Page 7

[Figure: one group of labels shown four times: Raw Data, Metadata, Derived Data, Metadata, Provenance, Analysis Description.]

Problem: How to find the data in this environment?

How to specify the characteristics of the data (the metadata attributes)?

How to manage distributed provenance records?

Today, clients must manually work out what the attributes mean, identify which ones are relevant to their query, and formulate the queries themselves

Page 8

Approach

Expose the information in the catalogs at a semantic level

Expose the attributes

Provide access to the content based on a user-selected ontology

Support semantic queries to the catalog contents

Page 9

The Virtual Metadata Catalog

Augments the existing metadata catalogs with semantic representations

Does not modify the original sources

Maps virtual metadata attributes to the original attributes

Represents virtual attributes declaratively

Represents constraints as relations between attributes

Expands and translates a query to the Virtual Catalog into the original attributes

Explored the approach with temporal attributes

Page 10

[Figure: architecture diagram. Components: Virtual Metadata Catalog Service (Virtual Metadata, Mappings); Metadata Catalog Service (Metadata Catalog, Metadata Attributes); Query Mediator; Shared ontologies & vocabularies. Query paths: Queries with virtual metadata attributes; Queries with metadata attributes; Queries to multiple catalogs; Queries to a specific metadata catalog; Queries to other metadata catalogs.]

Page 11

Metadata Catalog Service (MCS)

Models logical files, logical collections, and views

Provides a standard set of attributes for each object type

Supports the dynamic definition of attributes of type string, integer, float, date

Provides a command-line interface and a Java API

Used in the Pegasus portal, SCEC, myLEAD

Page 12

Execution time

Wall clock time or CPU time?

Could be specified as:

begin-execution-time, end-execution-time

begin-execution-time, duration

Diversity of these attributes can be represented at a semantic level

Define the virtual attributes in OWL

Page 13

Gratuitous OWL Slide

<owl:Class rdf:ID="IntervalThing">
  <rdfs:subClassOf rdf:resource="#TemporalThing"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#from"/>
      <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#to"/>
      <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

<owl:ObjectProperty rdf:ID="from">
  <rdfs:domain rdf:resource="#TemporalThing"/>
  <rdfs:range rdf:resource="#InstantThing"/>
  <rdf:type rdf:resource="&owl;FunctionalProperty"/>
</owl:ObjectProperty>

<owl:DatatypeProperty rdf:ID="duration">
  <rdfs:domain rdf:resource="#TemporalThing"/>
  <rdfs:range rdf:resource="&xsd;duration"/>
</owl:DatatypeProperty>

Page 14

Add rules to represent more expressive relations and constraints

[r1: (?x rdf:type tme:IntervalThing), (?x tme:from ?a),
     (?x tme:duration ?t2), (?a tme:at ?t1),
     sum(?t1, ?t2, ?t3), makeTemp(?v)
     ->
     (?v rdf:type tme:InstantThing), (?v tme:at ?t3), (?x tme:to ?v)]
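As a rough illustration of what rule r1 derives, the sketch below computes the missing end instant of an interval from its start and duration. It is plain Python rather than a rule engine; the function names `parse_duration` and `derive_interval_end`, the dictionary layout, and the restriction to day/time durations are assumptions made for this example and are not part of the original system.

```python
import re
from datetime import datetime, timedelta

# Matches only the day/time subset of xsd:duration, e.g. "PT30S" or "P1DT2H".
# Year and month components are deliberately unsupported in this sketch.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse a day/time xsd:duration string into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings with no components at all, such as "P", "PT" or "P1DT".
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: float(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)


def derive_interval_end(interval: dict) -> dict:
    """Mimic rule r1: given 'from' and 'duration', add the derived 'to'."""
    if "from" in interval and "duration" in interval and "to" not in interval:
        start = datetime.fromisoformat(interval["from"])
        end = start + parse_duration(interval["duration"])
        return {**interval, "to": end.isoformat()}
    # Nothing to derive: return an unchanged copy.
    return dict(interval)


interval = derive_interval_end(
    {"from": "2004-01-01T10:00:00", "duration": "PT30S"}
)
print(interval["to"])
```

With the values used later in the query-mapping example (start 2004-01-01T10:00:00, duration PT30S), the derived end is 30 seconds after the start.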

Page 15

[Figure: Virtual Metadata Catalog architecture. A Query enters the Virtual Metadata Catalog, which contains a Reasoner, Query Mapping, the Generic Catalog Ontology (file, view, collection), and the Virtual Metadata Attributes and Mappings, and draws on Distributed domain ontologies. An MCS Query goes to the Metadata Catalog Service (MCS), which holds the Metadata Catalog and its Metadata Attributes; the Answer is returned.]

Page 16

Query Mapping

[Figure: query mapping pipeline. Input: OWL Query (virtual metadata attribute value pairs). Output: MCS Query (MCS attribute value pairs), sent to the Metadata Catalog Service (MCS), which returns the answer. Supporting components: Generic OWL + Rules Reasoner; Distributed domain ontologies; Virtual Metadata Attributes and Mappings.]

1. Generate query constituents

2. Convert to MCS attribute names

3. Construct query formula

Example annotations from the figure:

Virtual attributes in the query: “from 2004-01-01T10:00:00” and “duration PT30S”

MCS attributes: “startDate” and “endDate”

Load OWL ontologies and rules referenced in the query

Generate new attributes: “from 2004-01-01T10:00:00”, “to 2004-01-01T10:00:30Z”

Convert “from” to “startDate”; convert “to” to “endDate”

Convert from XML Schema to simple strings expected by MCS
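Steps 2 and 3 can be sketched in a few lines of Python, starting from the attribute-value pairs that step 1 has already expanded. Everything specific here is an assumption for illustration: the `VIRTUAL_TO_MCS` table, the target string format `YYYY-MM-DD HH:MM:SS`, and the `AND`-joined formula syntax are hypothetical and do not reflect the real MCS query interface.

```python
from datetime import datetime

# Hypothetical mapping from virtual attribute names to catalog attribute names.
VIRTUAL_TO_MCS = {"from": "startDate", "to": "endDate"}


def to_mcs_string(xsd_datetime: str) -> str:
    """Convert an xsd:dateTime literal to an assumed plain-string format."""
    # Drop a trailing 'Z' (UTC marker) so fromisoformat also works on
    # Python versions that do not accept it.
    naive = xsd_datetime[:-1] if xsd_datetime.endswith("Z") else xsd_datetime
    return datetime.fromisoformat(naive).strftime("%Y-%m-%d %H:%M:%S")


def convert_names(virtual_pairs: dict) -> dict:
    """Step 2: rename virtual attributes and convert their values."""
    return {
        VIRTUAL_TO_MCS[name]: to_mcs_string(value)
        for name, value in virtual_pairs.items()
    }


def construct_formula(mcs_pairs: dict) -> str:
    """Step 3: join attribute-value pairs into one conjunctive query string."""
    return " AND ".join(f"{name} = '{value}'" for name, value in mcs_pairs.items())


# Output of step 1 for the example above: 'to' was derived by the reasoner.
expanded = {"from": "2004-01-01T10:00:00", "to": "2004-01-01T10:00:30Z"}
query = construct_formula(convert_names(expanded))
print(query)
```

Keeping the name mapping in a table rather than in code mirrors the declarative style of the virtual attribute definitions: supporting another catalog would mean supplying a different table, not rewriting the translation logic.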

Page 17

Evaluation

Developed a prototype system

Performed queries across data sets from different domains

Climate modeling, earthquake science, workflow execution

Supported temporal queries using the OWL time ontology as the target ontology

Page 18

Discussion

Support only for the query functionality

Need to expand to

Data publication

Structuring data into collections

Supporting multiple catalogs

Need to support richer catalog structures

Page 19

Conclusions

Example of building semantic services on top of legacy catalogs

Provided customized views at the semantic level

Views are customized to a particular user-selected ontology

Need to expand the functionality to publication

Need to better address the handling of alternative data formats

Syntactic versus semantic transformations

Transformations between date/time formats, coordinate systems, etc.

Are these transformations workflows?