data in rdf

36
Applied Semantic Web Timely. Practical. Reliable. http://applied-semantic-web.org Emanuele Della Valle [email protected] http://emanueledellavalle.org Data in RDF

Upload: emanuele-della-valle

Post on 06-May-2015

6.083 views

Category:

Documents


11 download

TRANSCRIPT

Page 1: Data in RDF

Applied Semantic WebTimely. Practical. Reliable.http://applied-semantic-web.org

Emanuele Della [email protected]://emanueledellavalle.org

Data in RDF

Page 2: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Share, Remix, Reuse — Legally

This work is licensed under the Creative Commons Attribution 3.0 Unported License.

Your are free:

• to Share — to copy, distribute and transmit the work

• to Remix — to adapt the work

Under the following conditions

• Attribution — You must attribute the work by inserting– “© applied-semantic-web.org” at the end of each reused slide– a credits slide stating: these slides are partially based on

“Data in RDF” by Emanuele Della Valle http://applied-semantic-web.org/2010/03/02_RDF.ppt

To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/

2

Page 3: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org 3

Data Interchange: RDF

Page 4: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Looking for a flexible data model

Why• Application are always changing

(competitive environment) • People are always adding more features• Graceful evolution is important

Optimal: relational model• Relational model is remarkably flexible• Supports graceful evolution

– Change => Add another table– Existing queries are unaffected

• Easily accommodates new data – Without affecting existing queries

• Allows data to be easily combined ("joined") in new ways• 25+ years of relational database experience

4

Page 5: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Resource Description Framework

The adaptation of the relational model to the Web give rise to RDF

From tuples to triples

Any relational data can be represented as triples• Row Key --> Subject• Column --> Property• Value --> Value

5

Page 6: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Representing relational data in RDF (almost)

E.g., drug data

Represented in RDF (almost)

Drug Category Formula

BB.2  Anxiolytics

C16H21NO2

Drug Name

BB.2 Propranolol

BB.2 Propranololo

BB.2 プロプラノロール

BB.2

Anxiolytics C16H21NO2 Propranolol Propranololo プロプラノロール

CategoryFormula

Is a Drug

Legend

resource

literal

Name

6

Page 7: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Representing relational data in RDF (almost)

Two important problems• Once out of the database internal ID (e.g., BB.2) becomes

useless• Once out of the database internal names of schema element

(e.g., Category) becomes useless as well

RDF solves it by using URI• Internal ID should be replaced by URI• Internal schema names should be replaced by URI• Values do (always) not need to be URI-fied

http://dbpedia.org/resource/Propranolol

http://dbpedia.org/resource/Category:Anxiolytics

C16H21NO2

http://www.w3.org/2004/02/skos/core#subject

http://dbpedia.org/ontology/formula

http://www.w3.org/2000/01/rdf-schema#label

http://dbpedia.org/ontology/Drug

http://www.w3.org/1999/02/22-rdf-syntax-ns#type

Legend

resource

literal

7

Propranolol

Propranololo

プロプラノロール

Page 8: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Which URI should we use?• Popular ones! Data merge will take place automatically!

RDF in a nutshell

Representing data in RDF Q/A 1/4

http://dbpedia.org/resource/Propranolol

http://dbpedia.org/resource/Category:Anxiolytics

http://www.w3.org/2004/02/skos/core#subject

+http://dbpedia.org/resource/Propranolol

http://dbpedia.org/ontology/knownFor

http://dbpedia.org/resource/Propranolol

http://dbpedia.org/resource/Category:Anxiolytics

http://www.w3.org/2004/02/skos/core#subject

=http://dbpedia.org/ontology/knownFor

8

http://dbpedia.org/resource/James_W._Black

http://dbpedia.org/resource/James_W._Black

Page 9: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Where do I find popular URIs?• A difficult question with no clear answer• The best place to keep an eye on is

– the Linking Open Data Project http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

– and in particular the following pages of the Wiki- Data Sets

http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets

- Semantic Web Search Engines http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemanticWebSearchEngines

- Common Vocabularies http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies

• I normally use– Sindice API through sameas.org

- e.g. http://sameas.org/html?q=propranolol&x=0&y=0

RDF in a nutshell

Representing data in RDF Q/A 2/4

9

Page 10: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

What is a value? When shall we URI-fy a value? • Literals cannot be used to merge different data set• E.g., what if we what to merge two resources based on their

labels?

– BRCA may refer to different thing on the Webe.g., try http://en.wikipedia.org/wiki/BRCA

• URI-fy any value that can be eventually used to merge different dataset and leave the other values as literals

RDF in a nutshell

Representing data in RDF Q/A 3/4

BRCA

http://www.w3.org/2000/01/rdf-schema#label

BRCA

http://www.w3.org/2000/01/rdf-schema#label

+ = ?

10

Page 11: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Other data structure in RDF

Trees can be represented in RDF

Anything can be represented in RDF

11

Page 12: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF

Three alternatives to write triples, in1. RDF/XML: Standard serialization in XML

<Description about=”subject”>

<property>value</property>

</Description>– e.g., http://dbpedia.org/data/Propranolol.rdf – Check-out the triples using http://www.w3.org/RDF/Validator/

2. NTriples: Simple (verbose) reference serialization (for specifications only)

<http://...subject> <http://...predicate> “value” .– http://www.w3.org/RDF/Validator/ARPServlet?URI=http%3A%2F%2Fdbpe

dia.org%2Fresource%2FPropranolol&PARSE=Parse+URI%3A+&TRIPLES_AND_GRAPH=PRINT_TRIPLES&FORMAT=PNG_EMBED

• N3 and Turtle: Developer-friendly serializations :subject :property “value” .– e.g. http://dbpedia.org/data/Propranolol.n3

12

Page 13: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in Turtle - namespaces

URI terms can be abbreviated using namespaces@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .@prefix dbpedia: <http://dbpedia.org/resource/> .@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .

dbpedia:Propranolol rdf:type dbpedia-owl:Drug .

<http://www.w3.org/1999/ 02/22-rdf-syntax-ns#type> = 'a'

dbpedia:Propranolol a dbpedia-owl:Drug .

13

Page 14: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in Turtle - Literals

Literals: "Propranolol"• Literals with language tags: " プロプラノロール "@jp

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

dbpedia:Propranolol rdfs:label "Propranolol"@en . dbpedia:Propranolol rdfs:label "Propranololo"@it . dbpedia:Propranolol rdfs:label " プロプラノロール "@jp .

• Typed literals: "3.14"^^xsd:float

dbpedia:Propranolol dbpprop:molecularWeight "259.34"^^xsd:float .

14

Page 15: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in Turtle - Convience Syntax

Abbreviating repeated subjects:dbpedia:Propranolol rdfs:label "Propranololo"@it . dbpedia:Propranolol dbpprop:molecularWeight "259.34"^^xsd:float .

... is the same as ...dbpedia:Propranolol rdfs:label "Propranololo"@it ; dbpprop:molecularWeight "259.34"^^xsd:float .

Abbreviating repeated subject/predicate pairs:dbpedia:Propranolol rdfs:label "Propranolol"@en . dbpedia:Propranolol rdfs:label "Propranololo"@it . dbpedia:Propranolol rdfs:label " プロプラノロール "@cn .

... is the same as ...dbpedia:Propranolol rdfs:label "Propranolol"@en , "Propranololo"@it , "プロプラノロール "@cn .

15

Page 16: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML - basics

W3C standardized an RDF/XML syntax [1]

The basic idea is to insert an XML element for each node (sobject and value) and arch (predicate)

Es.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#” xmlns:skos="http://www.w3.org/2004/02/skos/core#" <rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol">

<skos:subject>  <rdf:Description

rdf:about="http://dbpedia.org/resource/Category:Anxiolytics"/> </skos:subject> </rdf:Description></rdf:RDF>

[1] RDF/XML Syntax Specification available at http://www.w3.org/TR/rdf-syntax-grammar/

dbpedia:Propranolol category:Anxiolytics

skos:subject

propertyelement

Root tag

16

Page 17: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML - abbreviation

A possible abbreviation is using rdf:resource in the property elements

Es.<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#” xmlns:skos="http://www.w3.org/2004/02/skos/core#" <rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol">

<skos:subject rdf:resource="http://dbpedia.org/resource/Category:Anxiolytics"/> </rdf:Description></rdf:RDF>

Page 18: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML - abbreviation

Encoding Literals• datatype

<rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol"><dbpprop:molecularWeight

rdf:datatype="http://www.w3.org/2001/XMLSchema#float" >259.34</dbpprop:molecularWeight></rdf:Description>

• Language tag

<rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol"><rdfs:label xml:lang="it">Propranololo</rdfs:label>

</rdf:Description>

18

dbpedia:Propranolol

dbpprop:molecularWeight

dbpedia:Propranolol

rdfs:label

“Propranololo”@it

"259.34"^^xsd:float

Page 19: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML – rdf:type abbreviation

It’s possible to abbreviate rdf:type• Es.

• Long form <rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol">

<rdf:type rdf:resource="http://dbpedia.org/ontology/Drug"/></rdf:Description>

• Abbreviated form<dbpedia-owl:Drug rdf:about="http://dbpedia.org/resource/Propranolol" />

dbpedia:Propranolol

http://dbpedia.org/ontology/Drug

rdf:type

19

Page 20: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML – more statements 1/3

E.g.

can be• listed

<rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol"><rdf:type rdf:resource="http://dbpedia.org/ontology/Drug"/>

</rdf:Description> <rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol">

<skos:subject rdf:resource="http://dbpedia.org/resource/Category:Anxiolytics"/> </rdf:Description>• nested

<rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol"> <rdf:type rdf:resource="http://dbpedia.org/ontology/Drug"/> <skos:subject rdf:resource="http://dbpedia.org/resource/Category:Anxiolytics"/></rdf:Description>

• shortened to a minimum<dbpedia-owl:Drug rdf:about="http://dbpedia.org/resource/Propranolol" > <skos:subject rdf:resource="http://dbpedia.org/resource/Category:Anxiolytics"/></dbpedia-owl:Drug>

20

dbpedia:Propranolol category:Anxiolytics

skos:subject

http://dbpedia.org/ontology/Drug

Page 21: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML – more statements 2/3 E.g.

can be• listed

<rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol"> <skos:subject rdf:resource="http://dbpedia.org/resource/Category:Anxiolytics"/></rdf:Description><rdf:Description rdf:about="http://dbpedia.org/resource/Category:Anxiolytics"> <skos:broader rdf:resource="http://dbpedia.org/resource/Category:Psycholeptics"/>

</rdf:Description>

• nested<rdf:Description rdf:about="http://dbpedia.org/resource/Propranolol"> <skos:subject> <rdf:Description

rdf:about="http://dbpedia.org/resource/Category:Anxiolytics"> <skos:broader rdf:resource="http://dbpedia.org/resource/Category:Psycholeptics"/> </rdf:Description> </skos:subject></rdf:Description>

21

dbpedia:Propranolol category:Anxiolytics

skos:subject

Category:Psycholeptics

skos:broader

Page 22: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Serializing RDF in XML – more statements 3/3

E.g.

Applying all abbreviations becomes<dbpedia-owl:Drug rdf:about="http://dbpedia.org/resource/Propranolol" > <rdfs:label xml:lang="it">Propranololo</rdfs:label> <skos:subject> <skos:concept rdf:about="http://dbpedia.org/resource/Category:Anxiolytics"> <skos:broader> <skos:concept rdf:about="http://dbpedia.org/resource/Category:Psycholeptics"/> </skos:broader> </skos:concept> </skos:subject></dbpedia-owl:Drug>

22

dbpedia:Propranolol category:Anxiolytics

skos:subject

Category:Psycholeptics

skos:broader

http://dbpedia.org/ontology/Drugskos:concept

“Propranololo”@itrdfs:label

Page 23: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

Merging XML files (representing an RDF model)

Suppose you have to merge the two following XML

Merging the XML trees is difficult, but being RDF …

<Drug rdf:about="Propranolol">

<skos:subject>

<skos:concept rdf:about="Anxiolytics"/>

</skos:subject >

<excretion>

<AnatomicalStructure rdf:about="Kidney"/>

</excretion>

</Drug>

<skos:concept rdf:about="Anxiolytics"

skos:isSubjectOf="Propranolol">

<canDamage>

< AnatomicalStructure

rdf:about="Kidney"/>

</canDamage>

</skos:concept>

Propranolol

Anxiolytics

Drug

rdf: type

rdf: type

skos:subject

Concept

Kidney

rdf: type

AnatomicalStructure

excretion

Propranolol

Anxiolytics

rdf:type

Concept

Kidney

rdf:type

AnatomicalStructure

canDamage

skos:isSubjectOf

23

Page 24: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Propranolol

Anxiolytics

rdf: type

Concept

Kidney

rdf: type

AnatomicalStructure

canDamage

skos:isSubjectOf

RDF in a nutshell

Merging XML files 2/2

It’s (just) a matter to merge the two RDF graphs

NOTE: It works out nicely because both RDF/XML documents refer to the same resources and use the same vocabularies.

U

Propranolol

Anxiolytics

Drug

rdf: type

rdf: type

skos:subject

Concept

Kidney

AnatomicalStructure

canDamage

excretion

skos:isSubjectOf

rdf: type

24

Propranolol

Anxiolytics

Drug

rdf : type

rdf : type

skos:subject

Concept

Kidney

rdf : type

AnatomicalStructure

excretion

Page 25: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Principles

1. Things must be identified with dereferenceable HTTP URIs.

2. If such a URI is dereferenced asking for the MIME-type application/rdf+xml, a data source must return an RDF/XML description of the identified resource.

– Information Resources

– Non-Information Resources

3. Besides RDF links to resources within the same data source, RDF descriptions should also contain RDF links to resources provided by other data sources

25

Page 26: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

More on Content Negotiation

Approach

E.g. • Resource identifier

– http://dbpedia.org/resource/Propranolol • RDF representation describing Propranolol

– http://dbpedia.org/data/Propranolol.rdf • HTML representation describing Propranolol

– http://dbpedia.org/page/Propranolol

Test it! http://idi.fundacionctic.org/vapour

26

Page 27: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipes for Serving Information as Linked Data

Recipes [1]

1. Serving static RDF files

2. Serving information stored in a DB as (virtual) RDF

3. Using a RDF storage with a Linked Data Interface

4. Wrap API to serve (virtual) RDF

[1] http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

27

Page 28: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipe 1 - Serving static RDF files 1/3

IDEA: negotiate content using Apache mod_rewrite by writing a .htaccess file

Directory layout• apachedocumentroot/

– examples/- .htaccess- content/

- file.html- file.rdf

Expected behavior• Requesting http://localhost/examples/file• a HTML browser is redirected to

http://localhost/examples/content/file.html• a Linked Data application is redirected to

http://localhost/examples/content/file.rdf

28

Page 29: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipe 1 - Serving static RDF files 2/3# Turn off MultiViews

Options –MultiViews

# Rewrite engine setup

RewriteEngine On

RewriteBase /examples

# Rewrite rule to serve HTML content from the vocabulary URI if requested

RewriteCond %{HTTP_ACCEPT} !application/rdf\+xml.*(text/html|application/xhtml\+xml)

RewriteCond %{HTTP_ACCEPT} text/html [OR]

RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml [OR]

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/.*

RewriteRule ^example/$ content/file.html [R=303]

# Rewrite rule to serve RDF/XML content if requested

RewriteCond %{HTTP_ACCEPT} application/rdf\+xml

RewriteRule ^example/ content/file.rdf [R=303]

# Rewrite rule to serve RDF/XML content by default

RewriteRule ^example/ content/file.rdf [R=303]

[source: http://www.w3.org/TR/swbp-vocab-pub/#recipe4example ]

29

Page 30: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipe 1 - Serving static RDF files 3/3

Testing the configuration

30

http://localhost/examples/file

http://localhost/examples/content/file.html

http://localhost/examples/content/file.html

http://localhost/examples/file

http://localhost/examples/content/file.rdf

http://localhost/examples/content/file.rdf

Page 31: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipe 2 - DB as (virtual) RDF

General Idea

Ready to use solutions• D2R Server

– http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/index.html#quickstart

• OpenLink Virtuoso – http://virtuoso.openlinksw.com/Whitepapers/pdf/Virtuoso_SQL_to_RDF_Mapping.pdf – http://virtuoso.openlinksw.com/Whitepapers/pdf/Deploying_Linked_Data_Final1.pdf

• Triplify– http://triplify.org/Overview

31

Page 32: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipe 3 - RDF storage + Linked Data Interface

General Idea

Ready to use solutions• Pubby: a Linked Data interfaces to SPARQL endpoints.

– http://www4.wiwiss.fu-berlin.de/pubby/

32

Page 33: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Recipe 4 - Wrap API to serve (virtual) RDF

General Idea

No “general” solutions, only ad hoc ones• Virtuoso Sponger: a framework for developing Linked Data

wrappers (called cartridges) around different types of data sources– http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger– http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/

VirtSpongerCartridgeSupportedDataSources • SIOC Exporters

– http://sioc-project.org/exporters

33

[source http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/Bizer-ESWC2007-RDFbookmashup.pdf ]

Page 34: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

Publishing RDF as Linked Data

Which recipe should I use?

How much data do you want to serve? • several hundred RDF triples: recipe 1• Larger dataset: recipe 2, 3 and 4

How is your data currently stored? • In a relational database: recipe 2• Available through an API: recipe 4• As RDF dump: recipe 3

How often does your data change? • Frequently: recipe 2 and 4• Rarely: recipe 1 and 3

34

Page 35: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

RDF in a nutshell

RDF Resources

RDF at the W3C - primer and specifications• http://www.w3.org/RDF/

Semantic Web tools - community maintained list; includes triple store, programming environments, tool sets, and more• http://esw.w3.org/topic/SemanticWebTools

302 Semantic Web Videos and Podcasts - includes a section specifically on RDF videos• http://www.semanticfocus.com/blog/entry/title/302-semantic-

web-videos-and-podcasts/

35

Page 36: Data in RDF

Emanuele Della Valle - http://applied-semantic-web.org

credits

Some slides are derived from• “How to Publish Linked Data on the Web” by Chris Bizer,

Richard Cyganiak and Tom Heath http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/

36