semantic web - sparql - uni-koblenz · semantic web sparql gerd gr oner, matthias thimm...

54
Semantic Web SPARQL Gerd Gr¨ oner , Matthias Thimm {groener,thimm}@uni-koblenz.de Institute for Web Science and Technologies (WeST) University of Koblenz-Landau July 16, 2013 Gerd Gr¨ oner , Matthias Thimm Semantic Web 1 / 47

Upload: tranthuy

Post on 04-Aug-2019

215 views

Category:

Documents


0 download

TRANSCRIPT

Semantic WebSPARQL

Gerd Groner, Matthias Thimm

{groener,thimm}@uni-koblenz.de

Institute for Web Science and Technologies (WeST)University of Koblenz-Landau

July 16, 2013

Gerd Groner, Matthias Thimm Semantic Web 1 / 47

Agenda for this Week

I SPARQL (2 lectures)

I Ontology Engineering

I Pattern-based Ontology Design

I Semantic Search

I Semantic Web Applications, Closing

Gerd Groner, Matthias Thimm Semantic Web 2 / 47

Outline

1 SPARQL Protocol and RDF Query Language

2 SPARQL Protocol

3 Networked Graphs

4 SPARQL 1.1

Gerd Groner, Matthias Thimm Semantic Web 3 / 47

Outline

1 SPARQL Protocol and RDF Query Language

2 SPARQL Protocol

3 Networked Graphs

4 SPARQL 1.1

Gerd Groner, Matthias Thimm Semantic Web 4 / 47

Outline

1 SPARQL Protocol and RDF Query Language

2 SPARQL Protocol

3 Networked Graphs

4 SPARQL 1.1

Gerd Groner, Matthias Thimm Semantic Web 5 / 47

SPARQL Protocol

I The SPARQL protocol specifies how queries and query resultsare exchanged / sent between providing and requestingservice.

I Queries and results are sent over the network.

Protocol is specified by two parts

1. SPARQL interface (abstract interface, independent ofconcrete implementation)

2. bindings of this interfaceI for the query (request)

I HTTP bindingI SOAP binding

I for the resultI XML result binding

I WSDL 2.0 (Web service description langauge) to describe theSPARQL protocol

Gerd Groner, Matthias Thimm Semantic Web 6 / 47

SparqlQuery Interface

I SparqlQuery is the only interface of the SPARQL protocol.

I It contails one operation: ‘query’

→ the ‘query’ operation described by an in-out messageexchange pattern

I this in-out message exchange pattern consists of twomessages:

I ‘in’-messageI ‘out’-message

Gerd Groner, Matthias Thimm Semantic Web 7 / 47

Interface and Operation — the query ‘in’ message

I Content of an ‘in’ message (of a SPARQL query operation) isan instance of an XML Schema complex type

Gerd Groner, Matthias Thimm Semantic Web 8 / 47

Interface and Operation — the query ‘out’ message

I contents of the ‘out’ message of the query operation is aninstance of an XML Schema complex type

I the ‘out’ message is composed of either:I a SPARQL result document (for SELECT and ASK queries), orI a serialized RDF graph (e.g., in the RDF/XML syntax) (for

CONSTRUCT and DESCRIBE queries)

I Additionally, a specification of return message in case ofI incorrect (malformed) queries, orI refused request

Gerd Groner, Matthias Thimm Semantic Web 9 / 47

SPARQL Query Binding: HTTP Binding

Query:

PREFIX dc : <h t t p : / / p u r l . o rg / dc / e l e m e n t s /1.1/>SELECT ? book ?whoWHERE { ? book dc : c r e a t o r ?who }

conveyed to a SPARQL query service http: // www. example. com/ sparql/ , which isillustrated by the following HTTP trace:

GET / s p a r q l /? q u e r y=EncodedQuery HTTP/ 1 . 1Host : www. example . comUser−agent : my−s p a r q l−c l i e n t / 0 . 1

Gerd Groner, Matthias Thimm Semantic Web 10 / 47

SPARQL Query Binding: SOAP Binding

SOAP binding:

<?xml v e r s i o n =”1.0” e n c o d i n g=”UTF−8”?><soapenv : E n v e l o p e

xmlns : soapenv=”h t t p : / /www. w3 . org /2003/05/ soap−e n v e l o p e /”xmlns : xsd=”h t t p : / /www. w3 . org /2001/XMLSchema”xmlns : x s i =”h t t p : / /www. w3 . org /2001/XMLSchema−i n s t a n c e ”>

<soapenv : Body><query−r e q u e s t xmlns=”h t t p : / /www. w3 . org /2005/09/

s p a r q l−p r o t o c o l−t y p e s/#”><query>SELECT ? x ? y WHERE {? x i s R e l a t e d T o ? y}</query>

</query−r e q u e s t></soapenv : Body>

</soapenv : Enve lope>

Gerd Groner, Matthias Thimm Semantic Web 11 / 47

Result XML Binding

Remember: previously introduced example query:

SELECT ? book ?whoWHERE { ? book dc : c r e a t o r ?who }Result:

HTTP/ 1 . 1 200 OKS e r v e r : Apache / 1 . 3 . 2 9 ( Unix ) PHP/ 4 . 3 . 4 DAV/ 1 . 0 . 3C o n n e c t i o n : c l o s eContent−Type : a p p l i c a t i o n / s p a r q l−r e s u l t s+xml

<?xml v e r s i o n =”1.0”?><s p a r q l xmlns=”h t t p : / /www. w3 . org /2005/ s p a r q l− r e s u l t s #”>

<head><v a r i a b l e name=”book”/><v a r i a b l e name=”who”/>

</head>< r e s u l t s d i s t i n c t =” f a l s e ” o r d e r e d=” f a l s e ”>

< r e s u l t ><b i n d i n g name=”book”><u r i >h t t p : / /www. example / book5</u r i ></b i n d i n g><b i n d i n g name=”who”><bnode>r29392923r2922 </bnode></b i n d i n g>

</ r e s u l t >. . .

</ s p a r q l >Gerd Groner, Matthias Thimm Semantic Web 12 / 47

SPARQL Result XML Binding

<?xml v e r s i o n =”1.0”?><s p a r q l xmlns=”h t t p : / /www. w3 . org /2005/ s p a r q l− r e s u l t s #”>

<head><v a r i a b l e name=”name”/><v a r i a b l e name=”mbox”/>

</head>

< r e s u l t s >< r e s u l t >

<b i n d i n g name=”name”> . . . </b i n d i n g><b i n d i n g name=”mbox”> . . . </b i n d i n g>

</ r e s u l t >

< r e s u l t ><b i n d i n g name=”name”> . . . </b i n d i n g><b i n d i n g name=”mbox”> . . . </b i n d i n g>

</ r e s u l t >. . .

</ r e s u l t s ></ s p a r q l >

Gerd Groner, Matthias Thimm Semantic Web 13 / 47

Binding Types inside Results

< r e s u l t ><b i n d i n g name=”x”>

<bnode>r2</bnode></b i n d i n g>

<b i n d i n g name=”hpage”><u r i >h t t p : / / work . example . org /bob/</ u r i >

</b i n d i n g>

<b i n d i n g name=”name”>< l i t e r a l xml : l a n g=”en”>Bob</ l i t e r a l >

</b i n d i n g>

<b i n d i n g name=”age”>< l i t e r a l

d a t a t y p e=”h t t p : / /www. w3 . org /2001/XMLSchema#i n t e g e r ”>30</ l i t e r a l >

</b i n d i n g>

<b i n d i n g name=”mbox”><u r i >m a i l t o : bob@work . example . org</u r i >

</b i n d i n g></ r e s u l t >

Gerd Groner, Matthias Thimm Semantic Web 14 / 47

Outline

1 SPARQL Protocol and RDF Query Language

2 SPARQL Protocol

3 Networked Graphs

4 SPARQL 1.1

Gerd Groner, Matthias Thimm Semantic Web 15 / 47

Networked Graphs

Simon Schenk and Steffen Staab“Networked Graphs: A Declarative Mechanism for SPARQL Rules,SPARQL Views and RDF Data Integration on the Web”,Proceedings of the 17th International World Wide WebConference, WWW2008, Bejing, China, 2008.

Gerd Groner, Matthias Thimm Semantic Web 16 / 47

Networked Graphs

I Basic Idea: define RDF graphsI extensionally by listing statements (RDF triples) orI intensionally using views

I possible to have view within another view (recursion)

I Integrate with existing Semantic Web infrastructure

I Easy exchange of (networked) graphs

I Use existing data sources (RDF graphs)

Gerd Groner, Matthias Thimm Semantic Web 17 / 47

Background: The Vision of the Semantic Web

I A Web of Machineunderstandable data.

I Allowing for Reuse of data.

I A Distributed Infrastructure

I Managed by autonomousagents

I Fundamental technologies:I RDF, SPARQLI HTTP GET

Gerd Groner, Matthias Thimm Semantic Web 18 / 47

Background: The Vision of the Semantic Web

I A Web of Machineunderstandable data.

I Allowing for Reuse of data.

I A Distributed Infrastructure

I Managed by autonomousagents

I Fundamental technologies:I RDF, SPARQLI HTTP GET

Gerd Groner, Matthias Thimm Semantic Web 18 / 47

Motivation for Networked Graphs: Mashups

Gerd Groner, Matthias Thimm Semantic Web 19 / 47

Mashups and Semantic Web

Mashups most popular model:

I Hack-and-Hope

Semantic Web most popularmodel:

I Crawl-Integrate-and-Reason

I Disadvantages:I Screen-Scraping (reading,

extracting text fromdocuments (Web pages))

I No agreement on a datamodel

I It may be differentsometimes:

I Google Web serviceI Amazon Web service

I Disadvantages:I Data is outdatedI Data is not declarative,

but only has implicitsemantics

I Scalability problemsI Access rights: not all is

allowed to be copied /published

I Provenance – datasources can be blurred

Gerd Groner, Matthias Thimm Semantic Web 20 / 47

Mashups and Semantic Web

Mashups most popular model:

I Hack-and-Hope

Semantic Web most popularmodel:

I Crawl-Integrate-and-Reason

I Disadvantages:I Screen-Scraping (reading,

extracting text fromdocuments (Web pages))

I No agreement on a datamodel

I It may be differentsometimes:

I Google Web serviceI Amazon Web service

I Disadvantages:I Data is outdatedI Data is not declarative,

but only has implicitsemantics

I Scalability problemsI Access rights: not all is

allowed to be copied /published

I Provenance – datasources can be blurred

Gerd Groner, Matthias Thimm Semantic Web 20 / 47

Mashups and Semantic Web

Mashups most popular model:

I Hack-and-Hope

Semantic Web most popularmodel:

I Crawl-Integrate-and-Reason

I Disadvantages:I Screen-Scraping (reading,

extracting text fromdocuments (Web pages))

I No agreement on a datamodel

I It may be differentsometimes:

I Google Web serviceI Amazon Web service

I Disadvantages:I Data is outdatedI Data is not declarative,

but only has implicitsemantics

I Scalability problemsI Access rights: not all is

allowed to be copied /published

I Provenance – datasources can be blurred

Gerd Groner, Matthias Thimm Semantic Web 20 / 47

Networked Graphs — A Concrete Scenario

Gerd Groner, Matthias Thimm Semantic Web 21 / 47

Objectives in this Scenario

I A Web of Machineunderstandable data.

I Allowing for Reuse of data.

I A Distributed Infrastructure

I Managed by autonomousagents

I Hardly any assumptions:I RDF, SPARQLI HTTP GET

⇒ Objectives:

I Distributed Views

I Re-use existing languages/protocols (RDF extension)

I Default-negation

I Easy to learn, easy to exchange

Gerd Groner, Matthias Thimm Semantic Web 22 / 47

Objectives in this Scenario

I A Web of Machineunderstandable data.

I Allowing for Reuse of data.

I A Distributed Infrastructure

I Managed by autonomousagents

I Hardly any assumptions:I RDF, SPARQLI HTTP GET

⇒ Objectives:

I Distributed Views

I Re-use existing languages/protocols (RDF extension)

I Default-negation

I Easy to learn, easy to exchange

Gerd Groner, Matthias Thimm Semantic Web 22 / 47

Remember: Named Graphs

Used to group a set of triples.Gerd Groner, Matthias Thimm Semantic Web 23 / 47

Remember: SPARQL CONSTRUCT Queries

Build an RDF graph from existing triples (here from namedgraphs):

CONSTRUCT {?member f o a f : c u r r e n t P r o j e c t : SemWebProject .?member f o a f : phone ? phone }

FROM NAMED : bobFOAFFROM NAMED : chrisFOAFFROM NAMED . . .WHERE {

GRAPH ? f o a f F i l e {? f o a f F i l e f o a f : p r i m a r y T o p i c ?member

OPTIONAL {?member f o a f : phone ? phone

FILTER (REGEX(? phone , ‘ ‘ ˆ\+ 4 9 ’ ’ ) )}

}}

Query: “Find the persons described in the FOAF files, their phone numbers, ifavailable. German phone numbers only.”

Gerd Groner, Matthias Thimm Semantic Web 24 / 47

Now: Networked Graphs

Basic Idea: Define RDF graphsI extensionally by listing statements orI intensionally using views based on SPARQL

Gerd Groner, Matthias Thimm Semantic Web 25 / 47

Independence Set

I S (BobFOAF) = { BobFOAFM i k e s P r o j e c t ,DBLP,

ChrisFOAF}

Semantics of the independence set:

Iteratively evaluate all views until a fixpoint is reached. Possibleproblems:

I Termination of recursion

I Non-monotonic negation

Gerd Groner, Matthias Thimm Semantic Web 26 / 47

Well-founded Semantics

I Models of the Well Founded SemanticsI closed-world (Here: rules only, graphs may be open worlds.)I pessimistic (negate as much as possible)I Interpretation of each fact is three-valued (true, false,

unknown)

I Usage with RDF In RDF only true statements; false, unknowntreated equally. In Networked Graphs false and unknown areused for interim results.

Gerd Groner, Matthias Thimm Semantic Web 27 / 47

RDF Mashups

I with declarative, dynamic semantics, Mashups can be verypowerful!

I re-use and combine data from several data sources on the Web

Gerd Groner, Matthias Thimm Semantic Web 28 / 47

Networked Graphs — Architecture

I Based on Sesame 2 (http://www.openrdf.org/)

I Open source (http://west.uni-koblenz.de/Research/systems/NetworkedGraphs)

Gerd Groner, Matthias Thimm Semantic Web 29 / 47

Networked Graphs — Conclusion

I Networked Graphs allows for easyI formulation,I exchange andI distributed evaluation

of rules and views for the Semantic Web using defaultnegation

Networked Graphs integrate well with existing mechanisms and canbe evaluated efficiently.

Networked Graphs are eases the re-use of existing data in variousapplications.

Gerd Groner, Matthias Thimm Semantic Web 30 / 47

Outline

1 SPARQL Protocol and RDF Query Language

2 SPARQL Protocol

3 Networked Graphs

4 SPARQL 1.1

Gerd Groner, Matthias Thimm Semantic Web 31 / 47

SPARQL 1.1

... some extensions in SPARQL 1.1:

I Projected expressions

I Aggregationen

I Subqueries

I Negation

I Property path

I Service descriptions

I Update language

I Update protocol

I HTTP RDF update (RESTful)

I Basic federated query

Gerd Groner, Matthias Thimm Semantic Web 32 / 47

Projected expressions

I SELECT queries are no longer restricted to variables:

SELECT (? p r i c e ∗ ? amount AS ? t o t a l P r i c e )WHERE { . . . }

Gerd Groner, Matthias Thimm Semantic Web 33 / 47

Aggregation

I Like in SQL: COUNT, SUM, MIN, MAX, GROUP BY

SELECT (MIN(? p r i c e ) AS ? m i n P r i c e ) . . .WHERE { . . . }GROUP BY ? a r t i c l e

Gerd Groner, Matthias Thimm Semantic Web 34 / 47

Subqueries

I nested queriesI multiple queries can be cominged in one query

SELECT ? a r t i c l e ? a u t h o rWHERE

? a r t i c l e ex : a u t h o r ? a u t h o r .{ SELECT ? a r t i c l e

WHERE { . . . ? a r t i c l e . . . }ORDER BY . . .LIMIT . . .

}

Result of a query is nested in another query.

Gerd Groner, Matthias Thimm Semantic Web 35 / 47

Negation

I The trick with OPTIONAL and BOUND is no longer needed.

SELECT . . .WHERE { ? p e r s o n a f o a f : Person .

MINUS { ? p e r s o n f o a f : mbox ? e m a i l }}

Both graph patterns are evaluated and then the difference isreturned.

Gerd Groner, Matthias Thimm Semantic Web 36 / 47

Negation (2)

I Alternative construct NOT EXISTS

SELECT . . .WHERE { ? p e r s o n a f o a f : Person .{ FILTER NOT EXISTS {

? p e r s o n f o a f : mbox ? e m a i l}

}

Both graph patterns are evaluated and then the difference isreturned.

Gerd Groner, Matthias Thimm Semantic Web 37 / 47

Property path

I Excerpt of constructors (let p be an IRI for a predicate, e.g.,foaf:knows):

I ^p inverse pathI p* multiple times (including 0)I p+ multiple times at least 1I p? zero or one timesI p1/p2 sequenceI p1|p2 alternative

Gerd Groner, Matthias Thimm Semantic Web 38 / 47

Property path (2)

I “regular expression”

SELECT . . .WHERE {

? p e r s o n f o a f : knows+ ? network .}

Gerd Groner, Matthias Thimm Semantic Web 39 / 47

Property path (3)

SELECT . . .WHERE {

? book dc : t i t l e | r d f s : l a b e l ? d i s p l a y S t r i n g .}

I “regular expression”

I e.g., alternative

Gerd Groner, Matthias Thimm Semantic Web 40 / 47

Property path (3)

SELECT . . .WHERE {

? book dc : t i t l e | r d f s : l a b e l ? d i s p l a y S t r i n g .}

I “regular expression”

I e.g., alternative

Gerd Groner, Matthias Thimm Semantic Web 40 / 47

Property path (3)

I “regular expression”I e.g., Sequence:

I “Find the names of people 2 foaf:knows links away.”

SELECT . . .WHERE {

? x f o a f : mbox <m a i l t o : a l i c e @ e x a m p l e> .? x f o a f : knows / f o a f : knows / f o a f : name ?name .

}

→ without property path?

Gerd Groner, Matthias Thimm Semantic Web 41 / 47

Property path (3)

I “regular expression”I e.g., Sequence:

I “Find the names of people 2 foaf:knows links away.”

SELECT . . .WHERE {

? x f o a f : mbox <m a i l t o : a l i c e @ e x a m p l e> .? x f o a f : knows / f o a f : knows / f o a f : name ?name .

}

→ without property path?

Gerd Groner, Matthias Thimm Semantic Web 41 / 47

Property path (4)

→ equivalent query (without property path):

SELECT . . .WHERE {

? x f o a f : mbox <m a i l t o : a l i c e @ e x a m p l e> .? x f o a f : knows ? a1 .? a1 f o a f : knows ? a2 .? a2 f o a f : name ?name .

}

Gerd Groner, Matthias Thimm Semantic Web 42 / 47

HAVING (over GROUPed sets)

HAVING operates over grouped solution sets, in the same way thatFILTER operates over un-grouped sets

PREFIX : <h t t p : / / data . example/>

SELECT (AVG(? s i z e ) AS ? a s i z e )WHERE {

? x : s i z e ? s i z e}GROUP BY ? xHAVING(AVG(? s i z e ) > 10)

→ This will return average sizes, grouped by the subject, but onlywhere the mean size is greater than 10.

Gerd Groner, Matthias Thimm Semantic Web 43 / 47

HAVING (over GROUPed sets)

HAVING operates over grouped solution sets, in the same way thatFILTER operates over un-grouped sets

PREFIX : <h t t p : / / data . example/>

SELECT (AVG(? s i z e ) AS ? a s i z e )WHERE {

? x : s i z e ? s i z e}GROUP BY ? xHAVING(AVG(? s i z e ) > 10)

→ This will return average sizes, grouped by the subject, but onlywhere the mean size is greater than 10.

Gerd Groner, Matthias Thimm Semantic Web 43 / 47

Summary

I SPARQLI Standard query language for RDF

I SELECT, CONSTRUCT, ASK, DESCRIBEI Extensive filters, optional and alternative patterns

I Protocol for queries and resultsI Based on triples model (subject-predicate-object)

I No logic inference in language, only in underlying knowledgebase

I Named graphsI Separate or specialized knowledge

Gerd Groner, Matthias Thimm Semantic Web 44 / 47

Learning goals

I Understanding basic and advanced queries using SPARQL.

I Gain knowledge about how to query reified data.

Gerd Groner, Matthias Thimm Semantic Web 45 / 47

SPARQL endpoints

Name URL What’s in there?DBPedia http://dbpedia.org/sparql extensive RDF data

from WikipediaSPARQLer tttp://sparql.org/sparql.html General-purpose

query endpoint forWeb-accessible data

DBLP http://www4.wiwiss.fu-berlin.de/dblp/snorql/ Bibliographic data from computerscience journals and conferences

LinkedMDB http://data.linkedmdb.org/sparql Films, actors, directors,writers, producers, etc.

World factbook http://www4.wiwiss.fu-berlin.de/factbook/snorql/ Country statistics fromthe CIA World Factbook

bio2rdf http://bio2rdf.org/sparql Bioinformatics data fromaround 40 public databases

Gerd Groner, Matthias Thimm Semantic Web 46 / 47

Exercise

I querying DBpedia endpointI Find 10 cities (http://dbpedia.org/ontology/City that

are located in (propertyhttp://dbpedia.org/ontology/country) Germany(http://dbpedia.org/resource/Germany)

I Find 10 “Populated Places” (URI:http://dbpedia.org/ontology/PopulatedPlace) withtheir names (rdfs:label) in which the names are described inGerman (i.e., language tag ”de”).

I Find 10 people (http://dbpedia.org/ontology/Person)with place of birth(http://dbpedia.org/ontology/birthPlace) Berlin(http://dbpedia.org/resource/Berlin)

Gerd Groner, Matthias Thimm Semantic Web 47 / 47