semantic web - sparql - uni-koblenz · semantic web sparql gerd gr oner, matthias thimm...
TRANSCRIPT
Semantic WebSPARQL
Gerd Groner, Matthias Thimm
{groener,thimm}@uni-koblenz.de
Institute for Web Science and Technologies (WeST)University of Koblenz-Landau
July 16, 2013
Gerd Groner, Matthias Thimm Semantic Web 1 / 47
Agenda for this Week
I SPARQL (2 lectures)
I Ontology Engineering
I Pattern-based Ontology Design
I Semantic Search
I Semantic Web Applications, Closing
Gerd Groner, Matthias Thimm Semantic Web 2 / 47
Outline
1 SPARQL Protocol and RDF Query Language
2 SPARQL Protocol
3 Networked Graphs
4 SPARQL 1.1
Gerd Groner, Matthias Thimm Semantic Web 3 / 47
Outline
1 SPARQL Protocol and RDF Query Language
2 SPARQL Protocol
3 Networked Graphs
4 SPARQL 1.1
Gerd Groner, Matthias Thimm Semantic Web 4 / 47
Outline
1 SPARQL Protocol and RDF Query Language
2 SPARQL Protocol
3 Networked Graphs
4 SPARQL 1.1
Gerd Groner, Matthias Thimm Semantic Web 5 / 47
SPARQL Protocol
I The SPARQL protocol specifies how queries and query resultsare exchanged / sent between providing and requestingservice.
I Queries and results are sent over the network.
Protocol is specified by two parts
1. SPARQL interface (abstract interface, independent ofconcrete implementation)
2. bindings of this interfaceI for the query (request)
I HTTP bindingI SOAP binding
I for the resultI XML result binding
I WSDL 2.0 (Web service description langauge) to describe theSPARQL protocol
Gerd Groner, Matthias Thimm Semantic Web 6 / 47
SparqlQuery Interface
I SparqlQuery is the only interface of the SPARQL protocol.
I It contails one operation: ‘query’
→ the ‘query’ operation described by an in-out messageexchange pattern
I this in-out message exchange pattern consists of twomessages:
I ‘in’-messageI ‘out’-message
Gerd Groner, Matthias Thimm Semantic Web 7 / 47
Interface and Operation — the query ‘in’ message
I Content of an ‘in’ message (of a SPARQL query operation) isan instance of an XML Schema complex type
Gerd Groner, Matthias Thimm Semantic Web 8 / 47
Interface and Operation — the query ‘out’ message
I contents of the ‘out’ message of the query operation is aninstance of an XML Schema complex type
I the ‘out’ message is composed of either:I a SPARQL result document (for SELECT and ASK queries), orI a serialized RDF graph (e.g., in the RDF/XML syntax) (for
CONSTRUCT and DESCRIBE queries)
I Additionally, a specification of return message in case ofI incorrect (malformed) queries, orI refused request
Gerd Groner, Matthias Thimm Semantic Web 9 / 47
SPARQL Query Binding: HTTP Binding
Query:
PREFIX dc : <h t t p : / / p u r l . o rg / dc / e l e m e n t s /1.1/>SELECT ? book ?whoWHERE { ? book dc : c r e a t o r ?who }
conveyed to a SPARQL query service http: // www. example. com/ sparql/ , which isillustrated by the following HTTP trace:
GET / s p a r q l /? q u e r y=EncodedQuery HTTP/ 1 . 1Host : www. example . comUser−agent : my−s p a r q l−c l i e n t / 0 . 1
Gerd Groner, Matthias Thimm Semantic Web 10 / 47
SPARQL Query Binding: SOAP Binding
SOAP binding:
<?xml v e r s i o n =”1.0” e n c o d i n g=”UTF−8”?><soapenv : E n v e l o p e
xmlns : soapenv=”h t t p : / /www. w3 . org /2003/05/ soap−e n v e l o p e /”xmlns : xsd=”h t t p : / /www. w3 . org /2001/XMLSchema”xmlns : x s i =”h t t p : / /www. w3 . org /2001/XMLSchema−i n s t a n c e ”>
<soapenv : Body><query−r e q u e s t xmlns=”h t t p : / /www. w3 . org /2005/09/
s p a r q l−p r o t o c o l−t y p e s/#”><query>SELECT ? x ? y WHERE {? x i s R e l a t e d T o ? y}</query>
</query−r e q u e s t></soapenv : Body>
</soapenv : Enve lope>
Gerd Groner, Matthias Thimm Semantic Web 11 / 47
Result XML Binding
Remember: previously introduced example query:
SELECT ? book ?whoWHERE { ? book dc : c r e a t o r ?who }Result:
HTTP/ 1 . 1 200 OKS e r v e r : Apache / 1 . 3 . 2 9 ( Unix ) PHP/ 4 . 3 . 4 DAV/ 1 . 0 . 3C o n n e c t i o n : c l o s eContent−Type : a p p l i c a t i o n / s p a r q l−r e s u l t s+xml
<?xml v e r s i o n =”1.0”?><s p a r q l xmlns=”h t t p : / /www. w3 . org /2005/ s p a r q l− r e s u l t s #”>
<head><v a r i a b l e name=”book”/><v a r i a b l e name=”who”/>
</head>< r e s u l t s d i s t i n c t =” f a l s e ” o r d e r e d=” f a l s e ”>
< r e s u l t ><b i n d i n g name=”book”><u r i >h t t p : / /www. example / book5</u r i ></b i n d i n g><b i n d i n g name=”who”><bnode>r29392923r2922 </bnode></b i n d i n g>
</ r e s u l t >. . .
</ s p a r q l >Gerd Groner, Matthias Thimm Semantic Web 12 / 47
SPARQL Result XML Binding
<?xml v e r s i o n =”1.0”?><s p a r q l xmlns=”h t t p : / /www. w3 . org /2005/ s p a r q l− r e s u l t s #”>
<head><v a r i a b l e name=”name”/><v a r i a b l e name=”mbox”/>
</head>
< r e s u l t s >< r e s u l t >
<b i n d i n g name=”name”> . . . </b i n d i n g><b i n d i n g name=”mbox”> . . . </b i n d i n g>
</ r e s u l t >
< r e s u l t ><b i n d i n g name=”name”> . . . </b i n d i n g><b i n d i n g name=”mbox”> . . . </b i n d i n g>
</ r e s u l t >. . .
</ r e s u l t s ></ s p a r q l >
Gerd Groner, Matthias Thimm Semantic Web 13 / 47
Binding Types inside Results
< r e s u l t ><b i n d i n g name=”x”>
<bnode>r2</bnode></b i n d i n g>
<b i n d i n g name=”hpage”><u r i >h t t p : / / work . example . org /bob/</ u r i >
</b i n d i n g>
<b i n d i n g name=”name”>< l i t e r a l xml : l a n g=”en”>Bob</ l i t e r a l >
</b i n d i n g>
<b i n d i n g name=”age”>< l i t e r a l
d a t a t y p e=”h t t p : / /www. w3 . org /2001/XMLSchema#i n t e g e r ”>30</ l i t e r a l >
</b i n d i n g>
<b i n d i n g name=”mbox”><u r i >m a i l t o : bob@work . example . org</u r i >
</b i n d i n g></ r e s u l t >
Gerd Groner, Matthias Thimm Semantic Web 14 / 47
Outline
1 SPARQL Protocol and RDF Query Language
2 SPARQL Protocol
3 Networked Graphs
4 SPARQL 1.1
Gerd Groner, Matthias Thimm Semantic Web 15 / 47
Networked Graphs
Simon Schenk and Steffen Staab“Networked Graphs: A Declarative Mechanism for SPARQL Rules,SPARQL Views and RDF Data Integration on the Web”,Proceedings of the 17th International World Wide WebConference, WWW2008, Bejing, China, 2008.
Gerd Groner, Matthias Thimm Semantic Web 16 / 47
Networked Graphs
I Basic Idea: define RDF graphsI extensionally by listing statements (RDF triples) orI intensionally using views
I possible to have view within another view (recursion)
I Integrate with existing Semantic Web infrastructure
I Easy exchange of (networked) graphs
I Use existing data sources (RDF graphs)
Gerd Groner, Matthias Thimm Semantic Web 17 / 47
Background: The Vision of the Semantic Web
I A Web of Machineunderstandable data.
I Allowing for Reuse of data.
I A Distributed Infrastructure
I Managed by autonomousagents
I Fundamental technologies:I RDF, SPARQLI HTTP GET
Gerd Groner, Matthias Thimm Semantic Web 18 / 47
Background: The Vision of the Semantic Web
I A Web of Machineunderstandable data.
I Allowing for Reuse of data.
I A Distributed Infrastructure
I Managed by autonomousagents
I Fundamental technologies:I RDF, SPARQLI HTTP GET
Gerd Groner, Matthias Thimm Semantic Web 18 / 47
Mashups and Semantic Web
Mashups most popular model:
I Hack-and-Hope
Semantic Web most popularmodel:
I Crawl-Integrate-and-Reason
I Disadvantages:I Screen-Scraping (reading,
extracting text fromdocuments (Web pages))
I No agreement on a datamodel
I It may be differentsometimes:
I Google Web serviceI Amazon Web service
I Disadvantages:I Data is outdatedI Data is not declarative,
but only has implicitsemantics
I Scalability problemsI Access rights: not all is
allowed to be copied /published
I Provenance – datasources can be blurred
Gerd Groner, Matthias Thimm Semantic Web 20 / 47
Mashups and Semantic Web
Mashups most popular model:
I Hack-and-Hope
Semantic Web most popularmodel:
I Crawl-Integrate-and-Reason
I Disadvantages:I Screen-Scraping (reading,
extracting text fromdocuments (Web pages))
I No agreement on a datamodel
I It may be differentsometimes:
I Google Web serviceI Amazon Web service
I Disadvantages:I Data is outdatedI Data is not declarative,
but only has implicitsemantics
I Scalability problemsI Access rights: not all is
allowed to be copied /published
I Provenance – datasources can be blurred
Gerd Groner, Matthias Thimm Semantic Web 20 / 47
Mashups and Semantic Web
Mashups most popular model:
I Hack-and-Hope
Semantic Web most popularmodel:
I Crawl-Integrate-and-Reason
I Disadvantages:I Screen-Scraping (reading,
extracting text fromdocuments (Web pages))
I No agreement on a datamodel
I It may be differentsometimes:
I Google Web serviceI Amazon Web service
I Disadvantages:I Data is outdatedI Data is not declarative,
but only has implicitsemantics
I Scalability problemsI Access rights: not all is
allowed to be copied /published
I Provenance – datasources can be blurred
Gerd Groner, Matthias Thimm Semantic Web 20 / 47
Objectives in this Scenario
I A Web of Machineunderstandable data.
I Allowing for Reuse of data.
I A Distributed Infrastructure
I Managed by autonomousagents
I Hardly any assumptions:I RDF, SPARQLI HTTP GET
⇒ Objectives:
I Distributed Views
I Re-use existing languages/protocols (RDF extension)
I Default-negation
I Easy to learn, easy to exchange
Gerd Groner, Matthias Thimm Semantic Web 22 / 47
Objectives in this Scenario
I A Web of Machineunderstandable data.
I Allowing for Reuse of data.
I A Distributed Infrastructure
I Managed by autonomousagents
I Hardly any assumptions:I RDF, SPARQLI HTTP GET
⇒ Objectives:
I Distributed Views
I Re-use existing languages/protocols (RDF extension)
I Default-negation
I Easy to learn, easy to exchange
Gerd Groner, Matthias Thimm Semantic Web 22 / 47
Remember: Named Graphs
Used to group a set of triples.Gerd Groner, Matthias Thimm Semantic Web 23 / 47
Remember: SPARQL CONSTRUCT Queries
Build an RDF graph from existing triples (here from namedgraphs):
CONSTRUCT {?member f o a f : c u r r e n t P r o j e c t : SemWebProject .?member f o a f : phone ? phone }
FROM NAMED : bobFOAFFROM NAMED : chrisFOAFFROM NAMED . . .WHERE {
GRAPH ? f o a f F i l e {? f o a f F i l e f o a f : p r i m a r y T o p i c ?member
OPTIONAL {?member f o a f : phone ? phone
FILTER (REGEX(? phone , ‘ ‘ ˆ\+ 4 9 ’ ’ ) )}
}}
Query: “Find the persons described in the FOAF files, their phone numbers, ifavailable. German phone numbers only.”
Gerd Groner, Matthias Thimm Semantic Web 24 / 47
Now: Networked Graphs
Basic Idea: Define RDF graphsI extensionally by listing statements orI intensionally using views based on SPARQL
Gerd Groner, Matthias Thimm Semantic Web 25 / 47
Independence Set
I S (BobFOAF) = { BobFOAFM i k e s P r o j e c t ,DBLP,
ChrisFOAF}
Semantics of the independence set:
Iteratively evaluate all views until a fixpoint is reached. Possibleproblems:
I Termination of recursion
I Non-monotonic negation
Gerd Groner, Matthias Thimm Semantic Web 26 / 47
Well-founded Semantics
I Models of the Well Founded SemanticsI closed-world (Here: rules only, graphs may be open worlds.)I pessimistic (negate as much as possible)I Interpretation of each fact is three-valued (true, false,
unknown)
I Usage with RDF In RDF only true statements; false, unknowntreated equally. In Networked Graphs false and unknown areused for interim results.
Gerd Groner, Matthias Thimm Semantic Web 27 / 47
RDF Mashups
I with declarative, dynamic semantics, Mashups can be verypowerful!
I re-use and combine data from several data sources on the Web
Gerd Groner, Matthias Thimm Semantic Web 28 / 47
Networked Graphs — Architecture
I Based on Sesame 2 (http://www.openrdf.org/)
I Open source (http://west.uni-koblenz.de/Research/systems/NetworkedGraphs)
Gerd Groner, Matthias Thimm Semantic Web 29 / 47
Networked Graphs — Conclusion
I Networked Graphs allows for easyI formulation,I exchange andI distributed evaluation
of rules and views for the Semantic Web using defaultnegation
Networked Graphs integrate well with existing mechanisms and canbe evaluated efficiently.
Networked Graphs are eases the re-use of existing data in variousapplications.
Gerd Groner, Matthias Thimm Semantic Web 30 / 47
Outline
1 SPARQL Protocol and RDF Query Language
2 SPARQL Protocol
3 Networked Graphs
4 SPARQL 1.1
Gerd Groner, Matthias Thimm Semantic Web 31 / 47
SPARQL 1.1
... some extensions in SPARQL 1.1:
I Projected expressions
I Aggregationen
I Subqueries
I Negation
I Property path
I Service descriptions
I Update language
I Update protocol
I HTTP RDF update (RESTful)
I Basic federated query
Gerd Groner, Matthias Thimm Semantic Web 32 / 47
Projected expressions
I SELECT queries are no longer restricted to variables:
SELECT (? p r i c e ∗ ? amount AS ? t o t a l P r i c e )WHERE { . . . }
Gerd Groner, Matthias Thimm Semantic Web 33 / 47
Aggregation
I Like in SQL: COUNT, SUM, MIN, MAX, GROUP BY
SELECT (MIN(? p r i c e ) AS ? m i n P r i c e ) . . .WHERE { . . . }GROUP BY ? a r t i c l e
Gerd Groner, Matthias Thimm Semantic Web 34 / 47
Subqueries
I nested queriesI multiple queries can be cominged in one query
SELECT ? a r t i c l e ? a u t h o rWHERE
? a r t i c l e ex : a u t h o r ? a u t h o r .{ SELECT ? a r t i c l e
WHERE { . . . ? a r t i c l e . . . }ORDER BY . . .LIMIT . . .
}
Result of a query is nested in another query.
Gerd Groner, Matthias Thimm Semantic Web 35 / 47
Negation
I The trick with OPTIONAL and BOUND is no longer needed.
SELECT . . .WHERE { ? p e r s o n a f o a f : Person .
MINUS { ? p e r s o n f o a f : mbox ? e m a i l }}
Both graph patterns are evaluated and then the difference isreturned.
Gerd Groner, Matthias Thimm Semantic Web 36 / 47
Negation (2)
I Alternative construct NOT EXISTS
SELECT . . .WHERE { ? p e r s o n a f o a f : Person .{ FILTER NOT EXISTS {
? p e r s o n f o a f : mbox ? e m a i l}
}
Both graph patterns are evaluated and then the difference isreturned.
Gerd Groner, Matthias Thimm Semantic Web 37 / 47
Property path
I Excerpt of constructors (let p be an IRI for a predicate, e.g.,foaf:knows):
I ^p inverse pathI p* multiple times (including 0)I p+ multiple times at least 1I p? zero or one timesI p1/p2 sequenceI p1|p2 alternative
Gerd Groner, Matthias Thimm Semantic Web 38 / 47
Property path (2)
I “regular expression”
SELECT . . .WHERE {
? p e r s o n f o a f : knows+ ? network .}
Gerd Groner, Matthias Thimm Semantic Web 39 / 47
Property path (3)
SELECT . . .WHERE {
? book dc : t i t l e | r d f s : l a b e l ? d i s p l a y S t r i n g .}
I “regular expression”
I e.g., alternative
Gerd Groner, Matthias Thimm Semantic Web 40 / 47
Property path (3)
SELECT . . .WHERE {
? book dc : t i t l e | r d f s : l a b e l ? d i s p l a y S t r i n g .}
I “regular expression”
I e.g., alternative
Gerd Groner, Matthias Thimm Semantic Web 40 / 47
Property path (3)
I “regular expression”I e.g., Sequence:
I “Find the names of people 2 foaf:knows links away.”
SELECT . . .WHERE {
? x f o a f : mbox <m a i l t o : a l i c e @ e x a m p l e> .? x f o a f : knows / f o a f : knows / f o a f : name ?name .
}
→ without property path?
Gerd Groner, Matthias Thimm Semantic Web 41 / 47
Property path (3)
I “regular expression”I e.g., Sequence:
I “Find the names of people 2 foaf:knows links away.”
SELECT . . .WHERE {
? x f o a f : mbox <m a i l t o : a l i c e @ e x a m p l e> .? x f o a f : knows / f o a f : knows / f o a f : name ?name .
}
→ without property path?
Gerd Groner, Matthias Thimm Semantic Web 41 / 47
Property path (4)
→ equivalent query (without property path):
SELECT . . .WHERE {
? x f o a f : mbox <m a i l t o : a l i c e @ e x a m p l e> .? x f o a f : knows ? a1 .? a1 f o a f : knows ? a2 .? a2 f o a f : name ?name .
}
Gerd Groner, Matthias Thimm Semantic Web 42 / 47
HAVING (over GROUPed sets)
HAVING operates over grouped solution sets, in the same way thatFILTER operates over un-grouped sets
PREFIX : <h t t p : / / data . example/>
SELECT (AVG(? s i z e ) AS ? a s i z e )WHERE {
? x : s i z e ? s i z e}GROUP BY ? xHAVING(AVG(? s i z e ) > 10)
→ This will return average sizes, grouped by the subject, but onlywhere the mean size is greater than 10.
Gerd Groner, Matthias Thimm Semantic Web 43 / 47
HAVING (over GROUPed sets)
HAVING operates over grouped solution sets, in the same way thatFILTER operates over un-grouped sets
PREFIX : <h t t p : / / data . example/>
SELECT (AVG(? s i z e ) AS ? a s i z e )WHERE {
? x : s i z e ? s i z e}GROUP BY ? xHAVING(AVG(? s i z e ) > 10)
→ This will return average sizes, grouped by the subject, but onlywhere the mean size is greater than 10.
Gerd Groner, Matthias Thimm Semantic Web 43 / 47
Summary
I SPARQLI Standard query language for RDF
I SELECT, CONSTRUCT, ASK, DESCRIBEI Extensive filters, optional and alternative patterns
I Protocol for queries and resultsI Based on triples model (subject-predicate-object)
I No logic inference in language, only in underlying knowledgebase
I Named graphsI Separate or specialized knowledge
Gerd Groner, Matthias Thimm Semantic Web 44 / 47
Learning goals
I Understanding basic and advanced queries using SPARQL.
I Gain knowledge about how to query reified data.
Gerd Groner, Matthias Thimm Semantic Web 45 / 47
SPARQL endpoints
Name URL What’s in there?DBPedia http://dbpedia.org/sparql extensive RDF data
from WikipediaSPARQLer tttp://sparql.org/sparql.html General-purpose
query endpoint forWeb-accessible data
DBLP http://www4.wiwiss.fu-berlin.de/dblp/snorql/ Bibliographic data from computerscience journals and conferences
LinkedMDB http://data.linkedmdb.org/sparql Films, actors, directors,writers, producers, etc.
World factbook http://www4.wiwiss.fu-berlin.de/factbook/snorql/ Country statistics fromthe CIA World Factbook
bio2rdf http://bio2rdf.org/sparql Bioinformatics data fromaround 40 public databases
Gerd Groner, Matthias Thimm Semantic Web 46 / 47
Exercise
I querying DBpedia endpointI Find 10 cities (http://dbpedia.org/ontology/City that
are located in (propertyhttp://dbpedia.org/ontology/country) Germany(http://dbpedia.org/resource/Germany)
I Find 10 “Populated Places” (URI:http://dbpedia.org/ontology/PopulatedPlace) withtheir names (rdfs:label) in which the names are described inGerman (i.e., language tag ”de”).
I Find 10 people (http://dbpedia.org/ontology/Person)with place of birth(http://dbpedia.org/ontology/birthPlace) Berlin(http://dbpedia.org/resource/Berlin)
Gerd Groner, Matthias Thimm Semantic Web 47 / 47