universidad de chilejperez/talks/rw2012.pdf · an example of an rdf graph to query: dblp...

338
Federation and Navigation in SPARQL 1.1 Jorge P´ erez Assistant Professor Department of Computer Science Universidad de Chile

Upload: others

Post on 18-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Federation and Navigation in SPARQL 1.1

Jorge Perez

Assistant ProfessorDepartment of Computer Science

Universidad de Chile

Page 2: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Outline

Basics of SPARQLSyntax and Semantics of SPARQL 1.0What is new in SPARQL 1.1

Federation: SERVICE operatorSyntax and SemanticsEvaluation of SERVICE queries

Navigation: Property PathsNavigating graphs with regular expressionsThe history of paths (in SPARQL 1.1 specification)Evaluation procedures and complexity

Page 3: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL query language for RDF

Page 4: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL query language for RDF

RDF Graph:

[email protected]:name

:email

:phone

:name

:friendOf [email protected]:email

Federico Meza35-446928

Ruth Garrido

URI 2URI 1

RDF-triples: (URI 2, :email, [email protected])

Page 5: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL query language for RDF

RDF Graph:

[email protected]:name

:email

:phone

:name

:friendOf [email protected]:email

Federico Meza35-446928

Ruth Garrido

URI 2URI 1

RDF-triples: (URI 2, :email, [email protected])

SPARQL Query:

SELECT ?N

WHERE

{

?X :name ?N .

}

Page 6: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL query language for RDF

RDF Graph:

[email protected]:name

:email

:phone

:name

:friendOf [email protected]:email

Federico Meza35-446928

Ruth Garrido

URI 2URI 1

RDF-triples: (URI 2, :email, [email protected])

SPARQL Query:

SELECT ?N ?E

WHERE

{

?X :name ?N .

?X :email ?E .

}

Page 7: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL query language for RDF

RDF Graph:

[email protected]:name

:email

:phone

:name

:friendOf [email protected]:email

Federico Meza35-446928

Ruth Garrido

URI 2URI 1

RDF-triples: (URI 2, :email, [email protected])

SPARQL Query:

SELECT ?N ?E

WHERE

{

?X :name ?N .

?X :email ?E .

?X :friendOf ?Y . ?Y :name "Ruth Garrido"

}

Page 8: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

An example of an RDF graph to query: DBLP

inAMW:SarmaUW09 :Jeffrey D. Ullman

:Anish Das Sarma

:Jennifer Widom

inAMW:2009

"Schema Design for ..."

dc:creatordc:creator

dc:cre

ator

dct:partOf

dc:titleswrc:series

conf:amw

<http://purl.org/dc/terms/>

: <http://dblp.l3s.de/d2r/resource/authors/>

conf: <http://dblp.l3s.de/d2r/resource/conferences/>

inAMW: <http://dblp.l3s.de/d2r/resource/publications/conf/amw/>

swrc: <http://swrc.ontoware.org/ontology#>

dc:

dct:

<http://purl.org/dc/elements/1.1/>

Page 9: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

Page 10: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

Page 11: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

}

Page 12: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

}

Page 13: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

}

Page 14: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

}

Page 15: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

}

A SPARQL query consists of a:

Page 16: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

}

A SPARQL query consists of a:

Head: Processing of the variables

Page 17: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC

SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

}

A SPARQL query consists of a:

Head: Processing of the variables

Body: Pattern matching expression

Page 18: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC, and their Webpages if this information is available:

SELECT ?Author ?WebPage

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

OPTIONAL {

?Author foaf:homePage ?WebPage . }

}

Page 19: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL: A Simple RDF Query Language

Example: Authors that have published in ISWC, and their Webpages if this information is available:

SELECT ?Author ?WebPage

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

OPTIONAL {

?Author foaf:homePage ?WebPage . }

}

Page 20: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs SELECT ?X1 ?X2 ...

{ P1 .

P2 }

Page 21: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs

◮ Grouping

SELECT ?X1 ?X2 ...

{{ P1 .

P2 }

{ P3 .

P4 }

}

Page 22: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs

◮ Grouping

◮ Optional parts

SELECT ?X1 ?X2 ...

{{ P1 .

P2

OPTIONAL { P5 } }

{ P3 .

P4

OPTIONAL { P7 } }

}

Page 23: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs

◮ Grouping

◮ Optional parts

◮ Nesting

SELECT ?X1 ?X2 ...

{{ P1 .

P2

OPTIONAL { P5 } }

{ P3 .

P4

OPTIONAL { P7

OPTIONAL { P8 } } }

}

Page 24: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs

◮ Grouping

◮ Optional parts

◮ Nesting

◮ Union of patterns

SELECT ?X1 ?X2 ...

{{{ P1 .

P2

OPTIONAL { P5 } }

{ P3 .

P4

OPTIONAL { P7

OPTIONAL { P8 } } }

}

UNION

{ P9 }}

Page 25: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs

◮ Grouping

◮ Optional parts

◮ Nesting

◮ Union of patterns

◮ Filtering

◮ ...

◮ + several new features inthe upcoming version:federation, navigation

SELECT ?X1 ?X2 ...

{{{ P1 .

P2

OPTIONAL { P5 } }

{ P3 .

P4

OPTIONAL { P7

OPTIONAL { P8 } } }

}

UNION

{ P9

FILTER ( R ) }}

Page 26: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

But things can become more complex...

Interesting features of patternmatching on graphs

◮ Grouping

◮ Optional parts

◮ Nesting

◮ Union of patterns

◮ Filtering

◮ ...

◮ + several new features inthe upcoming version:federation, navigation

SELECT ?X1 ?X2 ...

{{{ P1 .

P2

OPTIONAL { P5 } }

{ P3 .

P4

OPTIONAL { P7

OPTIONAL { P8 } } }

}

UNION

{ P9

FILTER ( R ) }}

What is the (formal) meaning of a general SPARQL query?

Page 27: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Outline

Basics of SPARQLSyntax and Semantics of SPARQL 1.0What is new in SPARQL 1.1

Federation: SERVICE operatorSyntax and SemanticsEvaluation of SERVICE queries

Navigation: Property PathsNavigating graphs with regular expressionsThe history of paths (in SPARQL 1.1 specification)Evaluation procedures and complexity

Page 28: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

RDF triples and graphs

Subject ObjectPredicate

LB

U

U UB

U : set of URIs

B : set of blank nodes

L : set of literals

Page 29: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

RDF triples and graphs

Subject ObjectPredicate

LB

U

U UB

U : set of URIs

B : set of blank nodes

L : set of literals

(s, p, o) ∈ (U ∪ B)× U × (U ∪ B ∪ L) is called an RDF triple

Page 30: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

RDF triples and graphs

Subject ObjectPredicate

LB

U

U UB

U : set of URIs

B : set of blank nodes

L : set of literals

(s, p, o) ∈ (U ∪ B)× U × (U ∪ B ∪ L) is called an RDF triple

A set of RDF triples is called an RDF graph

Page 31: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

RDF triples and graphs

Subject ObjectPredicate

LB

U

U UB

U : set of URIs

B : set of blank nodes

L : set of literals

(s, p, o) ∈ (U ∪ B)× U × (U ∪ B ∪ L) is called an RDF triple

A set of RDF triples is called an RDF graph

In this talk, we do not consider blank nodes

◮ (s, p, o) ∈ U × U × (U ∪ L) is called an RDF triple

Page 32: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A standard algebraic syntax

◮ Triple patterns: just RDF triples + variables (from a set V )

?X :name "john" (?X , name, john)

Page 33: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A standard algebraic syntax

◮ Triple patterns: just RDF triples + variables (from a set V )

?X :name "john" (?X , name, john)

◮ Graph patterns: full parenthesized algebra

original SPARQL syntax algebraic syntax

{ P1 . P2 } (P1 AND P2 )

Page 34: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A standard algebraic syntax

◮ Triple patterns: just RDF triples + variables (from a set V )

?X :name "john" (?X , name, john)

◮ Graph patterns: full parenthesized algebra

original SPARQL syntax algebraic syntax

{ P1 . P2 } (P1 AND P2 )

{ P1 OPTIONAL { P2 }} (P1 OPT P2 )

Page 35: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A standard algebraic syntax

◮ Triple patterns: just RDF triples + variables (from a set V )

?X :name "john" (?X , name, john)

◮ Graph patterns: full parenthesized algebra

original SPARQL syntax algebraic syntax

{ P1 . P2 } (P1 AND P2 )

{ P1 OPTIONAL { P2 }} (P1 OPT P2 )

{ P1 } UNION { P2 } (P1 UNION P2 )

Page 36: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A standard algebraic syntax

◮ Triple patterns: just RDF triples + variables (from a set V )

?X :name "john" (?X , name, john)

◮ Graph patterns: full parenthesized algebra

original SPARQL syntax algebraic syntax

{ P1 . P2 } (P1 AND P2 )

{ P1 OPTIONAL { P2 }} (P1 OPT P2 )

{ P1 } UNION { P2 } (P1 UNION P2 )

{ P1 FILTER ( R ) } (P1 FILTER R )

Page 37: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A standard algebraic syntax (cont.)

◮ Explicit precedence/association

Example

{ t1

t2

OPTIONAL { t3 }

OPTIONAL { t4 }

t5

}

( ( ( ( t1 AND t2 ) OPT t3 ) OPT t4 ) AND t5 )

Page 38: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Page 39: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Given a mapping µ and a triple pattern t:

Page 40: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Given a mapping µ and a triple pattern t:

◮ µ(t): triple obtained from t replacing variables according to µ

Page 41: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Given a mapping µ and a triple pattern t:

◮ µ(t): triple obtained from t replacing variables according to µ

Example

Page 42: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Given a mapping µ and a triple pattern t:

◮ µ(t): triple obtained from t replacing variables according to µ

Example

µ = {?X → R1, ?Y → R2, ?Name → john}

Page 43: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Given a mapping µ and a triple pattern t:

◮ µ(t): triple obtained from t replacing variables according to µ

Example

µ = {?X → R1, ?Y → R2, ?Name → john}

t = (?X , name, ?Name)

Page 44: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Mappings: building block for the semantics

Definition

A mapping is a partial function from variables to RDF terms

µ : V → U ∪ L

Given a mapping µ and a triple pattern t:

◮ µ(t): triple obtained from t replacing variables according to µ

Example

µ = {?X → R1, ?Y → R2, ?Name → john}

t = (?X , name, ?Name)

µ(t) = (R1, name, john)

Page 45: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

The semantics of triple patterns

Definition

The evaluation of triple patter t over a graph G , denoted by JtKG ,is the set of all mappings µ such that:

Page 46: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

The semantics of triple patterns

Definition

The evaluation of triple patter t over a graph G , denoted by JtKG ,is the set of all mappings µ such that:

◮ dom(µ) is exactly the set of variables occurring in t

Page 47: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

The semantics of triple patterns

Definition

The evaluation of triple patter t over a graph G , denoted by JtKG ,is the set of all mappings µ such that:

◮ dom(µ) is exactly the set of variables occurring in t

◮ µ(t) ∈ G

Page 48: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(?X , name, ?N)KG

Page 49: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(?X , name, ?N)KG{

µ1 = {?X → R1, ?N → john}µ2 = {?X → R2, ?N → paul}

}

Page 50: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(?X , name, ?N)KG{

µ1 = {?X → R1, ?N → john}µ2 = {?X → R2, ?N → paul}

}

J(?X , email, ?E )KG

Page 51: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(?X , name, ?N)KG{

µ1 = {?X → R1, ?N → john}µ2 = {?X → R2, ?N → paul}

}

J(?X , email, ?E )KG{

µ = {?X → R1, ?E → [email protected]}}

Page 52: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(?X , name, ?N)KG

?X ?Nµ1 R1 johnµ2 R2 paul

J(?X , email, ?E )KG

?X ?Eµ R1 [email protected]

Page 53: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(R1,webPage, ?W )KG

J(R2, name, paul)KG

J(R3, name, ringo)KG

Page 54: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(R1,webPage, ?W )KG{ }

J(R2, name, paul)KG

J(R3, name, ringo)KG

Page 55: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(R1,webPage, ?W )KG{ }

J(R2, name, paul)KG

J(R3, name, ringo)KG{ }

Page 56: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example

G(R1, name, john)

(R1, email, [email protected])(R2, name, paul)

J(R1,webPage, ?W )KG{ }

J(R2, name, paul)KG{

µ∅ = { }}

J(R3, name, ringo)KG{ }

Page 57: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

Page 58: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

Page 59: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Page 60: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Example

?X ?Y ?U ?Vµ1 R1 johnµ2 R1 [email protected]µ3 [email protected] R2

Page 61: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Example

?X ?Y ?U ?Vµ1 R1 johnµ2 R1 [email protected]µ3 [email protected] R2

Page 62: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Example

?X ?Y ?U ?Vµ1 R1 johnµ2 R1 [email protected]µ3 [email protected] R2

µ1 ∪ µ2 R1 john [email protected]

Page 63: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Example

?X ?Y ?U ?Vµ1 R1 johnµ2 R1 [email protected]µ3 [email protected] R2

µ1 ∪ µ2 R1 john [email protected]

Page 64: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Example

?X ?Y ?U ?Vµ1 R1 johnµ2 R1 [email protected]µ3 [email protected] R2

µ1 ∪ µ2 R1 john [email protected]

µ1 ∪ µ3 R1 john [email protected] R2

Page 65: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Compatible mappings: mappings that can be merged.

Definition

The mappings µ1, µ2 are compatibles iff they agreein their shared variables:

◮ µ1(?X ) = µ2(?X ) for every ?X ∈ dom(µ1) ∩ dom(µ2).

µ1 ∪ µ2 is also a mapping.

Example

?X ?Y ?U ?Vµ1 R1 johnµ2 R1 [email protected]µ3 [email protected] R2

µ1 ∪ µ2 R1 john [email protected]

µ1 ∪ µ3 R1 john [email protected] R2

µ∅ = { } is compatible with every mapping.

Page 66: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Sets of mappings and operations

Let M1 and M2 be sets of mappings:

Definition

Join: M1 ⋊⋉ M2

◮ {µ1 ∪ µ2 | µ1 ∈ M1, µ2 ∈ M2, and µ1, µ2 are compatibles}

◮ extending mappings in M1 with compatible mappings in M2

will be used to define AND

Page 67: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Sets of mappings and operations

Let M1 and M2 be sets of mappings:

Definition

Join: M1 ⋊⋉ M2

◮ {µ1 ∪ µ2 | µ1 ∈ M1, µ2 ∈ M2, and µ1, µ2 are compatibles}

◮ extending mappings in M1 with compatible mappings in M2

will be used to define AND

Definition

Union: M1 ∪M2

◮ {µ | µ ∈ M1 or µ ∈ M2}

◮ mappings in M1 plus mappings in M2 (the usual set union)

will be used to define UNION

Page 68: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Sets of mappings and operations

Definition

Difference: M1 rM2

◮ {µ ∈ M1 | for all µ′ ∈ M2, µ and µ′ are not compatibles}

◮ mappings in M1 that cannot be extended with mappings in M2

Page 69: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Sets of mappings and operations

Definition

Difference: M1 rM2

◮ {µ ∈ M1 | for all µ′ ∈ M2, µ and µ′ are not compatibles}

◮ mappings in M1 that cannot be extended with mappings in M2

Definition

Left outer join: M1 M2 = (M1 ⋊⋉ M2) ∪ (M1 rM2)

◮ extension of mappings in M1 with compatible mappings in M2

◮ plus the mappings in M1 that cannot be extended.

will be used to define OPT

Page 70: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of general graph patterns

Definition

Given a graph G the evaluation of a pattern is recursively defined

the base case is the evaluation of a triple pattern.

Page 71: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of general graph patterns

Definition

Given a graph G the evaluation of a pattern is recursively defined

◮ J(P1 AND P2)KG = JP1KG ⋊⋉ JP2KG

the base case is the evaluation of a triple pattern.

Page 72: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of general graph patterns

Definition

Given a graph G the evaluation of a pattern is recursively defined

◮ J(P1 AND P2)KG = JP1KG ⋊⋉ JP2KG

◮ J(P1 UNION P2)KG = JP1KG ∪ JP2KG

the base case is the evaluation of a triple pattern.

Page 73: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of general graph patterns

Definition

Given a graph G the evaluation of a pattern is recursively defined

◮ J(P1 AND P2)KG = JP1KG ⋊⋉ JP2KG

◮ J(P1 UNION P2)KG = JP1KG ∪ JP2KG

◮ J(P1 OPT P2)KG = JP1KG JP2KG

the base case is the evaluation of a triple pattern.

Page 74: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (AND)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) AND (?X , email, ?E ))KG

Page 75: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (AND)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) AND (?X , email, ?E ))KG

J(?X , name, ?N)KG ⋊⋉ J(?X , email, ?E )KG

Page 76: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (AND)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) AND (?X , email, ?E ))KG

J(?X , name, ?N)KG ⋊⋉ J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

Page 77: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (AND)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) AND (?X , email, ?E ))KG

J(?X , name, ?N)KG ⋊⋉ J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

Page 78: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (AND)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) AND (?X , email, ?E ))KG

J(?X , name, ?N)KG ⋊⋉ J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

⋊⋉

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

Page 79: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (AND)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) AND (?X , email, ?E ))KG

J(?X , name, ?N)KG ⋊⋉ J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

⋊⋉

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

Page 80: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

Page 81: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

J(?X , name, ?N)KG J(?X , email, ?E )KG

Page 82: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

J(?X , name, ?N)KG J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

Page 83: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

J(?X , name, ?N)KG J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

Page 84: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

J(?X , name, ?N)KG J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

Page 85: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

J(?X , name, ?N)KG J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

µ2 R2 paul

Page 86: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (OPT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) OPT (?X , email, ?E ))KG

J(?X , name, ?N)KG J(?X , email, ?E )KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?X ?Eµ4 R1 [email protected]µ5 R3 [email protected]

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

µ2 R2 paul

Page 87: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (UNION)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , email, ?Info) UNION (?X , webPage, ?Info))KG

Page 88: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (UNION)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , email, ?Info) UNION (?X , webPage, ?Info))KG

J(?X , email, ?Info)KG ∪ J(?X , webPage, ?Info)KG

Page 89: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (UNION)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , email, ?Info) UNION (?X , webPage, ?Info))KG

J(?X , email, ?Info)KG ∪ J(?X , webPage, ?Info)KG

?X ?Infoµ1 R1 [email protected]µ2 R3 [email protected]

Page 90: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (UNION)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , email, ?Info) UNION (?X , webPage, ?Info))KG

J(?X , email, ?Info)KG ∪ J(?X , webPage, ?Info)KG

?X ?Infoµ1 R1 [email protected]µ2 R3 [email protected]

?X ?Infoµ3 R3 www.ringo.com

Page 91: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (UNION)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , email, ?Info) UNION (?X , webPage, ?Info))KG

J(?X , email, ?Info)KG ∪ J(?X , webPage, ?Info)KG

?X ?Infoµ1 R1 [email protected]µ2 R3 [email protected]

∪?X ?Info

µ3 R3 www.ringo.com

Page 92: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (UNION)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , email, ?Info) UNION (?X , webPage, ?Info))KG

J(?X , email, ?Info)KG ∪ J(?X , webPage, ?Info)KG

?X ?Infoµ1 R1 [email protected]µ2 R3 [email protected]

∪?X ?Info

µ3 R3 www.ringo.com

?X ?Infoµ1 R1 [email protected]µ2 R3 [email protected]µ3 R3 www.ringo.com

Page 93: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Boolean filter expressions (value constraints)

In filter expressions we consider

◮ the equality = among variables and RDF terms

◮ a unary predicate bound

◮ boolean combinations (∧, ∨, ¬)

A mapping µ satisfies

◮ ?X = c if µ(?X ) = c

◮ ?X =?Y if µ(?X ) = µ(?Y )

◮ bound(?X ) if µ is defined in ?X , i.e. ?X ∈ dom(µ)

Page 94: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Satisfaction of value constraints

◮ If P is a graph pattern and R is a value constraint then(P FILTER R) is also a graph pattern.

Page 95: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Satisfaction of value constraints

◮ If P is a graph pattern and R is a value constraint then(P FILTER R) is also a graph pattern.

Definition

Given a graph G

◮ J(P FILTER R)KG = {µ ∈ JPKG | µ satisfies R}i.e. mappings in the evaluation of P that satisfy R .

Page 96: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) FILTER (?N = ringo ∨ ?N = paul))KG

Page 97: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) FILTER (?N = ringo ∨ ?N = paul))KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

Page 98: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) FILTER (?N = ringo ∨ ?N = paul))KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?N = ringo ∨ ?N = paul

Page 99: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J((?X , name, ?N) FILTER (?N = ringo ∨ ?N = paul))KG

?X ?Nµ1 R1 johnµ2 R2 paulµ3 R3 ringo

?N = ringo ∨ ?N = paul

?X ?Nµ2 R2 paulµ3 R3 ringo

Page 100: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(((?X , name, ?N) OPT (?X , email, ?E )) FILTER ¬ bound(?E ))KG

Page 101: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(((?X , name, ?N) OPT (?X , email, ?E )) FILTER ¬ bound(?E ))KG

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

µ2 R2 paul

Page 102: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(((?X , name, ?N) OPT (?X , email, ?E )) FILTER ¬ bound(?E ))KG

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

µ2 R2 paul

¬ bound(?E )

Page 103: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(((?X , name, ?N) OPT (?X , email, ?E )) FILTER ¬ bound(?E ))KG

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

µ2 R2 paul

¬ bound(?E )

?X ?Nµ2 R2 paul

Page 104: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (FILTER)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(((?X , name, ?N) OPT (?X , email, ?E )) FILTER ¬ bound(?E ))KG

?X ?N ?Eµ1 ∪ µ4 R1 john [email protected]µ3 ∪ µ5 R3 ringo [email protected]

µ2 R2 paul

¬ bound(?E )

?X ?Nµ2 R2 paul

(a non-monotonic query)

Page 105: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Why do we need/want to formalize SPARQL

A formalization is beneficial

◮ clarifying corner cases

◮ helping in the implementation process

◮ providing solid foundations (we can actually prove properties!)

Page 106: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

The evaluation decision problem

Evaluation problem for SPARQL patterns

Input: A mapping µ, an RDF graph G

a graph pattern P

Output: Is the mapping µ in the evaluation of pattern P

over the RDF graph G

Page 107: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Brief complexity-theory reminder

P

Page 108: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Brief complexity-theory reminder

NP

P

Page 109: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Brief complexity-theory reminder

PSPACE

NP

P

Page 110: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Complexity of the evaluation problem

Theorem (PAG09)

For patterns using only the AND operator,

the evaluation problem is in P

Page 111: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Complexity of the evaluation problem

Theorem (PAG09)

For patterns using only the AND operator,

the evaluation problem is in P

Theorem (SML10)

For patterns using AND and UNION operators,

the evaluation problem is NP-complete.

Page 112: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Complexity of the evaluation problem

Theorem (PAG09)

For patterns using only the AND operator,

the evaluation problem is in P

Theorem (SML10)

For patterns using AND and UNION operators,

the evaluation problem is NP-complete.

Theorem (PAG09,SML10)

For general patterns that include OPT operator,

the evaluation problem is PSPACE-complete.

Page 113: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Complexity of the evaluation problem

Theorem (PAG09)

For patterns using only the AND operator,

the evaluation problem is in P

Theorem (SML10)

For patterns using AND and UNION operators,

the evaluation problem is NP-complete.

Theorem (PAG09,SML10)

For general patterns that include OPT operator,

the evaluation problem is PSPACE-complete.

Good news: evaluation in P if the query is fixed (data complexity)

Page 114: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )

if a variable occurs

Page 115: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑

if a variable occurs inside B

Page 116: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑ ↑ ↑

if a variable occurs inside B and anywhere outside the OPT,

Page 117: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑ ⇑ ↑ ↑

if a variable occurs inside B and anywhere outside the OPT, thenthe variable must also occur inside A.

Page 118: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑ ⇑ ↑ ↑

if a variable occurs inside B and anywhere outside the OPT, thenthe variable must also occur inside A.

Example[ [

(?Y , name, paul) OPT (?X , email, ?Z )]

AND (?X , name, john)]

Page 119: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑ ⇑ ↑ ↑

if a variable occurs inside B and anywhere outside the OPT, thenthe variable must also occur inside A.

Example[ [

(?Y , name, paul) OPT (?X , email, ?Z )]

AND (?X , name, john)]

Page 120: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑ ⇑ ↑ ↑

if a variable occurs inside B and anywhere outside the OPT, thenthe variable must also occur inside A.

Example[ [

(?Y , name, paul) OPT (?X , email, ?Z )]

AND (?X , name, john)]

↑ ↑

Page 121: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Well–designed patterns

Can we find a natural fragment with better complexity?

Definition

An AND-OPT pattern is well–designed iff for every OPT in thepattern

( · · · · · · · · · · · · ( A OPT B ) · · · · · · · · · · · · )↑ ⇑ ↑ ↑

if a variable occurs inside B and anywhere outside the OPT, thenthe variable must also occur inside A.

Example[ [

(?Y , name, paul) OPT (?X , email, ?Z )]

AND (?X , name, john)]

�� ↑ ↑

Page 122: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A bit more on complexity...

PSPACE

NP

P

Page 123: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A bit more on complexity...

PSPACE

coNPNP

P

Page 124: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluation of well-designed patterns is in coNP-complete

Theorem (PAG09)

For AND-OPT well–designed graph patterns

the evaluation problem is coNP-complete

Page 125: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluation of well-designed patterns is in coNP-complete

Theorem (PAG09)

For AND-OPT well–designed graph patterns

the evaluation problem is coNP-complete

Well-designed patterns also allow to study static analysis:

Theorem (LPPS12)

Equivalence of well-designed SPARQL patterns is in NP

Page 126: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SELECT (a.k.a. projection)

Besides graph patterns, SPARQL 1.0 allow result forms

the most simple is SELECT

Definition

A SELECT query is an expression

(SELECT W P)

where P is a graph pattern and W is a set of variables, or *

Page 127: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SELECT (a.k.a. projection)

Besides graph patterns, SPARQL 1.0 allow result forms

the most simple is SELECT

Definition

A SELECT query is an expression

(SELECT W P)

where P is a graph pattern and W is a set of variables, or *

The evaluation of a SELECT query against G is

◮ J(SELECT W P)KG = {µ|W | µ ∈ JPKG}where µ|W is the restriction of µ to domain W .

◮ J(SELECT * P)KG = JPKG

Page 128: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (SELECT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(SELECT {?N , ?E} ((?X , name, ?N) AND (?X , email, ?E )))KG

Page 129: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (SELECT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(SELECT {?N , ?E} ((?X , name, ?N) AND (?X , email, ?E )))KG

SELECT{?N , ?E}?X ?N ?E

µ1 R1 john [email protected]µ2 R3 ringo [email protected]

Page 130: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Example (SELECT)

G :(R1, name, john) (R2, name, paul) (R3, name, ringo)(R1, email, [email protected]) (R3, email, [email protected])

(R3, webPage, www.ringo.com)

J(SELECT {?N , ?E} ((?X , name, ?N) AND (?X , email, ?E )))KG

SELECT{?N , ?E}?X ?N ?E

µ1 R1 john [email protected]µ2 R3 ringo [email protected]

?N ?Eµ1|{?N,?E}

john [email protected]

µ2|{?N,?E}ringo [email protected]

Page 131: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 introduces several new features

In SPARQL 1.1:

◮ (SELECT W P) can be used as any other graph pattern(ASK P) can be used as a constraint in FILTER⇒ sub-queries

◮ Aggregations via ORDER-BY plus COUNT, SUM, etc.

◮ More important for us: Federation and Navigation

Page 132: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Outline

Basics of SPARQLSyntax and Semantics of SPARQL 1.0What is new in SPARQL 1.1

Federation: SERVICE operatorSyntax and SemanticsEvaluation of SERVICE queries

Navigation: Property PathsNavigating graphs with regular expressionsThe history of paths (in SPARQL 1.1 specification)Evaluation procedures and complexity

Page 133: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL endpoints and the SERVICE operator

◮ SPARQL endpoints are services that accepts HTTP requestsasking for SPARQL queries

◮ http://www.w3.org/wiki/SparqlEndpoints lists some

◮ SPARQL 1.1 allows to mix local and remote queries toendpoints via the SERVICE operator

Page 134: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL endpoints and the SERVICE operator

◮ SPARQL endpoints are services that accepts HTTP requestsasking for SPARQL queries

◮ http://www.w3.org/wiki/SparqlEndpoints lists some

◮ SPARQL 1.1 allows to mix local and remote queries toendpoints via the SERVICE operator

?SELECT ?Author

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

}

Page 135: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL endpoints and the SERVICE operator

◮ SPARQL endpoints are services that accepts HTTP requestsasking for SPARQL queries

◮ http://www.w3.org/wiki/SparqlEndpoints lists some

◮ SPARQL 1.1 allows to mix local and remote queries toendpoints via the SERVICE operator

?SELECT ?Author ?Place

WHERE

{

?Paper dc:creator ?Author .

?Paper dct:partOf ?Conf .

?Conf swrc:series conf:iswc .

SERVICE dbpedia:service

{ ?Author dbpedia:birthPlace ?Place . }

}

Page 136: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Syntax of SERVICE

Syntax

If c ∈ U, ?X is a variable, and P is a graph pattern then

Page 137: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Syntax of SERVICE

Syntax

If c ∈ U, ?X is a variable, and P is a graph pattern then

◮ (SERVICE c P)

◮ (SERVICE ?X P)

are SERVICE graph patterns.

Page 138: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Syntax of SERVICE

Syntax

If c ∈ U, ?X is a variable, and P is a graph pattern then

◮ (SERVICE c P)

◮ (SERVICE ?X P)

are SERVICE graph patterns.

◮ SERVICE graph pattern are included recursively in the algebraof graph patterns

◮ Variables are included in the SERVICE syntax to allowdynamic choosing of endpoints

Page 139: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

We assume the existence of a partial function

ep : U → RDF graphs

intuitively, ep(c) is the (default) graph associatedto the SPARQL endpoint defined by c

Page 140: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

We assume the existence of a partial function

ep : U → RDF graphs

intuitively, ep(c) is the (default) graph associatedto the SPARQL endpoint defined by c

Definition

Given an RDF graph G , an element c ∈ U, and a graph pattern P

J(SERVICE c P)KG =

{

Page 141: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

We assume the existence of a partial function

ep : U → RDF graphs

intuitively, ep(c) is the (default) graph associatedto the SPARQL endpoint defined by c

Definition

Given an RDF graph G , an element c ∈ U, and a graph pattern P

J(SERVICE c P)KG =

{

JPKep(c) if c ∈ dom(ep)

Page 142: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

We assume the existence of a partial function

ep : U → RDF graphs

intuitively, ep(c) is the (default) graph associatedto the SPARQL endpoint defined by c

Definition

Given an RDF graph G , an element c ∈ U, and a graph pattern P

J(SERVICE c P)KG =

{

JPKep(c) if c ∈ dom(ep)

{µ∅} otherwise

Page 143: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

We assume the existence of a partial function

ep : U → RDF graphs

intuitively, ep(c) is the (default) graph associatedto the SPARQL endpoint defined by c

Definition

Given an RDF graph G , an element c ∈ U, and a graph pattern P

J(SERVICE c P)KG =

{

JPKep(c) if c ∈ dom(ep)

{µ∅} otherwise

but the interesting case is when the endpoint is a variable...

Page 144: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

Definition

Given an RDF graph G , a variable ?X , and a graph pattern P

J(SERVICE ?X P)KG =

Page 145: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

Definition

Given an RDF graph G , a variable ?X , and a graph pattern P

J(SERVICE ?X P)KG = JPKep(c)

Page 146: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

Definition

Given an RDF graph G , a variable ?X , and a graph pattern P

J(SERVICE ?X P)KG = JPKep(c) ⋊⋉ {{?X → c}}

Page 147: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

Definition

Given an RDF graph G , a variable ?X , and a graph pattern P

J(SERVICE ?X P)KG =⋃

c∈dom(ep)

(

JPKep(c) ⋊⋉ {{?X → c}}

)

Page 148: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

Definition

Given an RDF graph G , a variable ?X , and a graph pattern P

J(SERVICE ?X P)KG =⋃

c∈dom(ep)

(

JPKep(c) ⋊⋉ {{?X → c}}

)

Page 149: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Semantics of SERVICE

Definition

Given an RDF graph G , a variable ?X , and a graph pattern P

J(SERVICE ?X P)KG =⋃

c∈dom(ep)

(

JPKep(c) ⋊⋉ {{?X → c}}

)

can we effectively evaluate a SERVICE query?

Page 150: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

Page 151: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

◮ ((?X , service address, ?Y ) AND (SERVICE ?Y (?N , email, ?E )))

Page 152: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

◮ ((?X , service address, ?Y ) AND (SERVICE ?Y (?N , email, ?E )))

◮ ((SERVICE ?Y (?N , email, ?E )) AND (?X , service address, ?Y ))

Page 153: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

◮ ((?X , service address, ?Y ) AND (SERVICE ?Y (?N , email, ?E )))

◮ ((SERVICE ?Y (?N , email, ?E )) AND (?X , service address, ?Y ))

In both cases, the SERVICE variable is controlled by the data inthe initial graph

Page 154: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

◮ ((?X , service address, ?Y ) AND (SERVICE ?Y (?N , email, ?E )))

◮ ((SERVICE ?Y (?N , email, ?E )) AND (?X , service address, ?Y ))

In both cases, the SERVICE variable is controlled by the data inthe initial graph

There is natural way of evaluating the query:

Page 155: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

◮ ((?X , service address, ?Y ) AND (SERVICE ?Y (?N , email, ?E )))

◮ ((SERVICE ?Y (?N , email, ?E )) AND (?X , service address, ?Y ))

In both cases, the SERVICE variable is controlled by the data inthe initial graph

There is natural way of evaluating the query:

◮ Evaluate first (?X , service address, ?Y )

Page 156: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SERVICE example

Some queries/patterns can be safely evaluated:

◮ ((?X , service address, ?Y ) AND (SERVICE ?Y (?N , email, ?E )))

◮ ((SERVICE ?Y (?N , email, ?E )) AND (?X , service address, ?Y ))

In both cases, the SERVICE variable is controlled by the data inthe initial graph

There is natural way of evaluating the query:

◮ Evaluate first (?X , service address, ?Y )

◮ only for the obtained mappings evaluate the SERVICE onendpoint µ(?Y )

Page 157: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Unbounded SERVICE queries

What about this pattern?

[

(

(?X , service description, ?Z ) UNION (?X , service address, ?Y ))

AND (SERVICE ?Y (?N , email, ?E ))

]

Page 158: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Unbounded SERVICE queries

What about this pattern?

[

(

(?X , service description, ?Z ) UNION (?X , service address, ?Y ))

AND (SERVICE ?Y (?N , email, ?E ))

]

:-(

Page 159: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Unbounded SERVICE queries

What about this pattern?

[

(

(?X , service description, ?Z ) UNION (?X , service address, ?Y ))

AND (SERVICE ?Y (?N , email, ?E ))

]

:-(

Idea: force the SERVICE variable to be bound in every solution.

Page 160: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Boundedness

Definition

A variable ?X is bound in graph pattern P iffor every graph G and every µ ∈ JPKG it holds that:

◮ ?X ∈ dom(µ), and

◮ µ(?X ) is a value in G .

Page 161: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Boundedness

Definition

A variable ?X is bound in graph pattern P iffor every graph G and every µ ∈ JPKG it holds that:

◮ ?X ∈ dom(µ), and

◮ µ(?X ) is a value in G .

We only need a procedure to ensure that every variable mentionedin SERVICE is bounded!

Page 162: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Boundedness

Definition

A variable ?X is bound in graph pattern P iffor every graph G and every µ ∈ JPKG it holds that:

◮ ?X ∈ dom(µ), and

◮ µ(?X ) is a value in G .

We only need a procedure to ensure that every variable mentionedin SERVICE is bounded! Oh wait...

Page 163: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Boundedness

Definition

A variable ?X is bound in graph pattern P iffor every graph G and every µ ∈ JPKG it holds that:

◮ ?X ∈ dom(µ), and

◮ µ(?X ) is a value in G .

We only need a procedure to ensure that every variable mentionedin SERVICE is bounded! Oh wait...

Theorem (BAC11)

The problem of checking if a variable is bound in a graph pattern

is undecidable.

Page 164: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Boundedness

Definition

A variable ?X is bound in graph pattern P iffor every graph G and every µ ∈ JPKG it holds that:

◮ ?X ∈ dom(µ), and

◮ µ(?X ) is a value in G .

We only need a procedure to ensure that every variable mentionedin SERVICE is bounded! Oh wait...

Theorem (BAC11)

The problem of checking if a variable is bound in a graph pattern

is undecidable.

:-(

Page 165: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Undecidability of boundedness

Proof idea

◮ From [AG08]: it is undecidable to check if a SPARQL patternP is satisfiable (if JPKG 6= ∅ for some G ).

Page 166: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Undecidability of boundedness

Proof idea

◮ From [AG08]: it is undecidable to check if a SPARQL patternP is satisfiable (if JPKG 6= ∅ for some G ).

◮ Assume P does not mention ?X , and letQ = ((?X , ?Y , ?Z ) UNION P):

Page 167: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Undecidability of boundedness

Proof idea

◮ From [AG08]: it is undecidable to check if a SPARQL patternP is satisfiable (if JPKG 6= ∅ for some G ).

◮ Assume P does not mention ?X , and letQ = ((?X , ?Y , ?Z ) UNION P):

?X is bound in Q iff P is not satisfiable.

Page 168: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Undecidability of boundedness

Proof idea

◮ From [AG08]: it is undecidable to check if a SPARQL patternP is satisfiable (if JPKG 6= ∅ for some G ).

◮ Assume P does not mention ?X , and letQ = ((?X , ?Y , ?Z ) UNION P):

?X is bound in Q iff P is not satisfiable.

Undecidable: the SPARQL engine cannot check for boundedness...

Page 169: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then

Page 170: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

Page 171: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then

Page 172: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

Page 173: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then

Page 174: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

Page 175: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

◮ if P = (SELECT W P1), then

Page 176: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)

Page 177: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)

◮ if P is a SERVICE pattern, then

Page 178: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)

◮ if P is a SERVICE pattern, then SB(P) = ∅

Page 179: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)

◮ if P is a SERVICE pattern, then SB(P) = ∅

Proposition (BAC11)

If ?X ∈ SB(P) then ?X is bound in P.

Page 180: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A sufficient condition for boundedness

Definition [BAC11]

The set of strongly bounded variables in a pattern P , denoted bySB(P) is defined recursively as follows.

◮ if P is a triple pattern t, then SB(P) = var(t)

◮ if P = (P1 AND P2), then SB(P) = SB(P1) ∪ SB(P2)

◮ if P = (P1 OPT P2), then SB(P) = SB(P1)

◮ if P = (P1 UNION P2), then SB(P) = SB(P1) ∩ SB(P2)

◮ if P = (SELECT W P1), then SB(P) = W ∩ SB(P1)

◮ if P is a SERVICE pattern, then SB(P) = ∅

Proposition (BAC11)

If ?X ∈ SB(P) then ?X is bound in P.

(SB(P) can be efficiently computed)

Page 181: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E ))

)]

.

Page 182: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E ))

)]

.

◮ ?Y is not bound in P1 (nor strongly bound)

Page 183: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E ))

)]

.

◮ ?Y is not bound in P1 (nor strongly bound)

◮ nevertheless there is a natural evaluation of this pattern

Page 184: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E ))

)]

.

◮ ?Y is not bound in P1 (nor strongly bound)

◮ nevertheless there is a natural evaluation of this pattern

(?Y is bounded in the important part!)

Page 185: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

Page 186: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P2 =

[

(?U1, related with, ?U2) AND

(

SERVICE ?U1

(

(?N, email, ?E ) OPT

(SERVICE ?U2 (?N, phone, ?F ))

))]

.

Page 187: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P2 =

[

(?U1, related with, ?U2) AND

(

SERVICE ?U1

(

(?N, email, ?E ) OPT

(SERVICE ?U2 (?N, phone, ?F ))

))]

.

◮ ?U1 and ?U2 are strongly bounded, but

Page 188: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P2 =

[

(?U1, related with, ?U2) AND

(

SERVICE ?U1

(

(?N, email, ?E ) OPT

(SERVICE ?U2 (?N, phone, ?F ))

))]

.

◮ ?U1 and ?U2 are strongly bounded, but

◮ can we effectively evaluate this query?

(?U2 is unbounded in the important part!)

Page 189: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

(Strongly) boundedness is not enough

Are we happy now?

P2 =

[

(?U1, related with, ?U2) AND

(

SERVICE ?U1

(

(?N, email, ?E ) OPT

(SERVICE ?U2 (?N, phone, ?F ))

))]

.

◮ ?U1 and ?U2 are strongly bounded, but

◮ can we effectively evaluate this query?

(?U2 is unbounded in the important part!)

we need to define what the important part is...

Page 190: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Parse tree of a pattern

We need first to formalize the tree of subexpressions of a pattern

(

(?Y , a, ?Z ) UNION(

(?X , b, c) AND (SERVICE ?X (?Y , a, ?Z )))

)

Page 191: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Parse tree of a pattern

We need first to formalize the tree of subexpressions of a pattern

(

(?Y , a, ?Z ) UNION(

(?X , b, c) AND (SERVICE ?X (?Y , a, ?Z )))

)

u6 : (?Y , a, ?Z )

u1 : ((?Y , a, ?Z ) UNION ((?X , b, c) AND (SERVICE ?X (?Y , a, ?Z ))))

u2 : (?Y , a, ?Z ) u3 : ((?X , b, c) AND (SERVICE ?X (?Y , a, ?Z )))

u4 : (?X , b, c) u5 : (SERVICE ?X (?Y , a, ?Z ))

Page 192: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Parse tree of a pattern

We need first to formalize the tree of subexpressions of a pattern

(

(?Y , a, ?Z ) UNION(

(?X , b, c) AND (SERVICE ?X (?Y , a, ?Z )))

)

u6 : (?Y , a, ?Z )

u1 : ((?Y , a, ?Z ) UNION ((?X , b, c) AND (SERVICE ?X (?Y , a, ?Z ))))

u2 : (?Y , a, ?Z ) u3 : ((?X , b, c) AND (SERVICE ?X (?Y , a, ?Z )))

u4 : (?X , b, c) u5 : (SERVICE ?X (?Y , a, ?Z ))

We denote by T (P) the tree of subexpressions of P .

Page 193: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-boundedness

Notion of boundedness considering the important part

Definition

A pattern P is service-bound if for every node u in T (P) with label(SERVICE ?X P1) it holds that:

Page 194: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-boundedness

Notion of boundedness considering the important part

Definition

A pattern P is service-bound if for every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t. ?X isbound in P2, and

Page 195: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-boundedness

Notion of boundedness considering the important part

Definition

A pattern P is service-bound if for every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t. ?X isbound in P2, and

2. P1 is service-bound.

Page 196: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-boundedness

Notion of boundedness considering the important part

Definition

A pattern P is service-bound if for every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t. ?X isbound in P2, and

2. P1 is service-bound.

Unfortunately... (and not so surprisingly anymore)

Page 197: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-boundedness

Notion of boundedness considering the important part

Definition

A pattern P is service-bound if for every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t. ?X isbound in P2, and

2. P1 is service-bound.

Unfortunately... (and not so surprisingly anymore)

Theorem (BAC11)

Checking if a pattern is service-bound is undecidable.

Page 198: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-boundedness

Notion of boundedness considering the important part

Definition

A pattern P is service-bound if for every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t. ?X isbound in P2, and

2. P1 is service-bound.

Unfortunately... (and not so surprisingly anymore)

Theorem (BAC11)

Checking if a pattern is service-bound is undecidable.

Exercise: prove the theorem.

Page 199: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-safeness

We need a decidable sufficient condition.

Idea: replace bound by SB(·) in the previous notion.

Page 200: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-safeness

We need a decidable sufficient condition.

Idea: replace bound by SB(·) in the previous notion.

Definition

A pattern P is service-safe if of every node u in T (P) with label(SERVICE ?X P1) it holds that:

Page 201: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-safeness

We need a decidable sufficient condition.

Idea: replace bound by SB(·) in the previous notion.

Definition

A pattern P is service-safe if of every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t.?X ∈ SB(P2), and

Page 202: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-safeness

We need a decidable sufficient condition.

Idea: replace bound by SB(·) in the previous notion.

Definition

A pattern P is service-safe if of every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t.?X ∈ SB(P2), and

2. P1 is service-safe.

Page 203: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-safeness

We need a decidable sufficient condition.

Idea: replace bound by SB(·) in the previous notion.

Definition

A pattern P is service-safe if of every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t.?X ∈ SB(P2), and

2. P1 is service-safe.

Proposition (BAC11)

If P is service-safe the it is service-bound.

Page 204: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Service-safeness

We need a decidable sufficient condition.

Idea: replace bound by SB(·) in the previous notion.

Definition

A pattern P is service-safe if of every node u in T (P) with label(SERVICE ?X P1) it holds that:

1. There exists an ancestor of u in T (P) with label P2 s.t.?X ∈ SB(P2), and

2. P1 is service-safe.

Proposition (BAC11)

If P is service-safe the it is service-bound.

We finally have our desired decidable condition.

Page 205: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Applying service-safeness

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E))

)]

Page 206: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Applying service-safeness

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E))

)]

is service-safe.

Page 207: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Applying service-safeness

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E))

)]

is service-safe.

P2 =

[

(?U1, related with, ?U2) AND

(

SERVICE ?U1

(

(?N, email, ?E) OPT

(SERVICE ?U2 (?N, phone, ?F ))

))]

Page 208: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Applying service-safeness

P1 =

[

(?X , service description, ?Z ) UNION

(

(?X , service address, ?Y ) AND

(SERVICE ?Y (?N, email, ?E))

)]

is service-safe.

P2 =

[

(?U1, related with, ?U2) AND

(

SERVICE ?U1

(

(?N, email, ?E) OPT

(SERVICE ?U2 (?N, phone, ?F ))

))]

is not service-safe.

Page 209: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Closing words on SERVICE

SERVICE: several interesting research question

◮ Optimization (rewriting, reordering, containment?)

◮ Cost analysis: in practice we cannot query all that we maywant

◮ Different endpoints provide different completeness certificates(regarding the data they return)

◮ Several implementations challenges

Page 210: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Outline

Basics of SPARQLSyntax and Semantics of SPARQL 1.0What is new in SPARQL 1.1

Federation: SERVICE operatorSyntax and SemanticsEvaluation of SERVICE queries

Navigation: Property PathsNavigating graphs with regular expressionsThe history of paths (in SPARQL 1.1 specification)Evaluation procedures and complexity

Page 211: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

Page 212: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

Page 213: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

◮ Known fact: SPARQL 1.0 cannot express this query!

◮ Follows easily from locality of FO-logic

Page 214: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

◮ Known fact: SPARQL 1.0 cannot express this query!

◮ Follows easily from locality of FO-logic

You (should) already know that Datalog can express this query.

Page 215: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

◮ Known fact: SPARQL 1.0 cannot express this query!

◮ Follows easily from locality of FO-logic

You (should) already know that Datalog can express this query.

We can consider a new predicate reach and the program

Page 216: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

◮ Known fact: SPARQL 1.0 cannot express this query!

◮ Follows easily from locality of FO-logic

You (should) already know that Datalog can express this query.

We can consider a new predicate reach and the program

(?X , reach, ?Y ) ← (?X , connected, ?Y )

Page 217: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

◮ Known fact: SPARQL 1.0 cannot express this query!

◮ Follows easily from locality of FO-logic

You (should) already know that Datalog can express this query.

We can consider a new predicate reach and the program

(?X , reach, ?Y ) ← (?X , connected, ?Y )(?X , reach, ?Y ) ← (?X , reach, ?Z ), (?Z , connected, ?Y )

Page 218: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.0 has very limited navigational capabilities

Assume a graph with cities and connections with RDF triples like:

(C1, connected,C2)

query: is city B reachable from A by a sequence of connections?

◮ Known fact: SPARQL 1.0 cannot express this query!

◮ Follows easily from locality of FO-logic

You (should) already know that Datalog can express this query.

We can consider a new predicate reach and the program

(?X , reach, ?Y ) ← (?X , connected, ?Y )(?X , reach, ?Y ) ← (?X , reach, ?Z ), (?Z , connected, ?Y )

but do we want to integrate Datalog and SPARQL?

Page 219: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

Page 220: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

SELECT ?X

WHERE

{

?X :friendOf ?Y .

?Y :name "Maria" .

}

Page 221: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

SELECT ?X

WHERE

{

?X :friendOf ?Y .

?Y :name "Maria" .

}

Page 222: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

SELECT ?X

WHERE

{

?X (:friendOf)* ?Y .

?Y :name "Maria" .

}

Page 223: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

SELECT ?X

WHERE

{

?X (:friendOf)* ?Y .

?Y :name "Maria" .

}

Page 224: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

SELECT ?X

WHERE

{

?X (:friendOf)* ?Y . ← SPARQL 1.1 property path

?Y :name "Maria" .

}

Page 225: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 provides an alternative way for navigating

URI 2:email

[email protected]:name

:email

:phone

:name

:friendOf [email protected]

URI 1

Seba446928888

Juan

Claudio:name

:nameMaria

URI 0

:friendOf

URI 3

:friendOf

SELECT ?X

WHERE

{

?X (:friendOf)* ?Y . ← SPARQL 1.1 property path

?Y :name "Maria" .

}

Idea: navigate RDF graphs using regular expressions

Page 226: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

General navigation using regular expressions

Regular expressions define sets of strings using

◮ concatenation: /

◮ disjunction: |

◮ Kleene star: *

Example

Consider strings composed of symbols a and b

a/(b)*/a

defines strings of the form abbb · · · bbba.

Page 227: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

General navigation using regular expressions

Regular expressions define sets of strings using

◮ concatenation: /

◮ disjunction: |

◮ Kleene star: *

Example

Consider strings composed of symbols a and b

a/(b)*/a

defines strings of the form abbb · · · bbba.

Idea: use regular expressions to define paths

◮ a path p satisfies a regular expression r if the string composedof the sequence of edges of p satisfies expression r

Page 228: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John

Page 229: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

Page 230: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection

Page 231: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X Paris }

Page 232: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X (:train|:plane)* Paris }

Page 233: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X (:train|:plane)*/:bus Paris }

Page 234: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }

Page 235: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }

Exercise: cities that reach Paris with an even number ofconnections

Page 236: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }

Exercise: cities that reach Paris with an even number ofconnections

Mixing regular expressions and SPARQL operatorsgives interesting expressive power:

Persons in my professional network that attended the same school

Page 237: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Interesting navigational queries

◮ RDF graph with :father, :mother edges:

ancestors of John{ John (:father|:mother)* ?X }

◮ Connections between cities via :train, :bus, :plane

Cities that reach Paris with exactly one :bus connection{ ?X (:train|:plane)*/:bus/(:train|:plane)* Paris }

Exercise: cities that reach Paris with an even number ofconnections

Mixing regular expressions and SPARQL operatorsgives interesting expressive power:

Persons in my professional network that attended the same school

{ ?X (:conn)* ?Y .

?X (:conn)* ?Z .

?Y :sameSchool ?Z }

Page 238: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

Page 239: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

A bit of history on W3C standardization of property paths:

◮ Mid 2010: W3C defines an informal semantics for paths

Page 240: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

A bit of history on W3C standardization of property paths:

◮ Mid 2010: W3C defines an informal semantics for paths

◮ Late 2010: several discussions on possible drawbacks on thenon-standard definition by the W3C

Page 241: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

A bit of history on W3C standardization of property paths:

◮ Mid 2010: W3C defines an informal semantics for paths

◮ Late 2010: several discussions on possible drawbacks on thenon-standard definition by the W3C

◮ Early 2011: first formal semantics by the W3C

Page 242: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

A bit of history on W3C standardization of property paths:

◮ Mid 2010: W3C defines an informal semantics for paths

◮ Late 2010: several discussions on possible drawbacks on thenon-standard definition by the W3C

◮ Early 2011: first formal semantics by the W3C

◮ Late 2011: empirical and theoretical study on SPARQL 1.1property paths showing unfeasibility of evaluation

Page 243: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

A bit of history on W3C standardization of property paths:

◮ Mid 2010: W3C defines an informal semantics for paths

◮ Late 2010: several discussions on possible drawbacks on thenon-standard definition by the W3C

◮ Early 2011: first formal semantics by the W3C

◮ Late 2011: empirical and theoretical study on SPARQL 1.1property paths showing unfeasibility of evaluation

◮ Mid 2012: semantics change to overcome the raised issues

Page 244: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

As always, we need a (formal) semantics

◮ Regular expressions in SPARQL queries seem reasonable

◮ We need to agree in the meaning of these new queries

A bit of history on W3C standardization of property paths:

◮ Mid 2010: W3C defines an informal semantics for paths

◮ Late 2010: several discussions on possible drawbacks on thenon-standard definition by the W3C

◮ Early 2011: first formal semantics by the W3C

◮ Late 2011: empirical and theoretical study on SPARQL 1.1property paths showing unfeasibility of evaluation

◮ Mid 2012: semantics change to overcome the raised issues

The following experimental study is based on [ACP12].

Page 245: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 implementations

had a poor performance

Data:◮ cliques (complete graphs) of different size◮ from 2 nodes (87 bytes) to 13 nodes (970 bytes)

:p:a0

:a1

:a3

:p

:p

:p

:p

:p

:a2

RDF clique with 4 nodes (127 bytes)

Page 246: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 implementations

had a poor performance

1

10

100

1000

2 4 6 8 10 12 14 16

ARQ

+ + + + + + ++

+

+

++

RDFQ

× × × ××

×

×

×

×

×KGram

∗ ∗ ∗ ∗ ∗∗∗

∗Sesame

� � � �

[ACP12]

SELECT * WHERE { :a0 (:p)* :a1 }

Page 247: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Poor performance with real Web data of small size

Data:

◮ Social Network data given by foaf:knows links

◮ Crawled from Axel Polleres’ foaf document (3 steps)

◮ Different documents, deleting some nodes

foaf:knows

axel:me

ivan:me

bizer:chris

richard:cygri

...

· · ·

andreas:ah

· · ·

· · ·

Page 248: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Poor performance with real Web data of small size

SELECT * WHERE { axel:me (foaf:knows)* ?x }

Page 249: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Poor performance with real Web data of small size

SELECT * WHERE { axel:me (foaf:knows)* ?x }

Input ARQ RDFQ Kgram Sesame9.2KB 5.13 75.70 313.37 –

10.9KB 8.20 325.83 – –11.4KB 65.87 – – –13.2KB 292.43 – – –14.8KB – – – –17.2KB – – – –20.5KB – – – –25.8KB – – – – [ACP12]

(time in seconds, timeout = 1hr)

Page 250: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Poor performance with real Web data of small size

SELECT * WHERE { axel:me (foaf:knows)* ?x }

Input ARQ RDFQ Kgram Sesame9.2KB 5.13 75.70 313.37 –

10.9KB 8.20 325.83 – –11.4KB 65.87 – – –13.2KB 292.43 – – –14.8KB – – – –17.2KB – – – –20.5KB – – – –25.8KB – – – – [ACP12]

(time in seconds, timeout = 1hr)

Is this a problem of these particular implementations?

Page 251: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

This is a problem of the specification

[ACP12]

Any implementation that follows SPARQL 1.1 standard(as of January 2012) is doomed to show the same behavior

Page 252: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

This is a problem of the specification

[ACP12]

Any implementation that follows SPARQL 1.1 standard(as of January 2012) is doomed to show the same behavior

The main sources of complexity is counting

Page 253: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

This is a problem of the specification

[ACP12]

Any implementation that follows SPARQL 1.1 standard(as of January 2012) is doomed to show the same behavior

The main sources of complexity is counting

Impact on W3C standard:

◮ Standard semantics of SPARQL 1.1 property pathswas changed in July 2012 to overcome these issues

Page 254: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 property paths match regular expressions

but also count

:a

:b

:c

:p

:p

:p

:p

:d

SELECT ?X

WHERE { :a (:p)* ?X }

Page 255: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 property paths match regular expressions

but also count

:a

:b

:c

:p

:p

:p

:p

:d

SELECT ?X

WHERE { :a (:p)* ?X }

?X:a

:b

:c

:d

:d

Page 256: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 property paths match regular expressions

but also count

:a

:b

:c

:p

:p

:p

:p

:d

SELECT ?X

WHERE { :a (:p)* ?X }

?X:a

:b

:c

:d

:d

Page 257: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 property paths match regular expressions

but also count

:p:a

:b

:c

:p

:p

:p

:p

:d

SELECT ?X

WHERE { :a (:p)* ?X }

?X:a

:b

:c

:d

:d

Page 258: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 property paths match regular expressions

but also count

:p:a

:b

:c

:p

:p

:p

:p

:d

SELECT ?X

WHERE { :a (:p)* ?X }

?X:a

:b

:c

:d

:d

:c

:d

Page 259: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 property paths match regular expressions

but also count

:p:a

:b

:c

:p

:p

:p

:p

:d

SELECT ?X

WHERE { :a (:p)* ?X }

?X:a

:b

:c

:d

:d

:c

:d

But what if we have cycles?

Page 260: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 document provides a special procedure

to handle cycles (and make the count)

Evaluation of path*

“the algorithm extends the multiset of results by one application of path.

If a node has been visited for path, it is not a candidate for another step.

A node can be visited multiple times if different paths visit it.”

SPARQL 1.1 Last Call (Jan 2012)

Page 261: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 document provides a special procedure

to handle cycles (and make the count)

Evaluation of path*

“the algorithm extends the multiset of results by one application of path.

If a node has been visited for path, it is not a candidate for another step.

A node can be visited multiple times if different paths visit it.”

SPARQL 1.1 Last Call (Jan 2012)

◮ W3C document provides a procedure (ArbitraryLengthPath)

◮ This procedure was formalized in [ACP12]

Page 262: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting the number of solutions...

Data: Clique of size n

{ :a0 (:p)* :a1 }

every solution is a copy of the empty mapping µ∅ (| | in ARQ)

Page 263: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting the number of solutions...

Data: Clique of size n

{ :a0 (:p)* :a1 }

n # Sol.9 13,70010 109,60111 986,41012 9,864,10113 –

every solution is a copy of the empty mapping µ∅ (| | in ARQ)

Page 264: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting the number of solutions...

Data: Clique of size n

{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 }

n # Sol.9 13,70010 109,60111 986,41012 9,864,10113 –

every solution is a copy of the empty mapping µ∅ (| | in ARQ)

Page 265: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting the number of solutions...

Data: Clique of size n

{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 }

n # Sol.9 13,70010 109,60111 986,41012 9,864,10113 –

n # Sol2 13 64 3055 418,5766 –

every solution is a copy of the empty mapping µ∅ (| | in ARQ)

Page 266: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting the number of solutions...

Data: Clique of size n

{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } { :a0 (((:p)*)*)* :a1 }

n # Sol.9 13,70010 109,60111 986,41012 9,864,10113 –

n # Sol2 13 64 3055 418,5766 –

every solution is a copy of the empty mapping µ∅ (| | in ARQ)

Page 267: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting the number of solutions...

Data: Clique of size n

{ :a0 (:p)* :a1 } { :a0 ((:p)*)* :a1 } { :a0 (((:p)*)*)* :a1 }

n # Sol.9 13,70010 109,60111 986,41012 9,864,10113 –

n # Sol2 13 64 3055 418,5766 –

n # Sol.2 13 424 –

every solution is a copy of the empty mapping µ∅ (| | in ARQ)

Page 268: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

More on counting the number of solutions...

Data: foaf links crawled from the Web

{ axel:me (foaf:knows)* ?x }

Page 269: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

More on counting the number of solutions...

Data: foaf links crawled from the Web

{ axel:me (foaf:knows)* ?x }

File # URIs # Sol. Output Size9.2KB 38 29,817 2MB10.9KB 43 122,631 8.4MB11.4KB 47 1,739,331 120MB13.2KB 52 8,511,943 587MB14.8KB 54 – –

Page 270: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

More on counting the number of solutions...

Data: foaf links crawled from the Web

{ axel:me (foaf:knows)* ?x }

File # URIs # Sol. Output Size9.2KB 38 29,817 2MB10.9KB 43 122,631 8.4MB11.4KB 47 1,739,331 120MB13.2KB 52 8,511,943 587MB14.8KB 54 – –

What is really happening here?

Page 271: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

More on counting the number of solutions...

Data: foaf links crawled from the Web

{ axel:me (foaf:knows)* ?x }

File # URIs # Sol. Output Size9.2KB 38 29,817 2MB10.9KB 43 122,631 8.4MB11.4KB 47 1,739,331 120MB13.2KB 52 8,511,943 587MB14.8KB 54 – –

What is really happening here?

Theory can help!

Page 272: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A bit more on complexity classes...

Complexity can be measured by using counting-complexity classes

NP #P

Sat: is a propositional CountSat: how many assignmentsformula satisfiable? satisfy a propositional formula?

Page 273: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A bit more on complexity classes...

Complexity can be measured by using counting-complexity classes

NP #P

Sat: is a propositional CountSat: how many assignmentsformula satisfiable? satisfy a propositional formula?

Formally

A function f (·) on strings is in #P if there exists apolynomial-time non-deterministic TM M such that

f (x) = number of accepting computations of M with input x

Page 274: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A bit more on complexity classes...

Complexity can be measured by using counting-complexity classes

NP #P

Sat: is a propositional CountSat: how many assignmentsformula satisfiable? satisfy a propositional formula?

Formally

A function f (·) on strings is in #P if there exists apolynomial-time non-deterministic TM M such that

f (x) = number of accepting computations of M with input x

◮ CountSat is #P-complete

Page 275: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Counting problem for property paths

CountW3C

Input: RDF graph G

Property path triple { :a path :b }

Output: Count the number of solutions of { :a path :b } over G(according to the semantics proposed by W3C)

Page 276: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

The complexity of property paths is intractable

Theorem (ACP12)

CountW3C is outside #P

Page 277: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

The complexity of property paths is intractable

Theorem (ACP12)

CountW3C is outside #P

CountW3C is hard to solve even if P = NP

Page 278: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A doubly exponential lower bound for counting

◮ Let paths be a property path of the form

(· · ·((:p)*)*)· · ·)*

with s nested stars

Page 279: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A doubly exponential lower bound for counting

◮ Let paths be a property path of the form

(· · ·((:p)*)*)· · ·)*

with s nested stars

◮ Let Kn be a clique with n nodes

Page 280: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A doubly exponential lower bound for counting

◮ Let paths be a property path of the form

(· · ·((:p)*)*)· · ·)*

with s nested stars

◮ Let Kn be a clique with n nodes

◮ Let CountClique(s, n) be the number of solutions of{ :a0 paths :a1 } over Kn

Page 281: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A doubly exponential lower bound for counting

◮ Let paths be a property path of the form

(· · ·((:p)*)*)· · ·)*

with s nested stars

◮ Let Kn be a clique with n nodes

◮ Let CountClique(s, n) be the number of solutions of{ :a0 paths :a1 } over Kn

Lemma (ACP12)

CountClique(s , n) ≥ (n − 2)!(n−1)s−1

Page 282: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

A doubly exponential lower bound for counting

◮ Let paths be a property path of the form

(· · ·((:p)*)*)· · ·)*

with s nested stars

◮ Let Kn be a clique with n nodes

◮ Let CountClique(s, n) be the number of solutions of{ :a0 paths :a1 } over Kn

Lemma (ACP12)

CountClique(s , n) ≥ (n − 2)!(n−1)s−1

In [ACP12]: A recursive formula for calculating CountClique(s , n)

Page 283: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 13 64 3055 418,5766 –7 –8 –

Page 284: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 64 3055 418,5766 –7 –8 –

Page 285: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 3055 418,5766 –7 –8 –

Page 286: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,5766 –7 –8 –

Page 287: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,576 X

6 –7 –8 –

Page 288: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,576 X

6 – ← 28× 109

7 –8 –

Page 289: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,576 X

6 – ← 28× 109

7 – ← 144× 1015

8 –

Page 290: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,576 X

6 – ← 28× 109

7 – ← 144× 1015

8 – ← 79× 1024

Page 291: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,576 X

6 – ← 28× 109

7 – ← 144× 1015

8 – ← 79× 1024

79 Yottabytes for the answer over a file of 379 bytes

Page 292: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

We can explain the experimental results

CountClique(s, n) allows to fill in the blanks

{ :a0 ((:p)*)* :a1 }

n # Sol.2 1 X

3 6 X

4 305 X

5 418,576 X

6 – ← 28× 109

7 – ← 144× 1015

8 – ← 79× 1024

79 Yottabytes for the answer over a file of 379 bytes

1 Yottabyte > the estimated capacity of all digital storage in the world

Page 293: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Data complexity of property path is still intractable

Common assumption in Databases:

◮ queries are much smaller than data sources

Page 294: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Data complexity of property path is still intractable

Common assumption in Databases:

◮ queries are much smaller than data sources

Data complexity

◮ measure the complexity considering the query fixed

Page 295: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Data complexity of property path is still intractable

Common assumption in Databases:

◮ queries are much smaller than data sources

Data complexity

◮ measure the complexity considering the query fixed

◮ Data complexity of SQL, XPath, SPARQL 1.0, etc.are all polynomial

Page 296: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Data complexity of property path is still intractable

Common assumption in Databases:

◮ queries are much smaller than data sources

Data complexity

◮ measure the complexity considering the query fixed

◮ Data complexity of SQL, XPath, SPARQL 1.0, etc.are all polynomial

Theorem (ACP12)

Data complexity of CountW3C is #P-complete

Page 297: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Data complexity of property path is still intractable

Common assumption in Databases:

◮ queries are much smaller than data sources

Data complexity

◮ measure the complexity considering the query fixed

◮ Data complexity of SQL, XPath, SPARQL 1.0, etc.are all polynomial

Theorem (ACP12)

Data complexity of CountW3C is #P-complete

Corollary

SPARQL 1.1 query evaluation (as of January 2012)is intractable in Data Complexity

Page 298: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Existential semantics: a possible alternative

Possible solution

Do not count

Just check whether there exists a pathsatisfying the property path expression

Page 299: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Existential semantics: a possible alternative

Possible solution

Do not count

Just check whether there exists a pathsatisfying the property path expression

Years of experiences (theory and practice) in:

◮ Graph Databases

◮ XML

◮ SPARQL 1.0 (PSPARQL, Gleen)

+ equivalent regular expressions giving equivalent results

Page 300: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Existential semantics: decision problems

Input: RDF graph G

Property path triple { :a path :b }

ExistsPath

Question: Is there a path from :a to :b in G satisfyingthe regular expression path?

ExistsW3C

Question: Is the number of solutions of { :a path :b } over Ggreater than 0 (according to W3C semantics)?

Page 301: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

Page 302: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

1. consider G as an NFA with :a initial state and :b final state

Page 303: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

1. consider G as an NFA with :a initial state and :b final state

2. construct from path an NFA Apath

Page 304: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

1. consider G as an NFA with :a initial state and :b final state

2. construct from path an NFA Apath

3. construct the product automaton G ×Apath:

Page 305: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

1. consider G as an NFA with :a initial state and :b final state

2. construct from path an NFA Apath

3. construct the product automaton G ×Apath:◮ whenever (x , r , y) ∈ G and (p, r , q) is a transition in Apath

add a transition ((x , p), r , (y , q)) to G ×Apath

Page 306: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

1. consider G as an NFA with :a initial state and :b final state

2. construct from path an NFA Apath

3. construct the product automaton G ×Apath:◮ whenever (x , r , y) ∈ G and (p, r , q) is a transition in Apath

add a transition ((x , p), r , (y , q)) to G ×Apath

4. check if we can go from (:a, q0) to (:b, qf ) in G ×Apath

Page 307: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Evaluating existential paths is tractable

Theorem (well-known result)

ExistsPath can be solved in O(|G | × |path|)

Can be proved by using automata theory:

1. consider G as an NFA with :a initial state and :b final state

2. construct from path an NFA Apath

3. construct the product automaton G ×Apath:◮ whenever (x , r , y) ∈ G and (p, r , q) is a transition in Apath

add a transition ((x , p), r , (y , q)) to G ×Apath

4. check if we can go from (:a, q0) to (:b, qf ) in G ×Apath

with q0 initial state of Apath and qf some final state of Apath

Page 308: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Relationship between ExistsPath and ExistsW3C

Theorem (ACP12)

ExistsPath and ExistsW3C are equivalent decision problems

Page 309: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Relationship between ExistsPath and ExistsW3C

Theorem (ACP12)

ExistsPath and ExistsW3C are equivalent decision problems

Corollary (ACP12)

ExistsW3C can be solved in O(|G | × |path|)

Page 310: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

So there are possibilities for optimization

SPARQL includes an operator to eliminate duplicates (DISTINCT)

Page 311: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

So there are possibilities for optimization

SPARQL includes an operator to eliminate duplicates (DISTINCT)

Corollary

Property path queries with SELECT DISTINCT

can be efficiently evaluated

Page 312: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

So there are possibilities for optimization

SPARQL includes an operator to eliminate duplicates (DISTINCT)

Corollary

Property path queries with SELECT DISTINCT

can be efficiently evaluated

And we can also use DISTINCT over general queries

Theorem

SELECT DISTINCT SPARQL 1.1 queries are tractable in Data Complexity

Page 313: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 implementations

do not take advantage of SELECT DISTINCT

SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }

Input ARQ RDFQ Kgram Sesame Psparql Gleen

Page 314: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 implementations

do not take advantage of SELECT DISTINCT

SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }

Input ARQ RDFQ Kgram Sesame Psparql Gleen

Page 315: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 implementations

do not take advantage of SELECT DISTINCT

SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }

Input ARQ RDFQ Kgram Sesame Psparql Gleen9.2KB 2.24 47.31 2.37 – 0.29 1.39

10.9KB 2.60 204.95 6.43 – 0.30 1.3211.4KB 6.88 3222.47 80.73 – 0.30 1.3413.2KB 24.42 – 394.61 – 0.31 1.3814.8KB – – – – 0.33 1.3817.2KB – – – – 0.35 1.4220.5KB – – – – 0.44 1.5025.8KB – – – – 0.45 1.52

Page 316: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

SPARQL 1.1 implementations

do not take advantage of SELECT DISTINCT

SELECT DISTINCT * WHERE { axel:me (foaf:knows)* ?x }

Input ARQ RDFQ Kgram Sesame Psparql Gleen9.2KB 2.24 47.31 2.37 – 0.29 1.39

10.9KB 2.60 204.95 6.43 – 0.30 1.3211.4KB 6.88 3222.47 80.73 – 0.30 1.3413.2KB 24.42 – 394.61 – 0.31 1.3814.8KB – – – – 0.33 1.3817.2KB – – – – 0.35 1.4220.5KB – – – – 0.44 1.5025.8KB – – – – 0.45 1.52

Optimization possibilities can remain hiddenin a complicated specification

Page 317: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

New semantics for property paths (July 2012)

◮ Paths constructed only from / and | should be counted

◮ As soon as * is used, duplicates are eliminated (from that partof the expression)

Page 318: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

New semantics for property paths (July 2012)

◮ Paths constructed only from / and | should be counted

◮ As soon as * is used, duplicates are eliminated (from that partof the expression)

For example, consider path1, path2, and path3 not using * thenwhen evaluating

{ :a path1/(path2)*/path3 :b }

one should (intuitively):

Page 319: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

New semantics for property paths (July 2012)

◮ Paths constructed only from / and | should be counted

◮ As soon as * is used, duplicates are eliminated (from that partof the expression)

For example, consider path1, path2, and path3 not using * thenwhen evaluating

{ :a path1/(path2)*/path3 :b }

one should (intuitively):

◮ consider all the paths from :a to some intermediate :c1satisfying path1

Page 320: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

New semantics for property paths (July 2012)

◮ Paths constructed only from / and | should be counted

◮ As soon as * is used, duplicates are eliminated (from that partof the expression)

For example, consider path1, path2, and path3 not using * thenwhen evaluating

{ :a path1/(path2)*/path3 :b }

one should (intuitively):

◮ consider all the paths from :a to some intermediate :c1satisfying path1

◮ check if there exists :c2 reachable from :c1 following(path2)*

Page 321: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

New semantics for property paths (July 2012)

◮ Paths constructed only from / and | should be counted

◮ As soon as * is used, duplicates are eliminated (from that partof the expression)

For example, consider path1, path2, and path3 not using * thenwhen evaluating

{ :a path1/(path2)*/path3 :b }

one should (intuitively):

◮ consider all the paths from :a to some intermediate :c1satisfying path1

◮ check if there exists :c2 reachable from :c1 following(path2)*

◮ consider all the paths from :c2 to :b satisfying path3

Page 322: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

New semantics for property paths (July 2012)

◮ Paths constructed only from / and | should be counted

◮ As soon as * is used, duplicates are eliminated (from that partof the expression)

For example, consider path1, path2, and path3 not using * thenwhen evaluating

{ :a path1/(path2)*/path3 :b }

one should (intuitively):

◮ consider all the paths from :a to some intermediate :c1satisfying path1

◮ check if there exists :c2 reachable from :c1 following(path2)*

◮ consider all the paths from :c2 to :b satisfying path3

◮ make the count (to produce copies)

Page 323: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

Page 324: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

Page 325: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

◮ the honest approach: just make the count and outputinfininty in the presence of cycles

Page 326: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

◮ the honest approach: just make the count and outputinfininty in the presence of cycles

◮ count the number of simple paths

Page 327: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

◮ the honest approach: just make the count and outputinfininty in the presence of cycles

◮ count the number of simple paths

◮ does someone from the audience have another in mind?

Page 328: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

◮ the honest approach: just make the count and outputinfininty in the presence of cycles

◮ count the number of simple paths

◮ does someone from the audience have another in mind?

In all the above cases you have to decide what exactly you count:

Page 329: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

◮ the honest approach: just make the count and outputinfininty in the presence of cycles

◮ count the number of simple paths

◮ does someone from the audience have another in mind?

In all the above cases you have to decide what exactly you count:

◮ the number of paths satisfying the expression?

◮ the number of ways that the expr can be satisfied? (W3C)

Page 330: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Is the new semantics the right one?

W3C definitely wants to count paths.Are there more reasonable alternatives?

◮ to have different operators for counting (e.g. . and ||)so the user can decide.

◮ the honest approach: just make the count and outputinfininty in the presence of cycles

◮ count the number of simple paths

◮ does someone from the audience have another in mind?

In all the above cases you have to decide what exactly you count:

◮ the number of paths satisfying the expression?

◮ the number of ways that the expr can be satisfied? (W3C)

(they are not the same! consider for example (a|a) )

Page 331: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Several work to do on navigational queries for graphs!

Other lines of research with open questions:

◮ Nested regular expressions and nSPARQL [PAG09,BPR12]

◮ Queries that can output paths (e.g. ECRPQs [BLHW10])

◮ More complexity results on counting paths [LM12]

What is the right language (and the right semantics)for navigating RDF graphs?

Page 332: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Outline

Basics of SPARQLSyntax and Semantics of SPARQL 1.0What is new in SPARQL 1.1

Federation: SERVICE operatorSyntax and SemanticsEvaluation of SERVICE queries

Navigation: Property PathsNavigating graphs with regular expressionsThe history of paths (in SPARQL 1.1 specification)Evaluation procedures and complexity

Page 333: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Concluding remarks

◮ Federation and Navigation are fundamental features inSPARQL 1.1

◮ They need formalization and (serious) study

Page 334: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Concluding remarks

◮ Federation and Navigation are fundamental features inSPARQL 1.1

◮ They need formalization and (serious) study

◮ Do not runaway from Theory! it can really help (and hashelped) to understand the implications of design decisions

Page 335: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Concluding remarks

◮ Federation and Navigation are fundamental features inSPARQL 1.1

◮ They need formalization and (serious) study

◮ Do not runaway from Theory! it can really help (and hashelped) to understand the implications of design decisions

Big challenge:

◮ Can we integrate everything to effectively query Linked Data?

◮ Is SPARQL 1.1 the ultimate query language for Linked Data?

Page 336: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Concluding remarks

◮ Federation and Navigation are fundamental features inSPARQL 1.1

◮ They need formalization and (serious) study

◮ Do not runaway from Theory! it can really help (and hashelped) to understand the implications of design decisions

Big challenge:

◮ Can we integrate everything to effectively query Linked Data?

◮ Is SPARQL 1.1 the ultimate query language for Linked Data?

A lot of work to do, so lets start! and I’m happy to collaborate! :-)

Page 337: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

Federation and Navigation in SPARQL 1.1

Jorge Perez

Assistant ProfessorDepartment of Computer Science

Universidad de Chile

Page 338: Universidad de Chilejperez/talks/rw2012.pdf · An example of an RDF graph to query: DBLP inAMW:SarmaUW09 :JeffreyD.Ullman:AnishDasSarma:JenniferWidom inAMW:2009 "Schema Design for

References

ACP12 Counting Beyond a Yottabyte ..., WWW 2012

AG08 The Expressive Power of SPARQL, ISWC 2008

BAC11 Sem & Opt of SPARQL 1.1 Federation Extensions, ISWC 2011

BLHMW10 Expressive Query Languages for Path Queries, PODS 2010

BPR12 Relative Expressiveness of Nested Regular Expressions, AMW 2012

LPSS12 Static Analysis and Optimization of SemWeb Queries, PODS 2012

LM12 Complexity of Evaluating Path Expressions in SPARQL, PODS 2012

PAG06-09 Semantics and Complexity of SPARQL, ISWC 2006, TODS 2009