toon calders [email protected] deductive databases

75
TOON CALDERS [email protected] Deductive databases

Upload: skyler-brundage

Post on 16-Dec-2015

225 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

TOON [email protected]

Deductive databases

Page 2: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Important Information

This material was developed by

Dr. Toon Calders and prof. Dr. Jan Paredaensat University of Technology, Eindhoven.The original document can be accessed at the following

address:http://wwwis.win.tue.nl/~tcalders/teaching/advancedDB/

All changes and adaptations are my own responsability, Rogelio Dávila Pérez

Page 3: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Motivation: Deductive DB

Motivation is two-fold: add deductive capabilities to databases; the

database contains: facts (extensional relations) rules to generate derived facts (intensional relations)

Database is knowledge base Extend the querying

datalog allows for recursion

Page 4: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Motivation: Deductive DB

Datalog as engine of deductive databases similarities with Prolog has facts and rules rules define -possibly recursive- views

Semantics not always clear safety negation recursion

Page 5: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog programRelational algebra = safe Datalog with

negation and without recursionOptimization techniquesConclusions

Page 6: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Syntax of Datalog

Datalog query/program: facts traditional relational tables rules define intensional views

Rules if-then rules can contain recursion can contain negations

Semantics of program can be ambiguous

Page 7: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Syntax of Datalog

Example

father(X,Y) :- person(X,m), parent(X,Y).

grandson(X,Y) :- parent(Y,Z), parent(Z,X), person(X,m).

hbrothers(X,Y) :- person(X,m), person(Y,m), parent(Z,X), parent(Z,Y).

person parentName Sex Parent Childingrid f ingrid toonalex m alex toonrita f rita anjan m jan antoon m an berndan f an mattijsbernd m toon berndmattijs m toon mattijs

Page 8: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Syntax of Datalog

Variables: X, YConstants: m, f, rita, …Positive literal: p(t1,…,tn)

p is the name of a relation (EDB or IDB) t1, …, tn constants or variables

Negative literal: not p(t1, …, tn)

Rule: h :- l1, …, ln

h positive literal, l1, …, ln literals

Page 9: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Syntax of Datalog

Variables: X, YConstants: m, f, rita, …Positive literal: p(t1,…,tn)

p is the name of a relation (EDB or IDB) t1, …, tn constants or variables

Negative literal: not p(t1, …, tn)

Rule: h :- l1, …, ln

h positive literal, l1, …, ln literals

In Datalog: Classic logical negation

( In contrast to Prolog’s negation by failure )

Page 10: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Syntax of Datalog

Rules can be recursiveArithmetic operations considered as

special predicates A<B : smaller(A,B) A+B=C : plus(A,B,C)

Page 11: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog program

non-recursive recursive datalog aggregation

Relational algebra = safe Datalog with negation and without recursion

Optimization techniquesConclusions

Page 12: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

Ground instantiation of a rule h :- l1, …, ln : replace every variable in the rule by a constant

Example:father(X,Y) :- person(X,m), parent(X,Y)

instantiation:father(toon,an) :- person(toon,m), parent(toon,an).

Page 13: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

Let I be a set of factsThe body of a rule instantiation R’ is

satisfied by I if: every positive literal in the body of R’ is in I no negative literal in the body of R’ is in I

Example:person(toon,m), parent(toon,an) not satisfied by the facts given before

Page 14: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

Let I be a set of factsR is a rule h :- l1, …, ln

Infer(R,I) = { h’ : h’ :- l1’, …, ln’ is a ground instantiation of R

l1’ … ln’ is satisfied by I }

R={R1, …, Rn}Infer(R,I) = Infer(R1,I) … Infer(Rn,I)

Page 15: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

A rule h :- l1, …, ln is in layer 1: l1, …, ln only involve extensional predicates

A rule h :- l1, …, ln is in layer i for all 0<j<i, it is not in layer j l1, …, ln only involve predicates that are

extensional and in the layers 1, …, i-1

Page 16: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

Let I0 be the facts in a datalog program

Let R1 be the rules at layer 1

…Let Rn be the rules at layer n

I1 = I0 Infer(R1, I0)

I2 = I1 Infer(R2, I1)

…In = In-1 Infer(Rn, In-1)

Page 17: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

Example:

person parentName Sex Parent Childingrid f ingrid toonalex m alex toonrita f rita anjan m jan antoon m an berndan f an mattijsbernd m toon berndmattijs m toon mattijs

father(X,Y) :- person(X,m), parent(X,Y).grandfather(X,Y) :- father(X,Y), parent(Y,Z).hbrothers(X,Y) :- person(X,m), person(Y,m),

parent(Z,X), parent(Z,Y).

Page 18: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

person parentName Sex Parent Childingrid f ingrid toonalex m alex toonrita f rita anjan m jan antoon m an berndan f an mattijsbernd m toon berndmattijs m toon mattijs

father(X,Y) :- person(X,m), parent(X,Y).grandfather(X,Y) :- father(X,Y), parent(Y,Z).hbrothers(X,Y) :- person(X,m), person(Y,m),

parent(Z,X), parent(Z,Y).

Stratum 0Valuation:

Page 19: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

person parentName Sex Parent Childingrid f ingrid toonalex m alex toonrita f rita anjan m jan antoon m an berndan f an mattijsbernd m toon berndmattijs m toon mattijs

father(X,Y) :- person(X,m), parent(X,Y).grandfather(X,Y) :- father(X,Y), parent(Y,Z).hbrothers(X,Y) :- person(X,m), person(Y,m),

parent(Z,X), parent(Z,Y).

Stratum 0 father alex toonjan antoon berndtoon mattijs hbrothersbernd mattijsmattijs berndmattijs mattijsbernd bernd

Stratum 1Valuation:

Page 20: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Non-Recursive Datalog Programs

person parentName Sex Parent Childingrid f ingrid toonalex m alex toonrita f rita anjan m jan antoon m an berndan f an mattijsbernd m toon berndmattijs m toon mattijs

father(X,Y) :- person(X,m), parent(X,Y).grandfather(X,Y) :- father(X,Y), parent(Y,Z).hbrothers(X,Y) :- person(X,m), person(Y,m),

parent(Z,X), parent(Z,Y).

Stratum 0 father grandfather alex toon jan mattijsjan an jan berndtoon bernd alex mattijstoon mattijs alex bernd hbrothersbernd mattijsmattijs berndmattijs mattijsbernd bernd

Stratum 1 Stratum 2 Valuation:

Page 21: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Caveat: Correct Negation

Negation in Datalog Negation in PrologProlog negation (Negation by failure):

not(p(X)) “is true if we fail to prove p(X)”

Datalog negation (Correct negation): not(p(X)) “binds X to a value such that p(X) does

not hold.”

Page 22: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Caveat: Correct Negation

Example:

father(a,b). person(a). person(b).nfather(X) :- person(X), not( father (X,Y) ),

person(Y).

Datalog:? nfather(X) { (a), (b) }Prolog:? nfather(X) { (b) }? person(a), not(father(a,a)), person(a) yes

Page 23: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Caveat: Correct Negation

Prolog: Order of the clauses is important nfather(X) :- person(X), not( father (X,Y) ), person(Y). versus nfather(X) :- person(X), person(Y), not( father (X,Y) ). Order of the rules is important

Datalog: Order not important More declarative

Page 24: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Caveat: Correct Negation

Difference is not fundamental:

Prolog:nfather(X) :- person(X), not( father (X,Y) ).

Datalog:nfather(X) :- person(X), not(father_of_someone(X)).

father_of_someone(X) :- father (X,Y).

Page 25: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Caveat: Correct Negation

Difference is not fundamental: Many systems that claim to implement Datalog,

actually implement negation by failure. Debating on whether or not this is correct is

pointless; both perspectives are useful Check on beforehand how an engine implements

negation …

Throughout the course, in all exercises, in the exam, …, we assume correct negation.

Page 26: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Safety

A rule can make no sense if variables appear in funny ways

Examples:S(x) :- R(y)S(x) :- not R(x)S(x) :- R(y), x<yIn each of these cases the result is infinite

even if the relation R is finite

Page 27: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Safety

Even when not leading to infinite relations, such Datalog Programs can be domain-dependent.

Example:s(a,b). s(a,a). r(a). r(b).t(X) :- not(s(X,Y)), r(X).If domain is {a,b}:

only t(b) holds.If domain is {a,b,c}

not only t(b), but also t(a) holds( Ground instantiation t(a) :- not(s(a,c)), r(a). )

Page 28: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Safety

Therefore, we will only consider rules that are safe.

A rule h :- l1, …, ln is safe if:

every variable in the head of the rule: also in a non-arithmetic positive literale in body

every variable in a negative literal of the body: also in some positive literal of the body

Page 29: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Model-Theoretic Semantics

A model M of a Datalog program is:

An instatiation of all intensional relations in the program

That satisfies all rules in the program If the body of a ground instantiation of a rule holds in

M, also the head must hold

Some models are special …

Page 30: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Model-Theoretic Semantics

father(a,b).person(X) :- father(X,Y).person(Y) :- father(X,Y).

M1: father persona b a

bM2: father person

a b ab a ba a

Page 31: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Model-Theoretic Semantics

A model is minimal if « we cannot remove tuples »

M1: father persona b a

b

M2: father persona b ab a ba a

Minimal

Not Minimal

Page 32: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Model-Theoretic Semantics

For non-recursive, safe datalog programs semantics is well defined

The model = all facts that can be derived from the program

Closed-World Assumption: if a fact cannot be derived from the database, then it is not true

Is a minimal model

Page 33: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Model-Theoretic Semantics

Minimal model is, however, not necessarily unique

Example:r(a).t(X) :- r(X), not s(X).

minimal models: M1 M2

r s t r s ta a a a

Page 34: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog program

non-recursive recursive datalog aggregation

Relational algebra = safe Datalog with negation and without recursion

Optimization techniquesConclusions

Page 35: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

g(a,b). g(b,c). g(a,d).reach(X,X) :- g(X,Y). reach(Y,Y) :- g(X,Y)reach(X,Y) :- g(X,Y).reach(X,Z) :- reach(X,Y), reach(Y,Z).

Fixpoint of a set of rules R, starting with set of facts I:repeat

Old_I := II := I infer(R,I)

until I = Old_I

Always termination (inflationary fixpoint)

Page 36: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

g(a,b). g(b,c). g(a,d).reach(X,X) :- g(X,Y). reach(Y,Y) :- g(X,Y)reach(X,Y) :- g(X,Y).reach(X,Z) :- reach(X,Y), reach(Y,Z).

Step 0: reach = { }Step 1: reach = { (a,a), (b,b), (c,c), (d,d), (a,b), (b,c),

(a,d) }Step 2: reach = {(a,a), (b,b), (c,c), (d,d), (a,b), (b,c), (a,d),

(a,c) }Step 3: reach = {(a,a), (b,b), (c,c), (d,d), (a,b), (b,c), (a,d),

(a,c) } STOP

Page 37: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

Datalog without negation: Always a unique minimal model.

Semantics of recursive datalog with negation is less clear.

Example:T(a).R(X) :- T(X), not S(X).S(X) :- T(X), not R(X).

What about R(a)? S(a)?

Page 38: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

For some classes of Datalog queries with negation still a natural semantics can be defied

Important class: stratified programs T depends on S if some rule with T in the head

contains S or (recursively) some predicate that depends on S, in the body.

Stratified program: If T depends on (not S), then S cannot depend on T or (not T).

Page 39: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

The program:T(a).R(X) :- T(X), not S(X).S(X) :- T(X), not R(X).

is not stratified;

R depends negatively on SS depends negatively on R

R T

S

Page 40: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

g(a,b). g(b,c). g(a,d).reach(X,X) :- g(X,Y).reach(Y,Y) :- g(X,Y).reach(X,Y) :- g(X,Y).reach(X,Z) :- reach(X,Y),

reach(Y,Z).node(X) :- g(X,Y).node(Y) :- g(X,Y).unreach(X,Y) :- node(X), node(Y),

not reach(X,Y).

g

reach node

unreach

Page 41: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

If a program is stratified, the tables in the program can be partitioned into strata: Stratum 0: All database tables. Stratum I: Tables defined in terms of tables in

Stratum I and lower strata. If T depends on (not S), S is in lower stratum than

T.

Page 42: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

g(a,b). g(b,c). g(a,d).reach(X,X) :- g(X,Y).reach(Y,Y) :- g(X,Y).reach(X,Y) :- g(X,Y).reach(X,Z) :- reach(X,Y),

reach(Y,Z).node(X) :- g(X,Y).node(Y) :- g(X,Y).unreach(X,Y) :- node(X), node(Y),

not reach(X,Y).

g

reach node

unreach

0

1

2

Page 43: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

Semantics of a stratified program given by: First, compute the least fixpoint of all

tables in Stratum 1. (Stratum 0 tables are fixed.)

Then, compute the least fixpoint of tables in Stratum 2; then the lfp of tables in Stratum 3, and so on, stratum-by-stratum.

Page 44: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

Fixpoint of a set of rules R, starting with set of facts I:repeat

Old_I := II := I infer(R,I)

until I = Old_I

Fixpoint within one stratum always terminates Due to monotonicity within the strata Only positive dependence between tables in stratum l.

Due to finite program, number of strata isfinite as well

Page 45: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Semantics of Recursive Datalog Programs

Stratum 0: g(a,b). g(b,c). g(a,d).Stratum 1:

node(a), node(b), node(c), node(d),reach(a,a), reach(b,b), reach(c,c), reach(d,d), reach(a,b), reach(b,c), …

Stratum 2:unreach(b,a), unreach(c,a), …

Page 46: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog program

non-recursive recursive datalog aggregation

Relational algebra = safe Datalog with negation and without recursion

Optimization techniquesConclusions

Page 47: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

The < … > notation in the head indicates grouping; the remaining arguments (X, in this example) are the GROUP BY fields.

In order to apply such a rule, must have all of relation g available.

Stratification with respect to use of < … > is similar to negation.

Degree(X, SUM(<Y>)) :- g(X,Y).

Page 48: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

bi(X,Y) :- g(X,Y). gbi(Y,X) :- g(X,Y).Degree(X, SUM(<Y>)) :- bi(X,Y). bi

degree

Page 49: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

bi(X,Y) :- g(X,Y). gbi(Y,X) :- g(X,Y).Degree(X, SUM(<Y>)) :- bi(X,Y). bi

degree

0

1

2

Page 50: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

bi(X,Y) :- g(X,Y). gbi(Y,X) :- g(X,Y).Degree(X, SUM(<Y>)) :- bi(X,Y). bi

degree

Compute stratum by stratumAssume strata 1 k fixed when computing

k+1

0

1

2

Page 51: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

r(a,b). r(a,c). s(a,d).t(X,SUM(<Y>)) :- r(X,Y).r(X,Y) :- t(X,Z), Z=2, s(X,Y).

Page 52: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

r(a,b). r(a,c). s(a,d). tt(X,SUM(<Y>)) :- r(X,Y).r(X,Y) :- t(X,Z), Z=2, s(X,Y). r s

Page 53: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Aggregate Operators

r(a,b). r(a,c). s(a,d). tt(X,SUM(<Y>)) :- r(X,Y).r(X,Y) :- t(X,Z), Z=2, s(X,Y). r s

“a is aggregating over a moving target”Step 1: t(a,2) is addedStep 2: r(a,d)is addedStep 3: t(a,3) added, t(a,2) no longer true …

hence r(a,d) should not have been added

Page 54: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog programRelational algebra = Safe Datalog with

negation and without recursionOptimization techniquesConclusions

Page 55: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

RA = Non-Recursive Datalog

Every operator of RA can be simulated by non-recursive datalog Project on the first attribute of the ternary relation r:

query (A) :–r(A, B, C).

Cartesian product of relations r1 and r2.

query (X1, X2, ..., Xn, Y1, Y1, Y2, ..., Ym ) :–r1 (X1, X2, ..., Xn ), r2 (Y1, Y2, ...,

Ym ). Union of relations r1 and r2.

query (X1, X2, ..., Xn ) :–r1 (X1, X2, ..., Xn ), query (X1, X2, ..., Xn ) :–r2 (X1, X2, ..., Xn ),

Set difference of r1 and r2.

query (X1, X2, ..., Xn ) :–r1(X1, X2, ..., Xn ), not r2 (X1, X2, ..., Xn )

Page 56: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

RA = Non-Recursive Datalog

Every operator of RA can be simulated by non-recursive datalog Result of our construction is always safe, and

equivalent for stratified semantics

$1=$3 (($1R) x R)

query1(A) :- R(A,B).

query2(A,B,C) :- query1(A), R(B,C).

result(A,B,A) :- query2(A,B,A)

Page 57: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

RA = Non-Recursive Datalog

Every rule can be expressed by one RA expression: Translate every atom separately

Negation/arithmetic: use “complement” construction Essential: safety

Combine atoms with Cartesian product Do the joins with a selection Project on the relevant attributes

Strata determine the order of evaluationBecause of no recursion: every rule only

executed once.

Page 58: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

RA = Non-Recursive Datalog

sister(X,Y) :- person(X,f), parent(Z,X), parent(Z,Y), not(X=Y).

person(X,f): $2=‘f’ Personparent(Z,X) and parent(Z,Y): Parentnot(X=Y): complement construction

X comes from parent(Z,X) $2 Parent

Y from parent(Z,Y) $2 Parent

not(X=Y) $1$2 ($2 Parent x $2 Parent)

Page 59: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

RA = Non-Recursive Datalog

sister(X,Y) :- person(X,f), parent(Z,X), parent(Z,Y), not(X=Y).

$1,$6

$1=$4, $3=$5, $1=$7, $6=$8

($2=‘f’ Person x Parent x Parent

x $1$2 ($2 Parent x $2

Parent))

Page 60: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

RA = Non-Recursive Datalog

Hence, the following two are equivalent in expressive power:

Safe Datalog with negation, without recursion or aggregation, under the stratified semantics

Relational Algebra

Every rule separately can be expressed by a relational algebra expression Makes it very suitable for implementation on top

of a relational database

Page 61: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog programRelational algebra = Datalog with

negation and without recursionOptimization techniquesConclusions

Page 62: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Evaluation of Datalog Programs

Running example:root(r). child(r,a). child(r,b). child(a,c).child(a,d). child(c,e). child(d,f). child(b,h).

sg(X,Y) :- root(X),root(Y).sg(X,Y) :- child(X,U), sg(U,V),

child(Y,V).

r

a b

c d

fe

h

Page 63: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Evaluation of Datalog Programs: Issues

Repeated inferences: recursive rules are repeatedly applied in the naïve way; same inferences in several iterations.

Unnecessary inferences: if we just want to find sg of a particular node, say e, computing the fixpoint of the sg program and then selecting tuples with e in the first column is wasteful, in that we compute many irrelevant facts.

Page 64: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Evaluation of Datalog Programs

Running example:Query: ? sg(e,X)

1. (r, r)2. (a,a), (b,b), (a,b), (b,a)3. (c,c), (c,d), (c,h), (d,c), (d,d), …4. (e,e), (f,f), (e,f), (f,e)

r

a b

c d

fe

h

Page 65: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Avoiding Repeated Inferences

Seminaive Fixpoint Evaluation: Avoid repeated inferences: at least one of the body facts generated in the most recent iteration. For each recursive table P, use a table delta_P. Rewrite the program to use the delta tables.

A second evaluation of the rule:r(X,Y) :- s(X), t(Y), u(X,Z), v(Y,Z).

only gives new tuples (X,Y) for ground instantiations in which at least one of the atoms is new.

Page 66: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Avoiding Unnecessary Inferences

Still, in the running example: many unnecessary deductions when query is:

? sg(e,X)

Compare with top-down as in Prolog only facts that are connected to the ultimate goal

are being considered

Page 67: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

The “Prolog Way”

sg(X,Y) :- root(X),root(Y).sg(X,Y) :- child(X,U), sg(U,V), child(Y,V).

? sg(e,X).try: root(e) FAILtry: child(e,U)

U=ctry: sg(c,V)

try: root(e) FAILtry: child(c,U) U=a

r

a b

c d

fe

h

Page 68: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

The “Prolog Way”

sg(X,Y) :- root(X),root(Y).sg(X,Y) :- child(X,U), sg(U,V), child(Y,V).

? sg(e,X).try: root(e) FAILtry: child(e,U)

U=ctry: sg(c,V)

try: root(e) FAILtry: child(c,U) U=a

r

a b

c d

fe

h

Page 69: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

“Magic Sets” Idea

We want to do something similar for Datalog

Idea: Define a “filter” table: computes all relevant values, restricts the computation of sg(e,X).

sg(X,Y) :- m(X), root(X), root(Y).sg(X,Y) :- m(X), child(X,U), sg(U,V),child(Y,V).m(X) :- m(Y), child(Y,X).m(e).

Page 70: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Magic Sets

It is always possible to do this in such a way that bottom-up becomes as efficient as top-down!

Different proposals exist in literature how to introduce the magic filters

Page 71: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Optimization Techniques

Many other techniques exist as well Standard relational indexing techniques

(partly) materializing intensional relations on beforehand Trade-off memory query time performance (See also the OLAP-part for a similar technique)

Different representations for relations BDD (Stanford)

Page 72: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Outline

Syntax of the Datalog languageSemantics of a Datalog programRelational algebra = Datalog with

negation and without recursionOptimization techniquesConclusions

Page 73: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Conclusions

Datalog adds deductive capabilities to databases extensional relations intensional relations

Datalog without recursion, with negation safety requirement stratification equal in power to relational algebra Closed World Assumption

Page 74: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Conclusions

Datalog without Negation Always a unique minimal model

Datalog with negation and recursion semantics not always clear stratified negation

Evaluation of datalog queries: without negation = RA-optimization with recursion:

semi-naive recursion magic sets

Page 75: TOON CALDERS T.CALDERS@TUE.NL Deductive databases

Conclusions

Very nice idea, but …Deductive databases did not make it as

a database paradigm

Yet, many ideas survived recursion in SQL …

And others may re-surface in future. Increasing need for adding meta-information in

databases