Transcript
Page 1: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Datalog

– Another query language– cleaner– closer to a “logic” notation, prolog– more convenient for analysis– can express queries that are not expressible

in relational algebra or SQL (recursion).– No grouping and aggregation, bags,

orderby.

Page 2: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Predicates and Atoms- relations are represented by predicates- tuples are represented by atoms.

Purchase( “joe”, “bob”, “Nike Town”, “Nike Air”)

- arithmetic comparison atoms:

X < 100, X+Y+5 > Z/2

- negated atoms:

NOT Product(“Brooklyn Bridge”, $100, “Microsoft”)

Page 3: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Datalog Rules and QueriesA datalog rule has the following form:

head :- atom1, atom2, …., atom,…

Examples: PerformingComp(name) :- Company(name,sp,c), sp > $50

AmericanProduct(prod) :- Product(prod,pr,cat,mak), Company(mak, sp,“USA”)

All the variables in the head must appear in the body.A single rule can express exactly select-from-where queries.

Page 4: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

The Meaning of Datalog Rules

AmericanProduct(prod) :- Product(prod,pr,cat,mak), Company(mak, sp,“USA”)

Consider every assignment from the variables in the bodyto the constants in the database.

If each of the atoms in the body is in the database, then the tuple for the head is in the resulting.

Page 5: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

More ExamplesCREATE VIEW Seattle-view AS

SELECT buyer, seller, product, store

FROM Person, Purchase

WHERE Person.city = “Seattle” AND

Person.per-name = Purchase.buyer

SeattleView(buyer,seller,product,store) :-

Person(buyer, “Seattle”, phone),

Purchase(buyer, seller, product, store).

Page 6: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

More Examples (negation, union)SeattleView(buyer,seller,product,store) :-

Person(buyer, “Seattle”, phone),

Purchase(buyer, seller, product, store)

not Purchase(buyer, seller, product, “The Bon”)

Q5(buyer) :- Purchase(buyer, “Joe”, prod, store)

Q5(buyer) :- Purchase(buyer, seller, store, prod),

Product(prod, price, cat, maker)

Company(maker, sp, country),

sp > 50.

Page 7: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Rule Safety

Every variable that appears anywhere in the query must appear also in a relational, non-negatedatom in the query.

Q(X,Y,Z) :- R1(X,Y) & X < Z not safe

Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z) not safe

Page 8: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Defining ViewsSeattleView(buyer,seller,product,store) :-

Person(buyer, “Seattle”, phone),

Purchase(buyer, seller, product, store)

not Purchase(buyer, seller, product, “The Bon”)

Q6(buyer) :- SeattleView(buyer, “Joe”, prod, store)

Q6(buyer) :- SeattleView(buyer, seller, store, prod),

Product(prod, price, cat, maker)

Company(maker, sp, country),

sp > 50.

Page 9: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

From Relational Algebra to Datalog

We can translate any relational algebra operation to datalog:

- projection

- selection

- union

- intersection

- join

Page 10: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Exercises

Product ( name, price, category, maker)Purchase (buyer, seller, store, product)Company (name, stock price, country)Person( name, phone number, city)

Ex #1: Find people who bought telephony products.Ex #2: Find names of people who bought American productsEx #3: Find names of people who bought American products and did not buy French productsEx #4: Find names of people who bought American products and they live in Seattle.Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock prices is more than $50.

Page 11: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Transitive ClosureSuppose we are representing a graph by a relation Edge(X,Y):

Edge(a,b), Edge (a,c), Edge(b,d), Edge(c,d), Edge(d,e)

a

b

c

d e

How can I express the query: Find all nodes reachable from a.

Page 12: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Only in Datalog

Recursive queries:

Path( X, Y ) :- Edge( X, Y )

Path( X, Y ) :- Path( X, Z ), Path( Z, Y ).

A query is recursive if there is a cycle in the dependency graph of the predicates.

Page 13: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Evaluating Recursive QueriesPath( X, Y ) :- Edge( X, Y )Path( X, Y ) :- Path( X, Z ), Path( Z, Y ).

Semantics: evaluate the rules until a fixed point:

Iteration #0: Edge: {(a,b), (a,c), (b,d), (c,d), (d,e)} Path: {}Iteration #1: Path: {(a,b), (a,c), (b,d), (c,d), (d,e)}

Iteration #2: Path gets the new tuples: (a,d), (b,e), (c,e)Iteration #3: Path gets the new tuple: (a,e)Iteration #4: Nothing changes -> We stop.Note: number of iterations depends on the data. Cannot be anticipated by only looking at the query!

Page 14: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Another Recursive Query

• Flight(x,y) : there is a flight from city x to y.• Hub(x): x is a hub.• Query: find pairs of cities for which there is a

flight that goes through some hub.

Fl(x,y) :- Flight(x,y)

Fl(x,y) :- Flight(x,z), Fl(z,y)

Answer(x,y) :- Fl(x,z), Hub(z), Fl(z,y).

Page 15: Datalog –Another query language –cleaner – closer to a “logic” notation, prolog – more convenient for analysis – can express queries that are not expressible

Deductive Databases• General idea: some relations are stored (extensional),

others are defined by datalog queries (intensional).

• Many research projects (MCC, Stanford, Wisconsin) [Great Ph.D theses!]

• SQL3 realized that recursion is useful, and added linear recursion.

• Hard problem: optimizing datalog performance.

• Ideas from deductive databases made it into the mainstream.


Top Related