![Page 1: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/1.jpg)
CSE544Data Management
Lectures 12Advanced Query Processing
CSE 544 - Winter 2020 1
![Page 2: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/2.jpg)
Announcements
• Project Milestone due on Friday
• Homework 4 posted; due next Friday
• There will be a short Homework 5, on transactions
2
![Page 3: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/3.jpg)
Quick Recap
• Name 3 join processing algorithms
3
![Page 4: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/4.jpg)
Outline
Algorithms for multi-joins
• AGM formula for maximum output size
• Generic-join algorithm matching that formula
4
![Page 5: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/5.jpg)
Multi-join
• select * from R, S, T, … where …• Standard approach:
– Compute one join at a time– Optimizer chooses an “optimal” join order
• Issues:– Cardinality estimation is hard– Even “optimal” plan may be suboptimal
5
![Page 6: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/6.jpg)
Plans Are Suboptimal
Because intermediate results are much larger than the final query answer
6
![Page 7: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/7.jpg)
Example
select * -- natural joinfrom R, S, Twhere R.Y = S.Y and S.Z = T.Z and T.X = R.X
⨝
T(Z,X)
R(X,Y)
⨝
S(Y,Z)
Query plan
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
[Ngo’2013]
![Page 8: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/8.jpg)
Example
select * -- natural joinfrom R, S, Twhere R.Y = S.Y and S.Z = T.Z and T.X = R.X
⨝
T(Z,X)
R(X,Y)
⨝
S(Y,Z)
Query planR:
X Y
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
N
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
[Ngo’2013]
![Page 9: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/9.jpg)
Example
select * -- natural joinfrom R, S, Twhere R.Y = S.Y and S.Z = T.Z and T.X = R.X
⨝
T(Z,X)
R(X,Y)
⨝
S(Y,Z)
Query planR:
X Y
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
S: (same as R)
Y Z
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
T: (same as R)
Z X
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
N
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
[Ngo’2013]
![Page 10: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/10.jpg)
Example
select * -- natural joinfrom R, S, Twhere R.Y = S.Y and S.Z = T.Z and T.X = R.X
⨝
T(Z,X)
R(X,Y)
⨝
S(Y,Z)
Query plan
Θ(N2) tuples
N
0 tuples
Q(X,Y,Z) = R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)
R:
X Y
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
S: (same as R)
Y Z
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
T: (same as R)
Z X
0 1
0 2
0 3
... ...
0 N/2
1 0
2 0
... ...
N/2 0
[Ngo’2013]
![Page 11: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/11.jpg)
Optimal Algorithm
To define “optimal” we need to answer two questions:
Q1: How large is the output of a query?
Q2: How can we compute it in time no larger than the largest output?
11
![Page 12: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/12.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ⨝ S(Y,Z), |R|, |S| ≤ N• No other info: |Output| ≤ N2
• S.Y is a key: |Output| ≤ N• S.Y has degree ≤ d: |Output| ≤ d×N
E.g. R(X,Y) ⨝ S(Y,Z) ⨝ T(Z,X)|Output| ≤ N3/2
12
![Page 13: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/13.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Output| ≤ N2
• S.Y is a key: |Output| ≤ N• S.Y has degree ≤ d: |Output| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
13
![Page 14: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/14.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Output| ≤ N• S.Y has degree ≤ d: |Output| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
14
![Page 15: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/15.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Output| ≤ N• S.Y has degree ≤ d: |Output| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
15
![Page 16: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/16.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Q(D)| ≤ N• S.Y has degree ≤ d: |Output| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
16
![Page 17: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/17.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Q(D)| ≤ N• S.Y has degree ≤ d: |Output| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
17
![Page 18: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/18.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Q(D)| ≤ N• S.Y has degree ≤ d: |Q(D)| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
18
![Page 19: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/19.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Q(D)| ≤ N• S.Y has degree ≤ d: |Q(D)| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)|Output| ≤ N3/2
19
![Page 20: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/20.jpg)
Worst-Case OptimalityFix input statistics for D• Runtime = O(maxD satisfies stats(|Q(D)|))
E.g. R(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Q(D)| ≤ N• S.Y has degree ≤ d: |Q(D)| ≤ d×N
E.g. R(X,Y) ∧ S(Y,Z) ∧ T(Z,X)No other info: |Q(D)| ≤ N3/2
20
![Page 21: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/21.jpg)
The Two Questions
Q1: Given statistics, what is max(|Q(D)|)?
Q2: How can we compute Q in timeO(max(|Q(D)|))?
21
![Page 22: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/22.jpg)
Simple Fact #1
• Consider any query:
• Its output size is no larger than the product of all cardinalities:
22
Q(X1, ..., Xk) = R1(Vars1)∧...∧Rm(Varsm)
|Q| ≤ |R1| ×… ×|Rm| Why?
![Page 23: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/23.jpg)
• An undirected graph G = (V, E) where each edge e ∈ E is a set of two nodes
Graphs and Hypergraphs
23
ab c
d
![Page 24: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/24.jpg)
• An undirected graph G = (V, E) where each edge e ∈ E is a set of two nodes
• A hypergraph is G = (V, E) where each edge is some set (of 1 or 2 or >2 nodes)
Graphs and Hypergraphs
24
ab c
d
e
ab c
d
![Page 25: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/25.jpg)
Conjunctive Queries are Hypergraphs
Q(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)
Q(x,y,z) = A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
25
x
y
z
x y
zu
![Page 26: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/26.jpg)
Edge Cover
• An edge cover of a (hyper)graph is a subset of edges that contain all the vertices
x
y
z
![Page 27: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/27.jpg)
Edge Cover
• An edge cover of a (hyper)graph is a subset of edges that contain all the vertices
x
y
z x
y
z
![Page 28: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/28.jpg)
Edge Cover
• An edge cover of a (hyper)graph is a subset of edges that contain all the vertices
x
y
z
x y
zu
x
y
z
![Page 29: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/29.jpg)
Edge Cover
• An edge cover of a (hyper)graph is a subset of edges that contain all the vertices
x
y
z
x y
zu
x
y
z
x y
zu
![Page 30: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/30.jpg)
Simple Fact #2
• Consider any query:
• Let R%&, R%(, … , R%* be an edge cover. Then the output size is no larger than their product:
30
Q(X1, ..., Xk) = R1(Vars1)∧...∧Rm(Varsm)
|Q| ≤ R%+ ×⋯×|R%*| Why?
![Page 31: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/31.jpg)
ExamplesQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)
31
![Page 32: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/32.jpg)
ExamplesQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)• Edge covers:
R(x,y)∧S(y,z) or R(x,y)∧T(z,x) or S(y,z)∧T(z,x)
32
![Page 33: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/33.jpg)
ExamplesQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)• Edge covers:
R(x,y)∧S(y,z) or R(x,y)∧T(z,x) or S(y,z)∧T(z,x)
33
|Q| ≤ min(|R|×|S|, |R|×|T|, |S|×|T|)
![Page 34: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/34.jpg)
ExamplesQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)• Edge covers:
R(x,y)∧S(y,z) or R(x,y)∧T(z,x) or S(y,z)∧T(z,x)
Q(x,y,z,u) = A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
34
|Q| ≤ min(|R|×|S|, |R|×|T|, |S|×|T|)
![Page 35: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/35.jpg)
ExamplesQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)• Edge covers:
R(x,y)∧S(y,z) or R(x,y)∧T(z,x) or S(y,z)∧T(z,x)
Q(x,y,z,u) = A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)• Edge covers:
A(x,y,z)∧B(x,y,u) or A(x,y,z)∧C(x,z,u) or …
35
|Q| ≤ min(|R|×|S|, |R|×|T|, |S|×|T|)
![Page 36: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/36.jpg)
ExamplesQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x)• Edge covers:
R(x,y)∧S(y,z) or R(x,y)∧T(z,x) or S(y,z)∧T(z,x)
Q(x,y,z,u) = A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)• Edge covers:
A(x,y,z)∧B(x,y,u) or A(x,y,z)∧C(x,z,u) or …
36
|Q| ≤ min(|R|×|S|, |R|×|T|, |S|×|T|)
|Q| | ≤ min(|A|×|B|, |A|×|C|, …)
![Page 37: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/37.jpg)
Fractional Edge Cover
• A fractional edge cover of a (hyper)graph is a set of non-negative numbers we, one for each edge e, such that, for every vertex v: ∑e: v∈e we ≥ 1
![Page 38: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/38.jpg)
Fractional Edge Cover
• A fractional edge cover of a (hyper)graph is a set of non-negative numbers we, one for each edge e, such that, for every vertex v: ∑e: v∈e we ≥ 1
x
y
z
![Page 39: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/39.jpg)
Fractional Edge Cover
• A fractional edge cover of a (hyper)graph is a set of non-negative numbers we, one for each edge e, such that, for every vertex v: ∑e: v∈e we ≥ 1
x
y
z
½
½½
![Page 40: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/40.jpg)
Fractional Edge Cover
• A fractional edge cover of a (hyper)graph is a set of non-negative numbers we, one for each edge e, such that, for every vertex v: ∑e: v∈e we ≥ 1
x
y
z
x y
zu
½
½½
![Page 41: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/41.jpg)
Fractional Edge Cover
• A fractional edge cover of a (hyper)graph is a set of non-negative numbers we, one for each edge e, such that, for every vertex v: ∑e: v∈e we ≥ 1
x
y
z
x y
zu
½
½½
1/3
1/3
1/3
1/3
![Page 42: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/42.jpg)
Fractional Edge Cover
• A fractional edge cover of a (hyper)graph is a set of non-negative numbers we, one for each edge e, such that, for every vertex v: ∑e: v∈e we ≥ 1
• Fact: every edge cover is also a fractional edge cover. Why?
x
y
z
x y
zu
½
½½
1/3
1/3
1/3
1/3
![Page 43: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/43.jpg)
Not so Simple Fact #3
• Consider any query:
• Let w1, w2, …, wm be a fractional edge cover. Then the output size is no larger than:
43
Q(X1, ..., Xk) = R1(Vars1)∧...∧Rm(Varsm)
|Q| ≤ R+ /&×⋯× R0 /1
![Page 44: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/44.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
½, ½, ½ (|R|×|S|×|T|)½
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
![Page 45: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/45.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
![Page 46: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/46.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
![Page 47: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/47.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
![Page 48: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/48.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
![Page 49: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/49.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
![Page 50: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/50.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
min( … )1,1,0,0 |A|×|B|1,0,1,0 |A|×|C|
… …
![Page 51: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/51.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
min( … )1,1,0,0 |A|×|B|1,0,1,0 |A|×|C|
… …
R(x,y)∧S(y,z)∧T(z,u)∧K(u,v)
![Page 52: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/52.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
min( … )1,1,0,0 |A|×|B|1,0,1,0 |A|×|C|
… …
R(x,y)∧S(y,z)∧T(z,u)∧K(u,v)1,0,1,1 |R|×|T|×|K|1,1,0,1 |R|×|S|×|K|
![Page 53: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/53.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
min( … )1,1,0,0 |A|×|B|1,0,1,0 |A|×|C|
… …
R(x,y)∧S(y,z)∧T(z,u)∧K(u,v)1,0,1,1 |R|×|T|×|K|1,1,0,1 |R|×|S|×|K|
1, ½, ½, 1 (no need; why?)
![Page 54: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/54.jpg)
ExamplesQuery w1, w2, …, wm |R1|w1 ×… ×|Rm|wm |Q| ≤ …
R(x,y)∧S(y,z) 1,1 |R|×|S| ≤|R|×|S|
R(x,y)∧S(y,z)∧T(z,x)
½, ½, ½ (|R|×|S|×|T|)½
≤min((|R|×|S|×|T|)½, |R|×|S|, |R|×|T|,
|S|×|T|)1,1,0 |R|×|S|1,0,1 |R|×|T|0,1,1 |S|×|T|
A(x,y,z)∧B(x,y,u)∧C(x,z,u)∧D(y,z,u)
1/3, 1/3, 1/3, 1/3 (|A|×|B|×|C|×|D|)1/3
min( … )1,1,0,0 |A|×|B|1,0,1,0 |A|×|C|
… …
R(x,y)∧S(y,z)∧T(z,u)∧K(u,v)1,0,1,1 |R|×|T|×|K|
min(|R|×|T|×|K|, |R|×|S|×|K|)1,1,0,1 |R|×|S|×|K|
1, ½, ½, 1 (no need; why?)
![Page 55: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/55.jpg)
Upper Bound of a Query
55
Theorem |Q| ≤ min5&,⋯,56
𝑅+ 5&×⋯ × 𝑅8 56
This is called the AGM bound* of Q. It is tight.
Note: it suffices to consider only those fractional edge coversw1, …, wm that are not convex combinations of others
We will prove tightness on a special case.
But first, let’s discuss an algorithm for computing Q with this runtime
*Atserias, Grohe, Marx introduced this bound
![Page 56: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/56.jpg)
Generic Join – Overview• Choose a variable order
• Sort every relation Ri according to this order: time is O(|Ri| log |Ri|) = Õ(|Ri| )
• Generic join assumes relations are sorted;it computes Q in time Õ(AGM(Q))
• “Worst case optimal”
56
AGM(Q) = min5&,⋯,56
𝑅+ 5&×⋯ × 𝑅8 56
![Page 57: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/57.jpg)
Generic Join – The IntersectionIntersection is the main building block of G.J.
Q(x) = R(x)∧S(x)• Discuss merge-join in class – what is runtime?
57
![Page 58: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/58.jpg)
Generic Join – The IntersectionIntersection is the main building block of G.J.
Q(x) = R(x)∧S(x)• Discuss merge-join in class – what is runtime?
• Edge covers of Q: 1,0 and 0,1; |Q| ≤ min(|R|, |S|)
58
![Page 59: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/59.jpg)
Generic Join – The IntersectionIntersection is the main building block of G.J.
Q(x) = R(x)∧S(x)• Discuss merge-join in class – what is runtime?
• Edge covers of Q: 1,0 and 0,1; |Q| ≤ min(|R|, |S|)• Discuss improved merge-join in class
Runtime: Õ(min(|R|, |S|))
59
![Page 60: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/60.jpg)
Generic Join Algorithm
60
Let x be the first variableLet Ri1, Ri2, … be all relations containing xCompute D = 𝛱x(Ri1) ∩ 𝛱x(Ri2) ∩ …for every value v ∈ D do:
Compute Q,where Ri1, Ri2, … are restricted to x = v
needs tobe done in timeÕ(minj Πx(Rj))
![Page 61: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/61.jpg)
Generic Join ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),Assume |R|=|S|=|T|=N, then:
A = Πx(R(x,y)) ∩ Πx(T(z,x))for a in A do
/* compute Q(a,y,z) = R(a,y)∧S(y,z)∧T(z,a) */B = Πy(R(a,y)) ∩ Πy(S(y,z)) for b in B do/* compute Q(a,b,z) = R(a,b)∧S(b,z)∧T(z,a) */
C = Πz(S(b,z)) ∩ Πz(T(z,a))for c in C do
output (a,b,c)
61
|Q| ≤ N3/2
![Page 62: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/62.jpg)
Generic Join ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),Assume |R|=|S|=|T|=N, then:
A = Πx(R(x,y)) ∩ Πx(T(z,x))for a in A do
/* compute Q(a,y,z) = R(a,y)∧S(y,z)∧T(z,a) */B = Πy(R(a,y)) ∩ Πy(S(y,z)) for b in B do/* compute Q(a,b,z) = R(a,b)∧S(b,z)∧T(z,a) */
C = Πz(S(b,z)) ∩ Πz(T(z,a))for c in C do
output (a,b,c)
62
|Q| ≤ N3/2
![Page 63: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/63.jpg)
Generic Join ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),Assume |R|=|S|=|T|=N, then:
A = Πx(R(x,y)) ∩ Πx(T(z,x))for a in A do
/* compute Q(a,y,z) = R(a,y)∧S(y,z)∧T(z,a) */B = Πy(R(a,y)) ∩ Πy(S(y,z)) for b in B do/* compute Q(a,b,z) = R(a,b)∧S(b,z)∧T(z,a) */
C = Πz(S(b,z)) ∩ Πz(T(z,a))for c in C do
output (a,b,c)
63
|Q| ≤ N3/2
![Page 64: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/64.jpg)
Generic Join ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),Assume |R|=|S|=|T|=N, then:
A = Πx(R(x,y)) ∩ Πx(T(z,x))for a in A do
/* compute Q(a,y,z) = R(a,y)∧S(y,z)∧T(z,a) */B = Πy(R(a,y)) ∩ Πy(S(y,z)) for b in B do/* compute Q(a,b,z) = R(a,b)∧S(b,z)∧T(z,a) */
C = Πz(S(b,z)) ∩ Πz(T(z,a))for c in C do
output (a,b,c)
64
|Q| ≤ N3/2
![Page 65: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/65.jpg)
Generic Join ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),Assume |R|=|S|=|T|=N, then:
A = Πx(R(x,y)) ∩ Πx(T(z,x))for a in A do
/* compute Q(a,y,z) = R(a,y)∧S(y,z)∧T(z,a) */B = Πy(R(a,y)) ∩ Πy(S(y,z)) for b in B do/* compute Q(a,b,z) = R(a,b)∧S(b,z)∧T(z,a) */
C = Πz(S(b,z)) ∩ Πz(T(z,a))for c in C do
output (a,b,c)
65
Runtime:Õ(N3/2)
|Q| ≤ N3/2
![Page 66: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/66.jpg)
Discussion• All relations need to be presorted, or
indexed
• Runtime is guaranteed to be worst-case optimal, no matter what variable order we choose
• In practice, the variable order does matter;in class: discuss R(x,y)∧S(y,z)
66
![Page 67: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/67.jpg)
Comparison to Naïve Nested LoopNaïve nested loop:
// tuple at a time:For t1 in R1 dofor t2 in R2 do…
// value at a time:For x in Domain doFor y in Domain do
For z in Domain do...
Generic-join
A = ∩ domains for xFor x in A do
B = ∩ domains for yFor y in B do
C = ∩ domains for zFor z in C do...
67
![Page 68: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/68.jpg)
Comparison to Naïve Nested LoopNaïve nested loop:
// tuple at a time:For t1 in R1 dofor t2 in R2 do…
// value at a time:For x in Domain doFor y in Domain do
For z in Domain do...
Generic-join
A = ∩ domains for xFor x in A do
B = ∩ domains for yFor y in B do
C = ∩ domains for zFor z in C do...
68
![Page 69: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/69.jpg)
Comparison to Naïve Nested LoopNaïve nested loop:
// tuple at a time:For t1 in R1 dofor t2 in R2 do…
// value at a time:For x in Domain doFor y in Domain do
For z in Domain do...
Generic-join
A = ∩ domains for xFor x in A do
B = ∩ domains for yFor y in B do
C = ∩ domains for zFor z in C do...
69
![Page 70: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/70.jpg)
Tightness
• There exists instances R1, R2, … such that the size of the query’s output is AGM(Q)
• Proof is simple and instructive; we will show for special case |R1| = … = |Rm| = N
• In this case AGM(Q) = Nmin(w1+…+wm)
70
![Page 71: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/71.jpg)
Fractional Edge Covering Number
• The fractional edge covering number of a hypergraph is 𝜌* = min ∑e we , where the minimum is over all fractional edge covers of the hypergraph.
71
Fact Assume |R1| = … = |Rm| = N. Then AGM(Q)= N𝜌*
Why?
![Page 72: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/72.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
![Page 73: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/73.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1Fact For any v, w: ∑x vx ≤ ∑e we
![Page 74: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/74.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 75: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/75.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 76: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/76.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
𝜌* = 1
1 1
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 77: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/77.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
01 1
𝜌* = 1
1 1
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 78: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/78.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
01 1
𝜌* = 1
1 1
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 79: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/79.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
𝜌* = 3/2
01 1
𝜌* = 1
1 1
½
½
½
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 80: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/80.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
𝜌* = 3/2½
½½
01 1
𝜌* = 1
1 1
½
½
½
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 81: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/81.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
𝜌* = 3/2½
½½
01 1
𝜌* = 1
1 1
½
½
½
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 82: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/82.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
𝜌* = 3/2½
½½
01 1
𝜌* = 1 𝜌* = 3
1 1
½
½
½
1 110
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 83: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/83.jpg)
Fractional Vertex Packing
• A fractional vertex packing of a (hyper)graph is a set of non-negative numbers vx, one for each node x, such that, for every edge e: ∑x: x∈e vx ≤ 1
𝜌* = 3/2½
½½
01 1
𝜌* = 1 𝜌* = 3
1 0 1 0 1
1 1
½
½
½
1 110
Theorem maxv ∑x vx = 𝜌* = minw ∑e we
Fact For any v, w: ∑x vx ≤ ∑e we
![Page 84: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/84.jpg)
The Bound is Tight
Fact Fix a fractional vertex packing v = 𝑣< <∈=>?@A. Then there exists a database such thatR1| ≤ N, …, |Rm| ≤ N and 𝑄 = 𝑁EF GF
![Page 85: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/85.jpg)
The Bound is Tight
Proof. For every relation Rj with variables 𝑥I&, 𝑥I(, …define the instance 𝑅J = 𝑁GK& × 𝑁GK( ×⋯where [k] = {1,2,…,k}. Then:(a) 𝑅J = 𝑁GK&OGK(O⋯ ≤ 𝑁 (why?)(b) 𝑄 = 𝑁EF GF(why?)
Fact Fix a fractional vertex packing v = 𝑣< <∈=>?@A. Then there exists a database such thatR1| ≤ N, …, |Rm| ≤ N and 𝑄 = 𝑁EF GF
![Page 86: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/86.jpg)
The Bound is Tight
Proof. For every relation Rj with variables 𝑥I&, 𝑥I(, …define the instance 𝑅J = 𝑁GK& × 𝑁GK( ×⋯where [k] = {1,2,…,k}. Then:(a) 𝑅J = 𝑁GK&OGK(O⋯ ≤ 𝑁 (why?)(b) 𝑄 = 𝑁EF GF(why?)
Fact Fix a fractional vertex packing v = 𝑣< <∈=>?@A. Then there exists a database such thatR1| ≤ N, …, |Rm| ≤ N and 𝑄 = 𝑁EF GF
![Page 87: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/87.jpg)
The Bound is Tight
Proof. For every relation Rj with variables 𝑥I&, 𝑥I(, …define the instance 𝑅J = 𝑁GK& × 𝑁GK( ×⋯where [k] = {1,2,…,k}. Then:(a) 𝑅J = 𝑁GK&OGK(O⋯ ≤ 𝑁 (why?)(b) 𝑄 = 𝑁EF GF(why?)
Fact Fix a fractional vertex packing v = 𝑣< <∈=>?@A. Then there exists a database such thatR1| ≤ N, …, |Rm| ≤ N and 𝑄 = 𝑁EF GF
![Page 88: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/88.jpg)
ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),• Assume |R|=|S|=|T|=N.• Optimal vertex packing: vx = vy = vz = 1/2• Define: Dx = [N½] = {1, 2, …, N½}
Dy = [N½]Dz = [N½]
• Define R = Dx×Dy, S = Dy×Dz, T = Dz×Dx. • Then |R| = |S| = |T| = N,
Q = Dx×Dy× Dz and |Q| = N3/2
88
![Page 89: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/89.jpg)
ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),• Assume |R|=|S|=|T|=N.• Optimal vertex packing: vx = vy = vz = 1/2• Define: Dx = [N½] = {1, 2, …, N½}
Dy = [N½]Dz = [N½]
• Define R = Dx×Dy, S = Dy×Dz, T = Dz×Dx. • Then |R| = |S| = |T| = N,
Q = Dx×Dy× Dz and |Q| = N3/2
89
![Page 90: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/90.jpg)
ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),• Assume |R|=|S|=|T|=N.• Optimal vertex packing: vx = vy = vz = 1/2• Define: Dx = [N½] = {1, 2, …, N½}
Dy = [N½]Dz = [N½]
• Define R = Dx×Dy, S = Dy×Dz, T = Dz×Dx. • Then |R| = |S| = |T| = N,
Q = Dx×Dy× Dz and |Q| = N3/2
90
![Page 91: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/91.jpg)
ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),• Assume |R|=|S|=|T|=N.• Optimal vertex packing: vx = vy = vz = 1/2• Define: Dx = [N½] = {1, 2, …, N½}
Dy = [N½]Dz = [N½]
• Define R = Dx×Dy, S = Dy×Dz, T = Dz×Dx. • Then |R| = |S| = |T| = N,
Q = Dx×Dy× Dz and |Q| = N3/2
91
![Page 92: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/92.jpg)
ExampleQ(x,y,z) = R(x,y)∧S(y,z)∧T(z,x),• Assume |R|=|S|=|T|=N.• Optimal vertex packing: vx = vy = vz = 1/2• Define: Dx = [N½] = {1, 2, …, N½}
Dy = [N½]Dz = [N½]
• Define R = Dx×Dy, S = Dy×Dz, T = Dz×Dx. • Then |R| = |S| = |T| = N,
Q = Dx×Dy× Dz and |Q| = N3/2
92
![Page 93: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/93.jpg)
KeysR(X,Y) ∧ S(Y,Z), |R|, |S| ≤ N• No other info: |Q(D)| ≤ N2
• S.Y is a key: |Q(D)| ≤ N
The Query Expansion method:• If Y is a key in some relation S, then add
all attributes of S relations containing Y• Compute AGM(Qexpanded)
93
![Page 94: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/94.jpg)
ExamplesQ(X,Y,Z) = R(X,Y)∧S(Y,Z)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z),• Edge cover: 1,0• AGM(Qexp) = |R|
Q(X,Y,Z) = R(X,Y)∧S(Y,Z)∧T(Z,X)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z)∧T(Z,X)• Edge covers: 1,0,0 or 0,1,1• AGM(Qexp) = min(|R|, |S|×|T|)
94
![Page 95: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/95.jpg)
ExamplesQ(X,Y,Z) = R(X,Y)∧S(Y,Z)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z),• Edge cover: 1,0• AGM(Qexp) = |R|
Q(X,Y,Z) = R(X,Y)∧S(Y,Z)∧T(Z,X)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z)∧T(Z,X)• Edge covers: 1,0,0 or 0,1,1• AGM(Qexp) = min(|R|, |S|×|T|)
95
![Page 96: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/96.jpg)
ExamplesQ(X,Y,Z) = R(X,Y)∧S(Y,Z)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z),• Edge cover: 1,0• AGM(Qexp) = |R|
Q(X,Y,Z) = R(X,Y)∧S(Y,Z)∧T(Z,X)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z)∧T(Z,X)• Edge covers: 1,0,0 or 0,1,1• AGM(Qexp) = min(|R|, |S|×|T|)
96
![Page 97: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/97.jpg)
ExamplesQ(X,Y,Z) = R(X,Y)∧S(Y,Z)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z),• Edge cover: 1,0• AGM(Qexp) = |R|
Q(X,Y,Z) = R(X,Y)∧S(Y,Z)∧T(Z,X)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z)∧T(Z,X)• Edge covers: 1,0,0 or 0,1,1• AGM(Qexp) = min(|R|, |S|×|T|)
97
![Page 98: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/98.jpg)
ExamplesQ(X,Y,Z) = R(X,Y)∧S(Y,Z)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z),• Edge cover: 1,0• AGM(Qexp) = |R|
Q(X,Y,Z) = R(X,Y)∧S(Y,Z)∧T(Z,X)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z)∧T(Z,X)• Edge covers: 1,0,0 or 0,1,1• AGM(Qexp) = min(|R|, |S|×|T|)
98
![Page 99: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/99.jpg)
ExamplesQ(X,Y,Z) = R(X,Y)∧S(Y,Z)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z),• Edge cover: 1,0• AGM(Qexp) = |R|
Q(X,Y,Z) = R(X,Y)∧S(Y,Z)∧T(Z,X)• Qexp(X,Y,Z) = R(X,Y,Z)∧S(Y,Z)∧T(Z,X)• Edge covers: 1,0,0 or 0,1,1• AGM(Qexp) = min(|R|, |S|×|T|)
99
![Page 100: CSE544 Data Management...Announcements •Project Milestone due on Friday •Homework 4 posted; due next Friday •There will be a short Homework 5, on transactions 2](https://reader035.vdocuments.us/reader035/viewer/2022071110/5fe567ace508e354ac0012ec/html5/thumbnails/100.jpg)
SummaryGiven cardinalities of all input tables:• AGM bound gives upper bound on query size• GJ computes the query in this time
Generic Join:• A nested loop algorithm• No longer one-join-at-a-time• Theoretical optimality means it will be
efficient for very expensive queries;less so for cheaper queries
100