Download - Answering Tree Pattern Queries Using Views
![Page 1: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/1.jpg)
Answering Tree Pattern Queries Using Views
Laks V.S. Lakshmanan, Hui (Wendy) Wang, and Zheng (Jessica) Zhao
University of British ColumbiaVancouver, BC
Amazon.com
![Page 2: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/2.jpg)
Outline
Motivation Problems Studied Without schema With schema Recursive schemas Related Work Summary & Future Work
![Page 3: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/3.jpg)
Motivation 1/3
Integration of existing data sources. Local as view (LAV) – one of the well-
known approaches. Each source = a materialized view over
some global database. Answer to query over global DB =
answer to query using (materialized) views.
![Page 4: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/4.jpg)
Motivation 2/3 <Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> … </Trial> <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial>
Source = View “//Trials//Trial” over some DB containing clinical data – trials, their status, patient data, etc.
Consider query Q: //Trials[//Status]//Trial over [unknown] original DB.
How can and should we answer it using above source?
![Page 5: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/5.jpg)
Motivation 3/3<PharmaLab> (1) <Trials @type=“T1”> (2) <Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> … </Trial> </Trials> <Trials @type=“T2”> (13) <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial> </Trials> </PharmaLab>
<Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> … </Trial> <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial>
//Trials//Trial//Trials//Trial?? ??
On
e p
ossib
le o
rig
inal D
B
![Page 6: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/6.jpg)
Motivation 3/3<PharmaLab> (1) <Trials @type=“T1”> (2) <Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> …
</Trial> </Trials> <Trials @type=“T2”> (13) <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial> </Trials> </PharmaLab>
<Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> … </Trial> <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial>
//Trials//Trial//Trials//Trial?? ??
On
e p
ossib
le o
rig
inal D
BQ: //Trials[//Status]//Trial
![Page 7: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/7.jpg)
Motivation 3/3<PharmaLab> (1) <Trials @type=“T1”> (2) <Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> … </Trial> </Trials> <Trials @type=“T2”> (13) <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial> </Trials> </PharmaLab>
<Trial> (3) <Patient> (4) John Doe </Patient> … <Status> (10) Complete </Status> </Trial> <Trial> (11) <Patient> (12) Jen Bloe </Patient> … </Trial> <Trial> (14) <Patient> (15) Mary Moore </Patient> … </Trial>
//Trials//Trial//Trials//Trial
On
e p
ossib
le o
rig
inal D
B
◦ “●◦ “●[//Status]”[//Status]” { (3) }
Contained rewriting
![Page 8: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/8.jpg)
Problems Studied 1/3 Equivalent Rewriting: Given Q and
views V, find an equivalent rewriting of Q using V, i.e., an expression E s.t. V◦E Ξ Q, over all possible input DBs. Appropriate for query optimization.
Contained Rewriting: Given Q and V, find an expression E s.t. V◦E Q overall all possible input DBs, and V◦E is maximal among all such rewritings. Most appropriate for information integration
[Halevy, Lenzerini, Pottinger & Halevy].
![Page 9: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/9.jpg)
Problems Studied 2/3No Schema: Given Q and V, find a
maximally contained rewriting (MCR) of Q using V.
With Schema: Given Q and V, and a schema prescribing possible input DBs, find a maximally contained rewriting of Q using V.
Focus: Tree Pattern Queries (XP/,//, [ ]). Schema without cycles, union, and
recursion.
![Page 10: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/10.jpg)
Problems Studied 3/3
Given Q & V: R Ξ V ◦ E Q.
Compensation queryRewriting query
Want MCR in the absence and in the presence of a schema.
//a[//b]/c //a
b c
![Page 11: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/11.jpg)
Without Schema 1/6
Question 1: Does an MCR always exist?
/a/a
bb
cc
VV
/b/b
d
Q1Q1
/a/a
b
dd
Q2Q2
No MCR for Q1 and for Q2.
What went wrong?
distinguished (answer) node
![Page 12: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/12.jpg)
Without Schema 2/6
Trial
//Trials//Trials //Trials//Trials
StatusStatus Patient
(1)(1)
(2)(2) (3)(3)Trial
//Trials//Trials
StatusStatus Patient
(1)(1)
(2)(2) (3)(3)
Unfulfilled obligationsClip Away Tree (CAT)
f
f – useful embedding
VVQQ
V
E
![Page 13: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/13.jpg)
Without Schema 3/6Theorem: Q, V – tree pattern queries.
Then Q is answerable using V iff there is a useful embedding from Q to V.
aa
dd
baa
//a//a
c c
bba a aa
b
cc
2
1
77
33
44
55V ee
bb
ccQ
1,21,2
1:{2}, 2:{}1:{2}, 2:{}
2:{6}2:{6}
6:{7}6:{7}
1:{2,3}, 2:{3}1:{2,3}, 2:{3}
2:{6}, 3:{4} 2:{6}, 3:{4}
4:{5}, 6:{7} 4:{5}, 6:{7}
Testing Existence of MCR:
//a//a
6
![Page 14: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/14.jpg)
Without Schema 4/6
Two embeddings – corresponding irredundant CRs.
aa
baa
cb
cc
//a//a
d e
aa
baa
cb
cc
//a//a
a eb
c d
need for expressing
MCR!
![Page 15: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/15.jpg)
Without Schema 5/6Can test existence of MCR in poly time. However, MCRs can be exponentially
large (closure issue).
eedd
aa aa
c
bb
//a//a
aa
//a//a
c
bb
c
bbVVQQ
How many irredundant CRs
are possible?
![Page 16: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/16.jpg)
Without Schema 5/6
eedd
aa aa
c
bb
//a//a
aa
//a//a
c
bb
c
bbVV
QQc
bb
aa
//a//a
d e
![Page 17: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/17.jpg)
Without Schema 5/6
eedd
aa aa
c
bb
//a//a
aa
//a//a
c
bb
c
bbVV
QQc
bb
aa
//a//a
d
a/b/c/e
![Page 18: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/18.jpg)
Without Schema 5/6
eedd
aa aa
c
bb
//a//a
aa
//a//a
c
bb
c
bbVV
QQc
bb
aa
//a//a
ea/b
e
c
![Page 19: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/19.jpg)
Without Schema 5/6
eedd
aa aa
c
bb
//a//a
aa
//a//a
c
bb
c
bbVV
QQc
bb
aa
//a//a
a/b/c/ea/b
e
cMCR = union of exponential
# CRs in the worst case!
![Page 20: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/20.jpg)
Without Schema 6/6
Summary: Can test existence of MCR in poly
time. Exact characterization.
MCR may be union of exponentially many CRs in the worst case.
Algorithm for generating MCR.
![Page 21: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/21.jpg)
With Schema 1/6
Given Query Q, view V, schema S. Infer all constraints C implied by S. Chase V w.r.t. C. Look for MCR of Q w.r.t. chased view.
![Page 22: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/22.jpg)
With Schema 2/6AuctionsAuctions
AuctionAuction
open_auctionopen_auctionclosed_auctionclosed_auction
bidsbids
personperson itemitem
namename
**
** ??
++ ??
++ ++
E.g. constraints:•c_a has ≤ 1 bids
child•Every Auction
having a person desc also
has an item desc.
•every path from Auction to name
goes via bids.
![Page 23: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/23.jpg)
With Schema 3/6
//Auction//Auction
o_ao_a c_ac_a
bidsbids bidsbids
VV
//Auction//Auction
bidsbids bidsbids
personperson itemitem
namenameQQ
![Page 24: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/24.jpg)
With Schema 3/6AuctionsAuctions
AuctionAuction
open_auctionopen_auction closed_auctionclosed_auction
bidsbids
personperson itemitem
namename
**
** ??
++ ??
++ ++
o_ao_a c_ac_a
bidsbids bidsbids
//Auction//Auction
person item
name
p i
n
![Page 25: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/25.jpg)
With Schema 4/6
o_ao_a c_ac_a
bidsbids bidsbids
//Auction//Auction
person item
name
p i
n
//Auction//Auction
bidsbids bidsbids
personperson itemitem
namenameQQ
MCR = identity query.
![Page 26: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/26.jpg)
With Schema 5/6Another Example: AuctionsAuctions
AuctionAuction
closed_auctionclosed_auction
bidsbids
personperson
itemitem
namename
**
** ??
++
++
open_auctionopen_auction
buyer?
//Auction
nameitem
person
Q
//Auction
VHow to answer Q using V?
![Page 27: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/27.jpg)
With Schema 5/6 Another Example: AuctionsAuctions
AuctionAuction
closed_auctionclosed_auction
bidsbids
personperson
itemitem
name
**
** ??
++
++
open_auctionopen_auction
buyer?
//Auction
nameitem
person
Q
//Auction
namnamee
item
So what’s the compensation query?
![Page 28: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/28.jpg)
With Schema 5/6 Another Example: AuctionsAuctions
AuctionAuction
closed_auctionclosed_auction
bidsbids
personperson
itemitem
name
**
** ??
++
++
open_auctionopen_auction
buyer?
//Auction
nameitem
person
Q
//Auction
namename
item
MCR = V ◦ “●//name”
![Page 29: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/29.jpg)
With Schema 6/6
Challenges and Highlights: Naïve chase can explode.
Make chase context aware.
Exact characterization of schema w/o recursion and union in terms of constraints.
Efficient algo. for inferring the constraints. Efficient algo. for chase. And for finding MCR. MCR is unique, if it exists.
![Page 30: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/30.jpg)
Recursive Schemas 1/2a
b
c d**
?
//a
b
V
//a
b b
c dQ
What is the MCR?
![Page 31: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/31.jpg)
Recursive Schemas 2/2a
b
c d**
?
//a
b
V
//a
b b
c dQ//a
b
c d
![Page 32: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/32.jpg)
Recursive Schemas 2/2a
b
c d**
?
//a
b
V
//a
b b
c dQ//a
b
c
d
b
![Page 33: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/33.jpg)
Recursive Schemas 2/2a
b
c d**
?
//a
b
V
//a
b b
c dQ//a
b
c
db
![Page 34: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/34.jpg)
Recursive Schemas 2/2a
b
c d**
?
//a
b
V
//a
b b
c dQ//a
b
c
b
d
b
MCR = union of four CRs.
Behavior similar to no schema.
![Page 35: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/35.jpg)
Related Work 1/2
QAV for relational – huge body of work [Halevy 01].
Regular path queries and semi-structured DBs [Grahne&Thomo 03, Calvenese 00,Papakonstantinou&Vassalos 99].
Equivalent rewrites for fragments of XQuery and XPath [Deutsch&Tannen 03, Tang&Zhou 05, Xu&Ozsoyoglu 05].
![Page 36: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/36.jpg)
Related Work 2/2 Key differences b/w equivalent &
contained rewriting: Unique rewriting (even w/o schema). MCR may involve union of (possibly
exponentially many) CRs. Study of contained rewriting in
presence of schema. Lot of work on semantic caching
[Chen+ 02], heuristics for using materialized views for optimizing XPath [Balmin+ 04], mine views worth materializing, XPath containment, … .
![Page 37: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/37.jpg)
Summary & Future Work 1/2
QAV using (maximally) contained rewriting ( information integration).
Without schema: existence, characterization, closure, generation of MCR.
With Schema: extract essence using constraints, chase, similar problems as above.
Impact of recursion. Experiments.
![Page 38: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/38.jpg)
Summary & Future Work 2/2
Impact of wildcard, disjunction, order …
Impact of union, recursion, … Other integration models (e.g., GLAV) QAV for XQuery.
![Page 39: Answering Tree Pattern Queries Using Views](https://reader036.vdocuments.us/reader036/viewer/2022062804/56814bfc550346895db8f724/html5/thumbnails/39.jpg)