![Page 1: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/1.jpg)
Yuqing Wu, Dirk Van Gucht (Indiana University)
Marc Gyssens (Hasselt University)
Jan Paredaens (University of Antwerp)
A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization
![Page 2: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/2.jpg)
Research in XML
- XML data model
- XML query languages
  - XPath
  - XQuery
  - ……
- XML data repositories
  - Support from DB vendors
  - LORE, Niagara, TIMBER, ……
![Page 3: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/3.jpg)
Research in XML
- Characteristics of XML query languages
  - XPath and fragments
  - Characteristics: expressiveness, distinguishability, complexity, …
- System design
  - Query processing and evaluation
  - New access methods: structural join
  - Integrity, security, ……
![Page 4: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/4.jpg)
Theoretical Study and System Design
- RDB did very well in this aspect
- Our work at Indiana University
  - Coupling the theoretical study of XML data and query languages with the system design of XML search engines [ICDT-EROW 07]
  - Coupling the partition of XML documents induced by the structure of the XML documents with the partition induced by fragments of XPath algebras [DBPL 07, IS 08]
  - Applying the coupling in the design of structural indices for XML [WebDB 08]
  - Designing workload-sensitive structural indices for XML [in submission]
![Page 5: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/5.jpg)
Outline
- What we studied
- Equivalences of query languages
- Normal form
- Resolution expressiveness
- Efficient query evaluation
- Summary and discussion
![Page 6: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/6.jpg)
Outline
- What we studied
  - XML documents
  - Path+ algebra
  - Tree queries
- Equivalences of query languages
- Normal form
- Resolution expressiveness
- Efficient query evaluation
- Summary and discussion
![Page 7: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/7.jpg)
XML Documents
A labeled tree (V, Ed, λ), where
- V is the set of nodes
- Ed is the set of edges
- λ : V → L is a node-labeling function.
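The definition above can be mirrored by a small data structure. The following is a minimal sketch, assuming node identifiers are strings and edges are stored as (parent, child) pairs; the class name `Document`, its fields, and its methods are illustrative choices, not something the slides prescribe.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A labeled tree (V, Ed, lambda). Names here are illustrative."""
    labels: dict                              # lambda: node id -> label; keys are V
    edges: set = field(default_factory=set)   # Ed: set of (parent, child) pairs

    def parent(self, node):
        """Return the parent of node, or None for the root."""
        for p, c in self.edges:
            if c == node:
                return p
        return None

    def children(self, node):
        """Return the children of node in sorted order."""
        return sorted(c for p, c in self.edges if p == node)

    def is_tree(self):
        """Check that the edges form a single rooted tree over the nodes."""
        parents = {}
        for p, c in self.edges:
            if p not in self.labels or c not in self.labels or c in parents:
                return False          # unknown node, or a second parent
            parents[c] = p
        roots = [n for n in self.labels if n not in parents]
        if len(roots) != 1:
            return False
        for n in self.labels:         # walking upward must never loop
            seen = set()
            while n in parents:
                if n in seen:
                    return False
                seen.add(n)
                n = parents[n]
        return True


# Hypothetical three-node document: n1 (label a) with children n2 (b) and n3 (c).
doc = Document(labels={"n1": "a", "n2": "b", "n3": "c"},
               edges={("n1", "n2"), ("n1", "n3")})
```

Storing Ed as a set of pairs keeps the structure close to the formal definition; a real system would more likely keep parent and child pointers for faster navigation.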
![Page 8: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/8.jpg)
Querying XML Document

    for $i in doc(…)//a/b
      for $j in $i/c/*/d[e]
        for $k in $j/*/f
    return ($i, $k)
    intersect
    for $i in doc(…)//a/b
      for $j in $i/c/a/d
        for $k in $j/c/f
    return ($i, $k)
![Page 9: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/9.jpg)
Path+ Algebra – Path Semantics
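$$
\begin{aligned}
\emptyset(D) &= \emptyset \\
\varepsilon(D) &= \{(m, m) \mid m \in V\} \\
\hat{l}(D) &= \{(m, m) \mid m \in V \wedge \lambda(m) = l\} \\
\downarrow(D) &= Ed \\
\uparrow(D) &= Ed^{-1} \\
\Pi_1(E)(D) &= \{(m, m) \mid \exists n : (m, n) \in E(D)\} \\
\Pi_2(E)(D) &= \{(n, n) \mid \exists m : (m, n) \in E(D)\} \\
E_1; E_2(D) &= \pi_{1,4}(\sigma_{2=3}(E_1(D) \times E_2(D))) \\
E_1 \cap E_2(D) &= E_1(D) \cap E_2(D)
\end{aligned}
$$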
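The path semantics translate almost line by line into a set-based evaluator. The sketch below assumes expressions are encoded as nested tuples such as `("seq", e1, e2)`; this encoding, the function names `evaluate` and `seq`, and the sample document are all illustrative assumptions rather than anything taken from the slides.

```python
def evaluate(expr, labels, edges):
    """Evaluate a Path+ expression over a document given as
    labels (node -> label) and edges (set of (parent, child) pairs)."""
    op = expr[0]
    if op == "empty":                     # the empty-set expression
        return set()
    if op == "eps":                       # identity on all nodes
        return {(m, m) for m in labels}
    if op == "label":                     # label test for the label expr[1]
        return {(m, m) for m in labels if labels[m] == expr[1]}
    if op == "down":                      # parent-to-child step
        return set(edges)
    if op == "up":                        # child-to-parent step
        return {(c, p) for (p, c) in edges}
    if op == "pi1":                       # first projection
        return {(m, m) for (m, _) in evaluate(expr[1], labels, edges)}
    if op == "pi2":                       # second projection
        return {(n, n) for (_, n) in evaluate(expr[1], labels, edges)}
    left = evaluate(expr[1], labels, edges)
    right = evaluate(expr[2], labels, edges)
    if op == "seq":                       # composition: join on the middle node
        return {(m, n) for (m, k) in left for (k2, n) in right if k == k2}
    if op == "cap":                       # intersection
        return left & right
    raise ValueError(f"unknown operator: {op}")


def seq(*exprs):
    """Left-nested composition E1; E2; ...; En (a convenience helper)."""
    result = exprs[0]
    for e in exprs[1:]:
        result = ("seq", result, e)
    return result


# Hypothetical document: a1 has children b1 and c1; b1 has child c2.
labels = {"a1": "a", "b1": "b", "c1": "c", "c2": "c"}
edges = {("a1", "b1"), ("a1", "c1"), ("b1", "c2")}

# Label test a, then a child step, keeping only children that have a c-child.
expr = seq(("label", "a"), ("down",), ("pi1", seq(("down",), ("label", "c"))))
result = evaluate(expr, labels, edges)
```

Here `result` contains the single pair `("a1", "b1")`: of the two children of `a1`, only `b1` has a child labeled `c`. The nested-loop join used for composition is quadratic and is meant only to make the definition concrete.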
![Page 10: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/10.jpg)
Path+ Expression – An Example
[Formula: the example Path+ expression E, built from the labels a, c and d; not recoverable from the extraction]
E(D) = {(n8, n11), (n8, n12)}
![Page 11: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/11.jpg)
Interesting Sub-languages
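Path+ : $E ::= \varepsilon \mid \emptyset \mid \hat{l} \mid \downarrow \mid \uparrow \mid E_1; E_2 \mid \Pi_1(E) \mid \Pi_2(E) \mid E_1 \cap E_2$
Path+(Π1, Π2) : $E ::= \varepsilon \mid \emptyset \mid \hat{l} \mid \downarrow \mid \uparrow \mid E_1; E_2 \mid \Pi_1(E) \mid \Pi_2(E)$
DPath+(Π1) : $E ::= \varepsilon \mid \emptyset \mid \hat{l} \mid \downarrow \mid E_1; E_2 \mid \Pi_1(E)$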
![Page 12: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/12.jpg)
Tree Query for XML
A tree query is a 3-tuple (T, s, d), with
- T: a labeled tree whose nodes are labeled either with a symbol of L or with a wildcard *;
- s and d: nodes of T, called the source and destination nodes.
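The slide does not spell out how a tree query is evaluated. A minimal sketch, assuming the usual conjunctive-query reading (every mapping of query nodes to document nodes that preserves parent-child edges and non-wildcard labels contributes the pair of images of s and d), could look as follows; the function name and the sample data are hypothetical.

```python
def eval_tree_query(q_labels, q_edges, s, d, d_labels, d_edges):
    """Return {(h(s), h(d))} over all edge- and label-preserving mappings h
    from the query tree into the document (assumed semantics)."""
    q_children, d_children = {}, {}
    for p, c in q_edges:
        q_children.setdefault(p, []).append(c)
    for p, c in d_edges:
        d_children.setdefault(p, []).append(c)

    # Visit query nodes top-down so a parent is always mapped before its child.
    has_parent = {c for _, c in q_edges}
    q_root = next(n for n in q_labels if n not in has_parent)
    order, stack = [], [(q_root, None)]
    while stack:
        q, qp = stack.pop()
        order.append((q, qp))
        for c in q_children.get(q, []):
            stack.append((c, q))

    results = set()

    def extend(i, h):
        if i == len(order):
            results.add((h[s], h[d]))
            return
        q, qp = order[i]
        # The query root may map anywhere; other nodes map to children of
        # the image of their query parent.
        candidates = list(d_labels) if qp is None else d_children.get(h[qp], [])
        for v in candidates:
            if q_labels[q] in ("*", d_labels[v]):
                h[q] = v
                extend(i + 1, h)
                del h[q]

    extend(0, {})
    return results


# Hypothetical document: a1 has children b1 and c1; b1 has child c2.
d_labels = {"a1": "a", "b1": "b", "c1": "c", "c2": "c"}
d_edges = {("a1", "b1"), ("a1", "c1"), ("b1", "c2")}

# Query shaped like a/*/c: source q0 (a), wildcard q1, destination q2 (c).
chain = eval_tree_query({"q0": "a", "q1": "*", "q2": "c"},
                        {("q0", "q1"), ("q1", "q2")}, "q0", "q2",
                        d_labels, d_edges)

# Branching query: wildcard root with a b-child (source) and a c-child (destination).
siblings = eval_tree_query({"q0": "*", "qs": "b", "qd": "c"},
                           {("q0", "qs"), ("q0", "qd")}, "qs", "qd",
                           d_labels, d_edges)
```

The second query shows why source and destination are ordinary nodes of T: neither has to be the root, so a tree query can relate siblings as well as ancestors and descendants.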
![Page 13: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/13.jpg)
![Page 14: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/14.jpg)
Equivalences of Query Languages
Theorem: The query languages Path+, T, and Path+(Π1, Π2) are all equivalent in expressive power, and there exist translation algorithms between any two of them.
![Page 15: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/15.jpg)
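Path+ Expression → Tree Query T

$$
\begin{aligned}
\varepsilon(D) &= \{(m, m) \mid m \in V\} \\
\hat{l}(D) &= \{(m, m) \mid m \in V \wedge \lambda(m) = l\} \\
\downarrow(D) &= Ed \\
\uparrow(D) &= Ed^{-1} \\
\Pi_1(E)(D) &= \{(m, m) \mid \exists n : (m, n) \in E(D)\} \\
\Pi_2(E)(D) &= \{(n, n) \mid \exists m : (m, n) \in E(D)\} \\
E_1; E_2(D) &= \pi_{1,4}(\sigma_{2=3}(E_1(D) \times E_2(D))) \\
E_1 \cap E_2(D) &= E_1(D) \cap E_2(D)
\end{aligned}
$$

[Figure: tree-query diagrams for the base expressions; nodes are labeled * or l, with source s and destination d marked]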
![Page 16: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/16.jpg)
Transformation of Composition
$E_1; E_2(D) = \pi_{1,4}(\sigma_{2=3}(E_1(D) \times E_2(D)))$
![Page 17: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/17.jpg)
Transformation of Intersection
[Figure: tree queries for E1 and E2]
$E_1 \cap E_2(D) = E_1(D) \cap E_2(D)$
![Page 18: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/18.jpg)
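Tree Query T → Path+ Expression
Base cases:
- Empty tree → $\emptyset$
- (<{n}, ∅>, n, n), with s = d = n → $\varepsilon$ if n is labeled *, $\hat{l}$ if n is labeled l
- (<{n1, n2}, {(n1, n2)}>, n1, n2), with s the parent of d → $\downarrow$
- (<{n1, n2}, {(n1, n2)}>, n2, n1), with d the parent of s → $\uparrow$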
![Page 19: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/19.jpg)
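Tree Query T → Path+ Expression
Recursive case #1: s is not an ancestor of d.
- General form: $toP(T_1, s, s); \uparrow; toP(T_2, p, d)$
- If s has no child and λ(s) = *: $\uparrow; toP(T_2, p, d)$
- If d is the parent of s, d has no ancestor and no other child, and λ(d) = *: $toP(T_1, s, s); \uparrow$
[Figure: tree T split into subtrees T1 and T2, with nodes s, p and d marked]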
![Page 20: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/20.jpg)
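Tree Query T → Path+ Expression
Recursive case #2: s is not the root.
- General form: $\Pi_2(toP(T_2, r, s)); toP(T_1, s, d)$
- If s has no child and λ(s) = *: $\Pi_2(toP(T_2, r, s))$
[Figure: tree T split into subtrees T1 and T2, with nodes r, s and d marked]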
![Page 21: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/21.jpg)
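Tree Query T → Path+ Expression
Recursive case #3: s is a strict ancestor of d.
- General form: $toP(T_2, s, p); \downarrow; toP(T_1, d, d)$
- If d has no child and λ(d) = *: $toP(T_2, s, p); \downarrow$
- If s is the parent of d, s has no child other than d, and λ(s) = *: $\downarrow; toP(T_1, d, d)$
[Figure: tree T split into subtrees T1 and T2, with nodes s, p and d marked]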
![Page 22: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/22.jpg)
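Tree Query T → Path+ Expression
Recursive case #4: s = d is the root.
- General form: $\Pi_1(toP(T_1, s, c_1)); \ldots; \Pi_1(toP(T_n, s, c_n)); \widehat{\lambda(s)}$
- If λ(s) = *: $\Pi_1(toP(T_1, s, c_1)); \ldots; \Pi_1(toP(T_n, s, c_n))$
[Figure: root s = d with children c1, …, cn and subtrees T1, …, Tn]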
![Page 23: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/23.jpg)
Equivalences of Query Languages
Theorem: The query languages Path+, T, and Path+(Π1, Π2) are all equivalent in expressive power, and there exist translation algorithms between any two of them.
Path+ expression → T query → Path+(Π1, Π2) expression
![Page 24: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/24.jpg)
![Page 25: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/25.jpg)
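Normal Form
Observation about the tree query → Path+(Π1, Π2) transformation: the resulting Path+(Π1, Π2) expression is of the form

$$C_{u_m}; \uparrow; \cdots; \uparrow; C_{u_1}; \uparrow; C_{top}; \downarrow; C_{d_1}; \downarrow; \cdots; \downarrow; C_{d_n}$$

where
- m ≥ 0 and n ≥ 0;
- each $C_i$ (i = $u_m, \ldots, u_1, d_1, \ldots, d_n$) is of the form $[\Pi_1(E)]^{*}\,[\hat{l}]^{?}$;
- $C_{top}$ is of the form $[\Pi_2(E)]\,[\Pi_1(E)]^{*}\,[\hat{l}]^{?}$;
- E is a DPath+(Π1) expression: $E ::= \varepsilon \mid \emptyset \mid \hat{l} \mid \downarrow \mid E_1; E_2 \mid \Pi_1(E)$.
[Figure: tree query with nodes r, t, s and d marked]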
![Page 26: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/26.jpg)
Normal Form
$E(Tts)^{-1};\ E(Ttt);\ \Pi_2(E(Trt));\ E(Ttd)$
E(Tts), E(Ttt), E(Trt), E(Ttd) are DPath+(Π1) expressions.
[Figure: tree query decomposed into the subtrees Tts, Ttt, Trt and Ttd around the nodes r, t, s and d]
![Page 27: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/27.jpg)
![Page 28: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/28.jpg)
Resolution Expressiveness
Resolution expressiveness: a language's ability to distinguish a pair of nodes or a pair of paths in the document.
![Page 29: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/29.jpg)
Expression Equivalence
- Nodes m1 and m2 are expression-related (m1 ≥exp m2) if, for each expression E, E(D)(m1) ≠ ∅ implies E(D)(m2) ≠ ∅, where E(D)(m) = {n | (m, n) ∈ E(D)}.
- m1 =exp m2 if m1 ≥exp m2 and m2 ≥exp m1.
![Page 30: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/30.jpg)
1-equivalence
- Nodes m1 and m2 are downward 1-related ($m_1 \geq^{\downarrow}_1 m_2$) iff
  - λ(m1) = λ(m2);
  - for each child n1 of m1, there exists a child n2 of m2 such that $n_1 \geq^{\downarrow}_1 n_2$.
- Nodes m1 and m2 are 1-related ($m_1 \geq_1 m_2$) iff
  - $m_1 \geq^{\downarrow}_1 m_2$;
  - if m1 is not the root and p1 is the parent of m1, then m2 is not the root and its parent p2 satisfies $p_1 \geq_1 p_2$.
- $m_1 =_1 m_2$ if $m_1 \geq_1 m_2$ and $m_2 \geq_1 m_1$.
- $(m_1, n_1) \geq_1 (m_2, n_2)$ if $m_1 \geq_1 m_2$ and $n_1 \geq_1 n_2$ and sig(m1, n1) = sig(m2, n2).
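The two node-level relations are recursive and can be computed directly from their definitions. The sketch below assumes the document is given as a label map and a set of (parent, child) edges; `make_relations` and the sample tree are hypothetical names and data, and the pair-level relation with sig is left out because the slides do not define sig.

```python
def make_relations(labels, edges):
    """Return (down_related, related) for the document (labels, edges)."""
    parent = {c: p for p, c in edges}
    children = {}
    for p, c in edges:
        children.setdefault(p, []).append(c)

    def down_related(m1, m2):
        """Downward 1-related: equal labels, and every child of m1 is
        downward 1-related to some child of m2."""
        if labels[m1] != labels[m2]:
            return False
        return all(
            any(down_related(n1, n2) for n2 in children.get(m2, []))
            for n1 in children.get(m1, [])
        )

    def related(m1, m2):
        """1-related: downward 1-related, and the same holds recursively
        for the parents whenever m1 is not the root."""
        if not down_related(m1, m2):
            return False
        if m1 not in parent:          # m1 is the root: nothing more to check
            return True
        if m2 not in parent:          # m1 has a parent but m2 does not
            return False
        return related(parent[m1], parent[m2])

    return down_related, related


# Hypothetical document: r has children x and y (both labeled b);
# x has one c-child, y has a c-child and a d-child.
labels = {"r": "a", "x": "b", "y": "b", "c1": "c", "c2": "c", "d1": "d"}
edges = {("r", "x"), ("r", "y"), ("x", "c1"), ("y", "c2"), ("y", "d1")}
down_related, related = make_relations(labels, edges)
```

In this example `related("x", "y")` holds but `related("y", "x")` does not, because the d-child of `y` has no counterpart below `x`; the relation is a preorder, and 1-equivalence is its symmetric part.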
![Page 31: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/31.jpg)
Resolution Expressiveness
Theorem: $m_1 =_{exp} m_2$ iff $m_1 =_1 m_2$.
Theorem: For each expression E, $(m_1, n_1) \in E(D)$ implies $(m_2, n_2) \in E(D)$ iff $(m_1, n_1) \geq_1 (m_2, n_2)$.
![Page 32: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/32.jpg)
![Page 33: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/33.jpg)
Tree Query Minimization
1st Reduction: merging 1-equivalent nodes in a tree query.
[Figure: a tree query before and after merging 1-equivalent nodes]
![Page 34: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/34.jpg)
Tree Query Minimization
1-*-related ($\geq^{*}_1$): relax 1-related by replacing the label condition with λ(m1) + λ(m2) = λ(m2).
2nd Reduction: deleting from a tree query, in a top-down fashion, every node m1 for which there exists another node m2 such that $m_1 \geq^{*}_1 m_2$.
[Figure: the tree query before and after the 2nd reduction]
![Page 35: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/35.jpg)
Efficient Query Evaluation
Path+ Expression → Tree Query → (1st & 2nd Reduction) → Minimum Tree Query → Normal Form
Normal form: $E(Tts)^{-1};\ E(Ttt);\ \Pi_2(E(Trt));\ E(Ttd)$
E(Tts), E(Ttt), E(Trt), E(Ttd) are DPath+(Π1) expressions.
[Figure: tree query decomposed into the subtrees Tts, Ttt, Trt and Ttd around the nodes r, t, s and d]
![Page 36: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/36.jpg)
Efficient Query Evaluation
Exp = $E(Tts)^{-1};\ E(Ttt);\ \Pi_2(E(Trt));\ E(Ttd)$
- E(Tts), E(Ttt), E(Trt), E(Ttd) are DPath+(Π1) expressions.
- DPath+(Π1) queries can be evaluated via an index-only plan using the P[k]-Trie index. [Bre08]
[Bre08]: Sofia Brenes, Yuqing Wu, Dirk Van Gucht, Pablo Santa Cruz. Trie Indices for Efficient XML Query Evaluation. WebDB 2008.
![Page 37: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/37.jpg)
![Page 38: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/38.jpg)
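Summary
Objects of study:
- XML document: a tree
- Path+ language: $E ::= \varepsilon \mid \emptyset \mid \hat{l} \mid \downarrow \mid \uparrow \mid E_1; E_2 \mid \Pi_1(E) \mid \Pi_2(E) \mid E_1 \cap E_2$
- Tree queries
Areas of study:
- Expressiveness
- Equivalence
- Normal form
- Query evaluation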
![Page 39: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/39.jpg)
Extending the Path+ language
Adding operators: [two starred operators; the operator symbols are not recoverable from the extraction]
Will the results hold?
- Expressiveness
- Equivalence
- Normal form
- Query evaluation
![Page 40: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/40.jpg)
Thank you.
Questions?
![Page 41: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/41.jpg)
SIGMOD/PODS 2010
Indianapolis, Indiana, USA
Conference date: Jun, 2010
Deadlines:
- SIGMOD: early Nov, 2009
- PODS: early Dec, 2009
![Page 42: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/42.jpg)
P[k]-Trie Index
- Keep track of the P[k]-partitions
- Use the reverse label path as key
[Figure: example document tree with nodes A1, A2, B1–B5, C1–C4, D1]

P[2]:

| Label path | Node pairs |
|---|---|
| (A) | {(A1, A1), (A2, A2)} |
| (B) | {(B1, B1), (B2, B2), (B3, B3), (B4, B4), (B5, B5)} |
| (C) | {(C1, C1), (C2, C2), (C3, C3), (C4, C4)} |
| (D) | {(D1, D1)} |
| (A,A) | {(A1, A2)} |
| (A,B) | {(A1, B1), (A2, B2), (A2, B3), (A1, B4)} |
| (B,B) | {(B4, B5)} |
| (B,C) | {(B1, C1), (B2, C2), (B3, C3), (B5, C4)} |
| (B,D) | {(B2, D1)} |
| (A,A,B) | {(A1, B2), (A1, B3)} |
| (A,B,B) | {(A1, B5)} |
| (A,B,C) | {(A1, C1), (A2, C2), (A2, C3)} |
| (A,B,D) | {(A2, D1)} |
| (B,B,C) | {(B4, C4)} |
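The P[2] listing can be recomputed from the document tree. The sketch below is an illustration only: it uses a plain dictionary keyed by the label path as written in the listing, whereas the slide stores the reverse label path in a trie, and the function name `build_pk_index` is hypothetical. The edge set is read off the length-1 entries of the listing.

```python
def build_pk_index(labels, edges, k):
    """Group pairs (ancestor, node) at distance <= k by the label path
    leading from the ancestor down to the node."""
    parent = {c: p for p, c in edges}
    index = {}
    for n in labels:
        path = [n]                       # n, parent(n), grandparent(n), ...
        while True:
            m = path[-1]                 # ancestor at distance len(path) - 1
            key = tuple(labels[x] for x in reversed(path))
            index.setdefault(key, set()).add((m, n))
            if len(path) - 1 == k or m not in parent:
                break
            path.append(parent[m])
    return index


# The example document from the slide; each label is the first letter of the node id.
NODES = ["A1", "A2", "B1", "B2", "B3", "B4", "B5", "C1", "C2", "C3", "C4", "D1"]
LABELS = {n: n[0] for n in NODES}
EDGES = {("A1", "A2"), ("A1", "B1"), ("A1", "B4"), ("A2", "B2"), ("A2", "B3"),
         ("B4", "B5"), ("B1", "C1"), ("B2", "C2"), ("B3", "C3"), ("B5", "C4"),
         ("B2", "D1")}

index = build_pk_index(LABELS, EDGES, 2)

# Index-only answers: a single lookup by label path, no document traversal.
query1 = {n for (_, n) in index[("A", "B", "C")]}   # //A/B/C
query2 = {n for (_, n) in index[("B", "C")]}        # //B/C
```

With this data, `query1` yields C1, C2 and C3, while `query2` also includes C4, whose parent B5 sits below B4 rather than below an A node.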
![Page 43: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/43.jpg)
Query Evaluation with P[k]-Trie Index
Query 1: //A/B/C
[Figure: the example document tree with nodes A1, A2, B1–B5, C1–C4, D1]
![Page 44: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/44.jpg)
Query Evaluation with P[k]-Trie Index
Query 2: //B/C
[Figure: the example document tree with nodes A1, A2, B1–B5, C1–C4, D1]
![Page 45: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/45.jpg)
Query Evaluation with P[k]-Trie Index
Query 3: //A/B[./D]/C
[Figure: the example document tree with nodes A1, A2, B1–B5, C1–C4, D1]
![Page 46: A Study of a Positive Fragment of Path Queries: Expressiveness, Normal Form and Minimization](https://reader031.vdocuments.us/reader031/viewer/2022020501/5681650d550346895dd7850f/html5/thumbnails/46.jpg)