Holistic Twig Joins: Optimal XML Pattern Matching
Nicholas Bruno, Nick Koudas, Divesh Srivastava
ACM SIGMOD 02
Presented by: Li Wei, Dragomir Yankov
Outline
• Problem Statement
• PathStack Algorithm
• TwigStack Algorithm
• Experimental Results
Problem Statement
• Given a query twig pattern Q and an XML database D, compute ALL the answers to Q in D.
• Example:
Query: author with children fn and ln; fn has child jane, ln has child doe

XML document (each node labeled (DocId, LeftPos:RightPos, LevelNum)):
book (1, 1:150, 1)
  title (1, 2:4, 2)
    XML (1, 3, 3)
  allauthors (1, 5:60, 2)
    author (1, 6:20, 3)
      fn (1, 7:9, 4)
        jane (1, 8, 5)
      ln
        poe (1, 11, 5)
    author
      fn
        john
      ln
        doe (1, 26, 5)
    author
      fn
        jane (1, 43, 5)
      ln
        doe (1, 46, 5)
  year (1, 61:63, 2)
    2000 (1, 62, 3)
  chapter (1, 64:93, 2)
    title (1, 65:67, 3)
      XML (1, 66, 4)
    section (1, 68:78, 3)
      head (1, 69:71, 4)
        Origins (1, 70, 5)
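The node labels in the document figure can be used to test structural relationships without walking the tree. The sketch below is a minimal illustration, assuming the labels read (DocId, LeftPos:RightPos, LevelNum) and that a text node with a single position can be modelled with LeftPos equal to RightPos; the names Pos, is_ancestor and is_parent are hypothetical helpers, not taken from the slides.

```python
from typing import NamedTuple


class Pos(NamedTuple):
    doc: int    # DocId
    left: int   # LeftPos
    right: int  # RightPos
    level: int  # LevelNum


def is_ancestor(a: Pos, d: Pos) -> bool:
    """True if a is a proper ancestor of d: same document, a's interval contains d's."""
    return a.doc == d.doc and a.left < d.left and d.right < a.right


def is_parent(a: Pos, d: Pos) -> bool:
    """True if a is an ancestor of d and d sits exactly one level below a."""
    return is_ancestor(a, d) and d.level == a.level + 1


# Positions copied from the example document.
book = Pos(1, 1, 150, 1)
title = Pos(1, 2, 4, 2)
author = Pos(1, 6, 20, 3)
fn = Pos(1, 7, 9, 4)
jane = Pos(1, 8, 8, 5)    # text node: single position, modelled as left == right (assumption)
year = Pos(1, 61, 63, 2)

print(is_ancestor(author, jane), is_parent(author, jane))
```

Both checks are constant-time comparisons, which is what lets the join algorithms in the rest of the talk work on sorted position lists only.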
Binary Structural Joins
• The approach
– Decompose the twig pattern into binary structural relationships
– Use structural join algorithms to match the binary relationships against the XML database
– Stitch together the basic matches
• The problem
– The intermediate result sizes can get large, even when the input and output sizes are manageable.
Example
Query and XML document as in the Problem Statement example.
Decomposition     Number of Intermediate Results
author – fn       3
author – ln       3
fn – jane         2
ln – doe          2
Output: 1
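The counts in this example can be reproduced with a small sketch. It models only the allauthors subtree as nested dictionaries instead of using position lists, and it assumes the query edges are parent-child (on this document, ancestor-descendant edges give the same counts). The helpers node, walk, child_pairs and twig_matches are hypothetical names introduced for illustration.

```python
def node(tag, *children):
    """Build a tiny tree node; a stand-in for the XML document."""
    return {"tag": tag, "children": list(children)}


doc = node(
    "allauthors",
    node("author", node("fn", node("jane")), node("ln", node("poe"))),
    node("author", node("fn", node("john")), node("ln", node("doe"))),
    node("author", node("fn", node("jane")), node("ln", node("doe"))),
)


def walk(n):
    """Yield n and all of its descendants in document order."""
    yield n
    for c in n["children"]:
        yield from walk(c)


def child_pairs(root, parent_tag, child_tag):
    """All (parent, child) pairs matching one binary relationship."""
    return [(p, c)
            for p in walk(root) if p["tag"] == parent_tag
            for c in p["children"] if c["tag"] == child_tag]


def twig_matches(root):
    """Authors that have both an fn child containing jane and an ln child containing doe."""
    out = []
    for a in walk(root):
        if a["tag"] != "author":
            continue
        fns = [c for c in a["children"]
               if c["tag"] == "fn" and any(g["tag"] == "jane" for g in c["children"])]
        lns = [c for c in a["children"]
               if c["tag"] == "ln" and any(g["tag"] == "doe" for g in c["children"])]
        out.extend((a, f, l) for f in fns for l in lns)
    return out


edges = [("author", "fn"), ("author", "ln"), ("fn", "jane"), ("ln", "doe")]
counts = {e: len(child_pairs(doc, *e)) for e in edges}
print(counts, len(twig_matches(doc)))
```

The four binary joins produce ten intermediate pairs in total, while only one author satisfies the whole twig.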
Holistic Twig Joins
• The approach
– Uses linked stacks to compactly represent partial results to query paths
– Merges results to query paths to obtain matches for the twig pattern
• The advantage
– It ensures that no intermediate solution is larger than the final answer to the query.
Example
Query and XML document as before, with the author subtrees numbered in document order:
author1 (1, 6:20, 3): fn1 (1, 7:9, 4) with jane1 (1, 8, 5); ln1 with poe (1, 11, 5)
author2: fn2 with john; ln2 with doe1 (1, 26, 5)
author3: fn3 with jane2 (1, 43, 5); ln3 with doe2 (1, 46, 5)
Decomposition             Number of Intermediate Results
author – fn – jane        1   (author3 – fn3 – jane2)
author – ln – doe         1   (author3 – ln3 – doe2)
Output: 1
Notation
Query (author with children fn and ln; fn has child jane, ln has child doe)
isLeaf(author) = false
isRoot(author) = true
parent(fn) = author
children(author) = {fn, ln}
subtreeNodes(author) = {fn, ln, jane, doe}
Streams
Ta: a1, a2, a3
Tfn: fn1, fn2, fn3
Tln: ln1, ln2, ln3
Tj: j1, j2
Td: d1, d2
eof(Ta) = false
advance(Ta) => Ta: a1, a2, a3 (the cursor moves to the next element)
next(Ta) = a1
nextL(Ta) = 6
nextR(Ta) = 20
Stacks
Sa: a3
Sfn: fn3
Sln, Sj, Sd: empty
empty(Sa) = false
pop(Sfn)
push(Sln, ln3, pointer to a3)
topL(Sa) = LeftPos of a3
topR(Sa) = RightPos of a3
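The stream and stack operations listed here map onto two very small data structures. The following is a minimal sketch, assuming each element is a (name, LeftPos, RightPos) tuple and that a stack entry additionally stores a pointer into the parent stack; the class names Stream and Stack are hypothetical, and the positions used are the ones given in the document figure.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Cursor over the elements of one tag, sorted by LeftPos."""
    elems: list   # (name, left, right) tuples
    pos: int = 0  # index of the current head element

    def eof(self):
        return self.pos >= len(self.elems)

    def advance(self):
        self.pos += 1

    def next(self):
        return self.elems[self.pos][0]

    def nextL(self):
        return self.elems[self.pos][1]

    def nextR(self):
        return self.elems[self.pos][2]


class Stack:
    """Stack whose entries carry a pointer into the parent query node's stack."""

    def __init__(self):
        self.items = []  # (name, left, right, pointer) tuples

    def empty(self):
        return not self.items

    def push(self, elem, pointer):
        self.items.append((*elem, pointer))

    def pop(self):
        return self.items.pop()

    def topL(self):
        return self.items[-1][1]

    def topR(self):
        return self.items[-1][2]


# jane1 is at position 8 and jane2 at position 43 in the example document.
Tj = Stream([("j1", 8, 8), ("j2", 43, 43)])
print(Tj.eof(), Tj.next(), Tj.nextL())
```

Text nodes are stored with LeftPos equal to RightPos here, which is a modelling assumption of the sketch.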
Algorithm: PathStack
Intuition:
While the streams of the leaves are not empty (i.e. a solution could be found) do:
- select the node with minimal LeftPos value and push it onto its stack
- if it is a leaf, print the solution
Example
Query: A – B – C
XML document: A1 contains B1, B1 contains A2, A2 contains B2, B2 contains C1
Streams
TA: A1, A2
TB: B1, B2
TC: C1
Solutions: A1B1C1, A1B2C1, A2B2C1
Comments                                          Stacks after the step
qmin = A; 06) moveStreamToStack(TA, SA, null)     SA: A1
qmin = B; 06) moveStreamToStack(TB, SB, A1)       SA: A1; SB: B1
qmin = A; 06) moveStreamToStack(TA, SA, null)     SA: A1, A2; SB: B1
qmin = B; 06) moveStreamToStack(TB, SB, A2)       SA: A1, A2; SB: B1, B2
qmin = C; 06) moveStreamToStack(TC, SC, B2)       SA: A1, A2; SB: B1, B2; SC: C1
07) isLeaf(C) = true
08) showSolutions(SC, 1)
09) pop(SC)                                       SA: A1, A2; SB: B1, B2
01) end(q) = true
Algorithm ends.
Procedure: showSolutions
Intuition:
- the stacks hold a compact encoding of the answers
- output is in leaf-to-root order
Query: A – B – C
Stacks
SA: A1, A2
SB: B1 (pointer to A1), B2 (pointer to A2)
SC: C1 (pointer to B2)
Output
C1B1A1
C1B2A1
C1B2A2
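The walkthrough and the showSolutions output can be reproduced with a short sketch of the stack-based path join. It is a simplified illustration rather than the paper's pseudocode: it assumes a linear query with distinct tags and ancestor-descendant edges only, keeps streams as in-memory lists, and stores pointers as indexes into the parent stack. The function name path_stack and the numeric positions of A1, B1, A2, B2 and C1 are hypothetical; only their nesting is taken from the slides.

```python
def path_stack(query, streams):
    """query: tags from root to leaf; streams: tag -> [(name, left, right)] sorted by left."""
    cursor = {q: 0 for q in query}
    stacks = {q: [] for q in query}  # entries: (name, left, right, pointer)
    leaf = query[-1]
    solutions = []

    def expand(level, idx):
        """Yield leaf-to-root tuples that start at entry idx of the stack at this level."""
        name, _, _, ptr = stacks[query[level]][idx]
        if level == 0:
            yield (name,)
            return
        # Every parent-stack entry at or below the pointer is an ancestor.
        for j in range(ptr + 1):
            for rest in expand(level - 1, j):
                yield (name,) + rest

    while cursor[leaf] < len(streams[leaf]):
        # qmin: the query node whose next stream element has the smallest LeftPos.
        live = [q for q in query if cursor[q] < len(streams[q])]
        qmin = min(live, key=lambda q: streams[q][cursor[q]][1])
        name, left, right = streams[qmin][cursor[qmin]]
        # Clean the stacks: drop entries that end before the new element starts.
        for q in query:
            while stacks[q] and stacks[q][-1][2] < left:
                stacks[q].pop()
        level = query.index(qmin)
        ptr = len(stacks[query[level - 1]]) - 1 if level > 0 else -1
        stacks[qmin].append((name, left, right, ptr))  # moveStreamToStack
        cursor[qmin] += 1
        if qmin == leaf:
            solutions.extend(expand(level, len(stacks[qmin]) - 1))
            stacks[qmin].pop()
    return solutions


# Assumed positions that realise the nesting A1 > B1 > A2 > B2 > C1.
streams = {
    "A": [("A1", 1, 10), ("A2", 3, 8)],
    "B": [("B1", 2, 9), ("B2", 4, 7)],
    "C": [("C1", 5, 6)],
}
result = path_stack(["A", "B", "C"], streams)
print(result)
```

With these inputs the sketch pushes A1, B1, A2, B2 and C1 in that order and emits the three leaf-to-root solutions listed on the slide.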
Analysis: PathStack
• Correctness
– (Theorem 3.1) Given a query path pattern Q and an XML database D, Algorithm PathStack correctly returns all answers for Q on D.
• Optimality
– (Theorem 3.2) Algorithm PathStack has worst-case I/O and CPU time complexities linear in the sum of sizes of the input lists and the output list.
PathMPMJ
• A naïve extension of MPMGJN could be to backtrack all possible solutions – PathMPMJNaive
• A much faster approach is to keep “k” pointers on the streams and prune part of the solutions - PathMPMJ
Query: A – B – C
TA = A1, A2, A3…
TB = B1, B2 … BK…
TC = C1, C2, C3 …
PathStack Limitations
• Merging the path queries for twig joins is not optimal
Example:
Query: author with children fn and ln; fn has child jane, ln has child doe
Evaluated as two path queries: author – fn – jane and author – ln – doe
XML document: the allauthors subtree shown earlier (author1, author2, author3)
Path solutions:
(a1, fn1, j1)
(a3, fn3, j2)
(a2, ln2, d1)
(a3, ln3, d2)
Query result:
(a3, fn3, ln3, j2, d2)
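The gap between path solutions and the final answer can be seen by joining the two lists on their shared author. The sketch below uses a simple hash join in place of a merge join, purely for brevity, and labels the elements with the subscripts from the document figure (jane2, doe1, doe2); merge_on_root is a hypothetical helper name.

```python
from collections import defaultdict


def merge_on_root(left_paths, right_paths):
    """Join two lists of root-first path solutions on their shared root element."""
    by_root = defaultdict(list)
    for p in right_paths:
        by_root[p[0]].append(p)
    # Keep the root once, then append the rest of the right-hand path.
    return [l + r[1:] for l in left_paths for r in by_root.get(l[0], [])]


fn_jane = [("a1", "fn1", "j1"), ("a3", "fn3", "j2")]  # author – fn – jane
ln_doe = [("a2", "ln2", "d1"), ("a3", "ln3", "d2")]   # author – ln – doe
result = merge_on_root(fn_jane, ln_doe)
print(result)
```

Four path solutions are produced but only the pair rooted at a3 survives the merge, so half of the intermediate work is wasted even on this tiny document. The tuple order differs from the slide, which lists fn3 and ln3 before j2 and d2.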
TwigStack
Intuition:
While the streams of the leaves are not empty (i.e. a solution could be found) do:
- select a node that could be expanded to a solution
- if it is a leaf, print the solution
TwigStack: Example
Query: author with children fn and ln; fn has child jane, ln has child doe
XML document: the allauthors subtree shown earlier (author1, author2, author3)
Streams
Ta: a1, a2, a3
Tfn: fn1, fn2, fn3
Tln: ln1, ln2, ln3
Tj: j1, j2
Td: d1, d2
Stacks: Sa, Sfn, Sln, Sj, Sd (initially empty)

Phase 1
01: while (notEmpty(Tj) || notEmpty(Td)) do:

Iteration 1
qact = getNext(a) => fn
  getNext(fn) => fn
    getNext(j) => j
    nmin = nmax = 8 (j1)
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 26 (d1)
    advance(Tln)
  nmin = 7 (fn1), nmax = ln2
  advance(Ta)
advance(Tfn)
Stacks: all empty

Iteration 2
qact = getNext(a) => j
  getNext(fn) => j
    getNext(j) => j
    nmin = nmax = 8 (j1)
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 26 (d1)
  nmin = 8 (j1), nmax = ln2
advance(Tj)
Stacks: all empty

Iteration 3
qact = getNext(a) => ln
  getNext(fn) => fn
    getNext(j) => j
    nmin = nmax = 43 (j2)
    advance(Tfn)
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 26 (d1)
  nmin = ln2, nmax = fn3
  advance(Ta)
advance(Tln)
Stacks: all empty

Iteration 4
qact = getNext(a) => d
  getNext(fn) => fn
    getNext(j) => j
    nmin = nmax = 43 (j2)
  getNext(ln) => d
    getNext(d) => d
    nmin = nmax = 26 (d1)
  nmin = 26 (d1), nmax = fn3
advance(Td)
Stacks: all empty

Iteration 5
qact = getNext(a) => a
  getNext(fn) => fn
    getNext(j) => j
    nmin = nmax = 43 (j2)
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 46 (d2)
  nmin = fn3, nmax = ln3
moveStreamToStack(Ta)
advance(Ta)
Stacks: Sa: a3

Iteration 6
qact = getNext(a) => fn
  getNext(fn) => fn
    getNext(j) => j
    nmin = nmax = 43 (j2)
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 46 (d2)
  nmin = fn3, nmax = ln3
moveStreamToStack(Tfn)
advance(Tfn)
Stacks: Sa: a3; Sfn: fn3

Iteration 7
qact = getNext(a) => j
  getNext(fn) => j
    getNext(j) => j
    nmin = nmax = 43 (j2)
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 46 (d2)
  nmin = 43 (j2), nmax = ln3
moveStreamToStack(Tj)
advance(Tj)
showSolutionsWithBlocking(j)
pop(Sj)
Stacks: Sa: a3; Sfn: fn3; Sj: j2 (before the pop)
“Merge-joinable” root-to-leaf path: (j2, fn3, a3)

Iteration 8
qact = getNext(a) => ln
  getNext(fn) => nil
    getNext(j) => nil
    nmin = nmax = nil
  getNext(ln) => ln
    getNext(d) => d
    nmin = nmax = 46 (d2)
  nmin = ln3, nmax = ln3
moveStreamToStack(Tln)
advance(Tln)
Stacks: Sa: a3; Sfn: fn3; Sln: ln3
“Merge-joinable” root-to-leaf path: (j2, fn3, a3)

Iteration 9
qact = getNext(a) => d
  getNext(fn) => nil
    getNext(j) => nil
    nmin = nmax = nil
  getNext(ln) => d
    getNext(d) => d
    nmin = nmax = 46 (d2)
  nmin = d, nmax = d
moveStreamToStack(Td)
advance(Td)
showSolutionsWithBlocking(d)
pop(Sd)
Stacks: Sa: a3; Sfn: fn3; Sln: ln3; Sd: d2 (before the pop)
“Merge-joinable” root-to-leaf paths: (j2, fn3, a3), (d2, ln3, a3)

Phase 2
12: MergeAllPathSolutions()
Stacks: Sa: a3; Sfn: fn3; Sln: ln3
TwigStack solution: (j2, fn3, d2, ln3, a3)
Analysis of TwigStack
• Let getNext(q) = qN
– qN has a minimal descendant extension
– for all qi ∈ subtreeNodes(qN), next(Tqi) = hqi
– Either q = qN or parent(qN) has no minimal right extension
• Any ancestor of qN whose extension uses hqN is returned by getNext before qN => correctness (TwigStack finds all solutions to q)
• TwigStack is time and space optimal for ancestor-descendant edges
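Suboptimality for parent-child edges
Example
Query: A with children B and C (parent-child edges)
[Figure: XML document tree with nodes A1, A2, B1, B2, C1, C2]
TS Phase 1 solutions:
(A1, B2, C2)   final solution
(A2, B1, C1)   final solution
(A1, B1, C1)
(A1, B1, C2)
Would be optimal for: A with descendants B and C (ancestor-descendant edges)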
TwigStack and XB-Trees
Example nodes (LeftPos:RightPos): a1 (2:95), a2 (3:50), a3 (6:48), a4 (10:45), a5 (20:30), a6 (55:58), a7 (60:94), a8 (62:75), a9 (70:72), a10 (80:88), a11 (80:88)
• XB-Trees – B+ trees with some additional features [1]
– Internal nodes have the form [L:R], sorted on L
– Parent node interval includes child node intervals
– Each page P has a pointer P.parent
• TwigStackXB – same as TwigStack with the following modifications
– Tq for a query node with an index is now the XB-tree rather than a stream
– The advance operation is modified according to the pointer act = (actPage, actIndex)
– The drilldown operation is introduced
[Figure: XB-tree over the example nodes, with internal [L:R] entries such as 2:95, 20:88, 20:58 and 60:94 above the leaf entries]
[1] “An Evaluation of XML indexes for Structural Join” demonstrates that, while B+, XR and XB trees all build the same tree structure, XB trees outperform the other two on “highly recursive” XML.
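The two structural properties of XB-tree internal nodes (entries of the form [L:R] sorted on L, and parent intervals that include the child intervals) can be illustrated by building one index level bottom-up. This is only a sketch of the bounding-interval idea: build_level is a hypothetical helper, the fanout of 4 is an arbitrary assumption, and the page boundaries therefore differ from the figure on the slide.

```python
def build_level(entries, fanout):
    """Group (L, R) entries sorted on L into pages; return the pages and one bounding entry per page."""
    pages = [entries[i:i + fanout] for i in range(0, len(entries), fanout)]
    parents = [(min(l for l, _ in page), max(r for _, r in page)) for page in pages]
    return pages, parents


# Leaf intervals a1..a10 from the example, sorted on LeftPos.
leaves = [(2, 95), (3, 50), (6, 48), (10, 45), (20, 30),
          (55, 58), (60, 94), (62, 75), (70, 72), (80, 88)]
pages, parents = build_level(leaves, fanout=4)  # fanout is an assumed page capacity
print(parents)
```

Because every parent entry bounds its whole page, a join that finds a parent interval irrelevant can skip the page without reading it, which is the situation the modified advance and drilldown operations exploit.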
Experimental Results
[Figures: PathStack (PS) vs TwigStack (TS) for binary twig query; PS vs TS for parent-child query]
Questions?