Selectivity Estimation of XPath for Cyclic Graphs
Yun Peng
Outline
Motivation
Problem definition
Prime number labeling
Selectivity estimation
Implementation
Motivation
Selectivity estimation is one of the most important query optimization techniques for retrieving subgraphs from large graph databases efficiently.
An Example
The query q = //faculty[//RA][//TA] lists all faculty nodes that have both an RA and a TA. There are two plans for evaluating it.
Plan 1: Find the faculty nodes that have an RA (result set size 3), then keep those in the intermediate result that also have a TA.
Plan 2: Find the faculty nodes that have a TA (result set size 2), then keep those in the intermediate result that also have an RA.
Plan 2 starts from the smaller intermediate result, so knowing both sizes in advance lets the optimizer choose it.
[Figure: example tree with a department root, four faculty children, and name, RA, and TA leaves]
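The two plans can be compared with a short sketch. The document below is hypothetical: the slide's figure is not recoverable, so the layout is an assumption chosen only to match the result sizes 3 and 2 quoted above. The helper names `has_descendant` and `run_plan` are also illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: this layout is an assumption chosen to match
# the sizes 3 (faculty with an RA) and 2 (faculty with a TA) in the text.
DOC = """
<department>
  <faculty><name/><RA/></faculty>
  <faculty><name/><TA/></faculty>
  <faculty><name/><RA/><TA/></faculty>
  <faculty><name/><RA/><RA/></faculty>
</department>
"""

def has_descendant(node, tag):
    """True if `node` has at least one descendant element named `tag`."""
    return node.find(f".//{tag}") is not None

def run_plan(root, first_tag, second_tag):
    """Filter faculty nodes by first_tag, then by second_tag.

    Returns (intermediate result size, final result size).
    """
    faculty = root.findall(".//faculty")
    intermediate = [f for f in faculty if has_descendant(f, first_tag)]
    result = [f for f in intermediate if has_descendant(f, second_tag)]
    return len(intermediate), len(result)

root = ET.fromstring(DOC)
ra_first = run_plan(root, "RA", "TA")  # Plan 1
ta_first = run_plan(root, "TA", "RA")  # Plan 2
print("RA first:", ra_first)
print("TA first:", ta_first)
```

Both plans return the same final result; they differ only in how many intermediate nodes the second filter has to inspect.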
Problem Definition
Selectivity estimation: given a query, estimate how many results it produces without the cost of evaluating it
[Figure: example tree with a department root, four faculty children, and name, RA, and TA leaves]
q=//faculty[//RA]
Selectivity(q) = 3
Our methodology skeleton
Step 1: Label the graph nodes (prepared in advance)
Step 2: Estimate query selectivity from the prepared labels (when a query arrives)
Prime number labeling
Label each graph node with an integer that is a product of prime numbers
Prime number labeling (cont.)
Divisibility of labels implies an ancestor-descendant relationship
For example, 3*5*7*11 is divisible by 11, so node g is a descendant of node a
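A minimal sketch of the divisibility test follows. The slides do not spell out how primes are assigned, so this uses one assumed scheme: every node gets its own prime, and a node's label is the product of the primes of all nodes reachable from it (itself included). Under that assumption, node u is reachable from node v exactly when label(v) is divisible by label(u), and the scheme also works when the graph has cycles. All function names are illustrative.

```python
from math import prod

def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def reachable(graph, start):
    """All nodes reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def prime_labels(graph):
    """Assumed scheme: label(v) = product of the primes of reachable(v)."""
    nodes = sorted(set(graph) | {w for ws in graph.values() for w in ws})
    own_prime = dict(zip(nodes, first_primes(len(nodes))))
    return {v: prod(own_prime[u] for u in reachable(graph, v)) for v in nodes}

def is_descendant(labels, ancestor, descendant):
    """Divisibility test: the ancestor's label is a multiple of the descendant's."""
    return ancestor != descendant and labels[ancestor] % labels[descendant] == 0

# Small cyclic graph: a -> b -> d, a -> c -> a.
GRAPH = {"a": ["b", "c"], "b": ["d"], "c": ["a"], "d": []}
LABELS = prime_labels(GRAPH)
print(LABELS)
print(is_descendant(LABELS, "a", "d"), is_descendant(LABELS, "b", "a"))
```

Nodes on the same cycle reach the same set of nodes, so they receive equal labels and each tests as a descendant of the other.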
Optimization
Replace integers by vectors
a  1 1 1 1
b  1 1 0 0
c  1 0 1 1
d  1 0 0 0
e  0 1 0 0
f  0 0 1 0
g  1 0 0 1
Optimization (cont.)
$VL(a) - VL(b) \ge 0$ (componentwise) implies node b is a descendant of node a
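The vector test can be sketched as follows, assuming the condition means that every component of VL(a) is at least the corresponding component of VL(b), which is the vector analogue of the divisibility test. The vectors are the ones listed on the "Replace integers by vectors" slide, read as one row per node from a to g; the function name is illustrative.

```python
# Vector labels from the slide, read as one row per node (a to g).
VL = {
    "a": (1, 1, 1, 1),
    "b": (1, 1, 0, 0),
    "c": (1, 0, 1, 1),
    "d": (1, 0, 0, 0),
    "e": (0, 1, 0, 0),
    "f": (0, 0, 1, 0),
    "g": (1, 0, 0, 1),
}

def is_descendant(vl_ancestor, vl_descendant):
    """Assumed reading: VL(ancestor) - VL(descendant) >= 0 in every component."""
    return all(x >= y for x, y in zip(vl_ancestor, vl_descendant))

print(is_descendant(VL["a"], VL["b"]))  # every 1 in b is also set in a
print(is_descendant(VL["d"], VL["e"]))  # e has a 1 where d has a 0
```

Compared with multiplying primes, the vector form keeps labels at a fixed width and replaces big-integer division with a componentwise comparison.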
Our methodology skeleton
Step 1: Label the graph nodes (prepared in advance)
Step 2: Estimate query selectivity from the prepared labels (when a query arrives)
Selectivity Estimation
Two-dimensional histogram
Originally designed for selectivity estimation on trees [Jagadish 2004]
Label each tree node with an interval, e.g. (l, r)
Represent the interval as a point (l, r) in the xy coordinate plane
Partition the plane into grid cells, which serve as buckets
Estimate results using this histogram
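The steps above can be sketched as follows. This is a generic illustration, not the estimator from the cited work: it assigns (l, r) intervals with one depth-first traversal, counts points per square bucket, and estimates how many nodes fall inside a given interval by assuming points are spread uniformly within each bucket. The function names, the `width` parameter, and the toy tree are assumptions.

```python
from collections import Counter

def interval_labels(tree, root):
    """Assign (l, r) to every node in one DFS: l on entry, r on exit."""
    labels = {}
    counter = 0

    def visit(v):
        nonlocal counter
        counter += 1
        left = counter
        for child in tree.get(v, ()):
            visit(child)
        counter += 1
        labels[v] = (left, counter)

    visit(root)
    return labels

def build_histogram(points, width):
    """Count points per square bucket of side `width`."""
    return Counter((l // width, r // width) for l, r in points)

def overlap(lo, hi, bucket, width):
    """Length of the overlap between [lo, hi] and one bucket's range."""
    return max(0, min(hi, (bucket + 1) * width) - max(lo, bucket * width))

def estimate_inside(hist, lo, hi, width):
    """Estimate the number of points with both coordinates inside [lo, hi].

    Assumes points are uniformly distributed within each bucket, so a bucket
    contributes its count scaled by the fraction of its area that overlaps.
    """
    total = 0.0
    for (bx, by), count in hist.items():
        area = overlap(lo, hi, bx, width) * overlap(lo, hi, by, width)
        total += count * area / width ** 2
    return total

# Toy tree (hypothetical): department -> f1, f2; f1 -> n1, ra1; f2 -> n2.
TREE = {"department": ["f1", "f2"], "f1": ["n1", "ra1"], "f2": ["n2"]}
LABELS = interval_labels(TREE, "department")
lo, hi = LABELS["f1"]
fine = estimate_inside(build_histogram(LABELS.values(), 1), lo, hi, 1)
coarse = estimate_inside(build_histogram(LABELS.values(), 4), lo, hi, 4)
print(LABELS["f1"], fine, coarse)
```

A descendant's interval is nested inside its ancestor's, so its point lies in the square spanned by the ancestor's interval. Smaller buckets give a tighter estimate at the cost of a larger histogram.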
Selectivity Estimation (cont.)
Optimization
Replace integers by vectors
a  1 1 1 1
b  1 1 0 0
c  1 0 1 1
d  1 0 0 0
e  0 1 0 0
f  0 0 1 0
g  1 0 0 1
Consecutive Ones Property Matrix
Given a 0/1 matrix, if the columns can be ordered so that the 1s in every row are consecutive, the matrix is called a consecutive ones property matrix (C1P matrix)
Reordering the columns of a C1P matrix takes linear time
Finding the largest C1P submatrix is NP-hard, and if every column contains more than 3 ones, it cannot be approximated in polynomial time
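The definition can be checked directly on small matrices. The sketch below is a brute-force search over column orders, so it is exponential and meant only to illustrate the definition; the linear-time approach mentioned above relies on a dedicated data structure (PQ-trees, in the classic Booth and Lueker algorithm) that is not shown here. Function names and the demo matrices are illustrative.

```python
from itertools import permutations

def rows_consecutive(matrix, order):
    """True if, with columns arranged as `order`, each row's 1s are contiguous."""
    for row in matrix:
        ones = [pos for pos, col in enumerate(order) if row[col]]
        if ones and ones[-1] - ones[0] + 1 != len(ones):
            return False
    return True

def find_c1p_order(matrix):
    """Return a column order that makes every row consecutive, or None.

    Brute force over all permutations: for illustration only.
    """
    ncols = len(matrix[0])
    for order in permutations(range(ncols)):
        if rows_consecutive(matrix, order):
            return order
    return None

# Columns 0 and 2 must be adjacent, and so must columns 1 and 2.
DEMO = [[1, 0, 1],
        [0, 1, 1]]
# Every pair of columns must be adjacent, which three columns in a line cannot satisfy.
TRIANGLE = [[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]]
print(find_c1p_order(DEMO))
print(find_c1p_order(TRIANGLE))
```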
Add extra columns
Original matrix:
   0 1 2 3
a  1 1 1 1
b  1 1 0 0
c  1 0 1 1
d  1 0 0 0
e  0 1 0 0
f  0 0 1 0
g  1 0 0 1

With extra column 4 (Map: 4 → 0):
   0 1 2 3 4
a  1 1 1 1 0
b  1 1 0 0 0
c  0 0 1 1 1
d  1 0 0 0 0
e  0 1 0 0 0
f  0 0 1 0 0
g  0 0 0 1 1
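The example can be checked with a short sketch. It assumes the garbled "Map" annotation on the slide reads "4 → 0", meaning the extra column 4 stands in for column 0. Under that assumption, folding column 4 back into column 0 should recover the original matrix, and every row of the extended matrix should already have consecutive 1s in the column order 0 to 4. The names `collapse` and `is_consecutive` are illustrative.

```python
ORIGINAL = {
    "a": [1, 1, 1, 1],
    "b": [1, 1, 0, 0],
    "c": [1, 0, 1, 1],
    "d": [1, 0, 0, 0],
    "e": [0, 1, 0, 0],
    "f": [0, 0, 1, 0],
    "g": [1, 0, 0, 1],
}
EXTENDED = {
    "a": [1, 1, 1, 1, 0],
    "b": [1, 1, 0, 0, 0],
    "c": [0, 0, 1, 1, 1],
    "d": [1, 0, 0, 0, 0],
    "e": [0, 1, 0, 0, 0],
    "f": [0, 0, 1, 0, 0],
    "g": [0, 0, 0, 1, 1],
}
COLUMN_MAP = {4: 0}  # assumed reading of the slide's "Map" note

def collapse(row, column_map, width):
    """Fold each extra column back into the original column it maps to."""
    merged = row[:width]
    for extra, target in column_map.items():
        merged[target] = merged[target] | row[extra]
    return merged

def is_consecutive(row):
    """True if the 1s in `row` form one contiguous run (or there are none)."""
    ones = [i for i, bit in enumerate(row) if bit]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)

recovered = {n: collapse(r, COLUMN_MAP, 4) for n, r in EXTENDED.items()}
print(recovered == ORIGINAL)
print(all(is_consecutive(r) for r in EXTENDED.values()))
```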
Add extra columns
Open question: given a 0/1 matrix, is adding the minimum number of extra columns so that the resulting matrix is a C1P matrix NP-hard?
Heuristic algorithm
Duplicate
Merge
    1 2 3 4 5 6
r1  1 1 1 0 1 1
r2  0 1 1 0 1 0
r3  0 0 1 1 1 1
Heuristic algorithm (cont.)
Heuristic Algorithm (cont.)
Original column order:
    1 2 3 4 5 6
r1  1 1 1 0 1 1
r2  0 1 1 0 1 0
r3  0 0 1 1 1 1

Reordered columns:
    1 2 3 6 5 4
r1  1 1 1 1 1 0
r2  0 1 1 0 1 0
r3  0 0 1 1 1 1
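The effect of the column order shown on this slide can be checked with a small sketch. It only applies the order 1 2 3 6 5 4 to the three-row matrix and reports which rows have consecutive 1s before and after; it is not the heuristic itself, whose duplicate and merge steps are not described in the surviving text. Function names are illustrative.

```python
def reorder(matrix, order):
    """Rearrange columns; `order` lists 1-based original column numbers."""
    return [[row[c - 1] for c in order] for row in matrix]

def is_consecutive(row):
    """True if the 1s in `row` form one contiguous run (or there are none)."""
    ones = [i for i, bit in enumerate(row) if bit]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)

MATRIX = [
    [1, 1, 1, 0, 1, 1],  # r1
    [0, 1, 1, 0, 1, 0],  # r2
    [0, 0, 1, 1, 1, 1],  # r3
]
ORDER = [1, 2, 3, 6, 5, 4]  # column order shown on the slide

before = [is_consecutive(row) for row in MATRIX]
after = [is_consecutive(row) for row in reorder(MATRIX, ORDER)]
print("before:", before)
print("after: ", after)
```

Under this particular order, r1 becomes consecutive and r3 stays consecutive, while r2 still has a gap.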
Selectivity Estimation (cont.)
Implementation