Hypersphere Dominance: An Optimal Approach

Cheng Long, Raymond Chi-Wing Wong, Bin Zhang, Min Xie
The Hong Kong University of Science and Technology

Prepared by Cheng Long
Presented by Cheng Long

24 June, 2014

Hyperspheres

A hypersphere in a d-dimensional space
  is specified by a center $c$ and a radius $r$
  is the set of all points whose distance from the center is bounded by the radius

2D: a disk
3D: a ball
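The definition above can be sketched in a few lines of Python. This is only an illustration: the class name `Hypersphere`, its fields and the method `contains` are hypothetical names, and Euclidean distance is assumed.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Hypersphere:
    """Hypothetical container: a center c and a radius r in d dimensions."""
    center: tuple  # coordinates of the center, one per dimension
    radius: float  # radius r >= 0

    def contains(self, point):
        # A point belongs to the hypersphere when its Euclidean
        # distance from the center is at most the radius.
        return math.dist(self.center, point) <= self.radius


disk = Hypersphere(center=(0.0, 0.0), radius=1.0)       # 2D: a disk
ball = Hypersphere(center=(0.0, 0.0, 0.0), radius=2.0)  # 3D: a ball
```

For instance, `disk.contains((0.5, 0.5))` holds while `disk.contains((1.0, 1.0))` does not.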

Hyperspheres are commonly used
  Uncertain databases
    the location of an uncertain object
  Spatial databases
    SS-tree, SS+-tree, M-tree, VP-tree and SR-tree

SS-tree: similar to R-tree with hyperrectangles replaced by hyperspheres

[Figure: layout of 8 objects A-H, and the SS-tree based on A-H]

Motivating example

Scenario
  Ada has her location uncertain, but constrained in a disk $S_a$.
  Bob has his location uncertain, but constrained in a disk $S_b$.
  Connie has her location uncertain, but constrained in a disk $S_q$.

Question
  Is Ada always closer to Connie than Bob?

[Figure: two layouts of the disks $S_a$ (Ada), $S_b$ (Bob) and $S_q$ (Connie)]

First layout: No

Second layout: Yes
  For this specification of the locations, Ada is closer to Connie than Bob.
  In fact, for all specifications of the locations, Ada is closer to Connie than Bob.

Hypersphere dominance: definition

Definition 1: Hypersphere dominance
Given three hyperspheres $S_a$, $S_b$ and $S_q$, it decides whether the dominance condition holds.

Dominance condition
  $\forall a \in S_a, \forall b \in S_b, \forall q \in S_q: Dist(a, q) < Dist(b, q)$
  Yes: $Dom(S_a, S_b, S_q) = true$
  No: $Dom(S_a, S_b, S_q) = false$

Basic operator used in many queries
  Probabilistic RkNN query [Lian and Chen, VLDBJ'09]
  AkNN query [Emrich et al., SSDBM'10]
  kNN query [Long et al., SIGMOD'14]
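To make the "for all specifications of the locations" reading concrete, here is a small sampling sketch. It is not one of the methods discussed in the talk: it can only refute dominance by finding a counterexample, never prove it. The function names are hypothetical, each hypersphere is assumed to be a `(center, radius)` pair, and Euclidean distance is assumed.

```python
import math
import random


def sample_in_ball(center, radius, rng):
    # Uniform sample from a d-dimensional ball: a random direction from
    # a Gaussian vector, with the radius scaled by U^(1/d).
    d = len(center)
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g)) or 1.0
    rho = radius * rng.random() ** (1.0 / d)
    return tuple(c + rho * v / norm for c, v in zip(center, g))


def find_counterexample(sa, sb, sq, trials=10000, seed=0):
    # Returns a triple (a, b, q) with Dist(a, q) >= Dist(b, q) if one is
    # sampled, which shows that Sa does not dominate Sb w.r.t. Sq.
    # Returns None otherwise, which proves nothing by itself.
    rng = random.Random(seed)
    for _ in range(trials):
        a = sample_in_ball(*sa, rng)
        b = sample_in_ball(*sb, rng)
        q = sample_in_ball(*sq, rng)
        if math.dist(a, q) >= math.dist(b, q):
            return a, b, q
    return None
```

A counterexample is found quickly when $S_a$ and $S_b$ coincide, while none exists when $S_b$ is far away from both $S_a$ and $S_q$.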

Hypersphere dominance: existing solutions--overview

MinMax [Roussopoulos et al., SIGMOD Record'95; Hjaltason and Samet, TODS'99]

MBR [Emrich et al., SIGMOD'10]

GP [Lian and Chen, VLDBJ'09]

Trigonometric [Emrich et al., SSDBM'10]

Hypersphere dominance: existing solutions--MinMax (1)

$MaxDist(S_a, S_b)$
  Definition: the maximum distance between a point in $S_a$ and a point in $S_b$
  $MaxDist(S_a, S_b) = Dist(c_a, c_b) + r_a + r_b$

$MinDist(S_a, S_b)$
  Definition: the minimum distance between a point in $S_a$ and a point in $S_b$
  $MinDist(S_a, S_b) = 0$ ($S_a$ and $S_b$ overlap)
  $MinDist(S_a, S_b) = Dist(c_a, c_b) - r_a - r_b$ ($S_a$ and $S_b$ do not overlap)

[Figure: $S_a$ and $S_b$ with centers $c_a$, $c_b$ and radii $r_a$, $r_b$, showing $MaxDist(S_a, S_b)$, $MinDist(S_a, S_b)$ for disjoint spheres, and $MinDist(S_a, S_b) = 0$ for overlapping spheres]
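The two distance bounds translate directly into code. A minimal sketch, assuming Euclidean distance and hyperspheres given as `(center, radius)` pairs; the function names are hypothetical.

```python
import math


def max_dist(s1, s2):
    # MaxDist = Dist(c1, c2) + r1 + r2
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) + r1 + r2


def min_dist(s1, s2):
    # MinDist = Dist(c1, c2) - r1 - r2 for disjoint spheres,
    # and 0 when the spheres overlap (the difference is then <= 0).
    (c1, r1), (c2, r2) = s1, s2
    return max(0.0, math.dist(c1, c2) - r1 - r2)
```

Both functions take O(d) time, since the only non-constant work is one distance computation.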

Hypersphere dominance: existing solutions--MinMax (2)

MinMax
  Compute $MaxDist(S_a, S_q)$
  Compute $MinDist(S_b, S_q)$
  If $MaxDist(S_a, S_q) < MinDist(S_b, S_q)$
    Return true
  Else
    Return false

[Figure: two layouts of $S_a$, $S_b$ and $S_q$, one of them drawn with a bisector]

$MaxDist(S_a, S_q) < MinDist(S_b, S_q)$ and $Dom(S_a, S_b, S_q) = true$: MinMax returns true, which is correct

$MaxDist(S_a, S_q) > MinDist(S_b, S_q)$ and $Dom(S_a, S_b, S_q) = true$: MinMax returns false, a "false negative"
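The MinMax test can be sketched as below, together with a hand-made instance of the false negative. The helper names and the example coordinates are assumptions chosen for illustration; hyperspheres are `(center, radius)` pairs and distance is Euclidean.

```python
import math


def max_dist(s1, s2):
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) + r1 + r2


def min_dist(s1, s2):
    (c1, r1), (c2, r2) = s1, s2
    return max(0.0, math.dist(c1, c2) - r1 - r2)


def minmax_dominates(sa, sb, sq):
    # MinMax: report dominance only if the farthest point of Sa is still
    # closer to Sq than the nearest point of Sb.
    return max_dist(sa, sq) < min_dist(sb, sq)


# Sa and Sb are single points, so the true answer is decided by the
# bisector x = 5: dominance holds iff Sq lies strictly on Sa's side.
sa = ((0.0, 0.0), 0.0)
sb = ((10.0, 0.0), 0.0)
near = ((-1.0, 0.0), 1.0)    # MaxDist = 2 < MinDist = 10
wide = ((-10.0, 0.0), 14.0)  # spans x in [-24, 4]: MaxDist = 24 > MinDist = 6
```

Here `minmax_dominates(sa, sb, near)` is true. For `wide`, every point of the disk has x at most 4 and is therefore closer to `sa` than to `sb`, yet MinMax returns false.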

Hypersphere dominance: existing solutions--Insufficiency

Methods                   Correct?  Sound?  Efficient?
MinMax                    Yes       No      Yes
MBR                       Yes       No      Yes
GP                        Yes       No      Yes
Trigonometric             No        Yes     Yes
Our approach (Hyperbola)  Yes       Yes     Yes

Criteria of a method:
  1. Correctness: no false positive
  2. Soundness: no false negative
  3. Efficiency: runs in O(d), where d is the number of dimensions

Our approach is the only one which is correct, sound and efficient!

Our approach: major idea

Step 1: pre-checking
  For cases where it is easy to decide whether the dominance condition is true
  Make the decision directly

Step 2: dominance checking
  For cases where it is difficult to decide whether the dominance condition is true directly
  Derive an equivalent condition of the dominance condition which is easier to decide
  Make the decision

Our approach: pre-checking

Step 1: Pre-checking:
  If $S_a$ and $S_b$ overlap
    Return false
  If $S_b$ and $S_q$ overlap
    Return false

[Figure, first layout: $S_a$ and $S_b$ overlap, so $Dom(S_a, S_b, S_q) = false$]

[Figure, second layout: $S_b$ and $S_q$ overlap, so $Dom(S_a, S_b, S_q) = false$]
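The pre-checking step can be sketched as follows. It assumes closed hyperspheres (touching spheres count as overlapping), Euclidean distance and `(center, radius)` pairs; the names `overlap` and `pre_check` are hypothetical. The reasoning is that a shared point of $S_a$ and $S_b$ gives equal distances to any $q$, and a shared point of $S_b$ and $S_q$ gives a distance of 0 from $b$ to $q$.

```python
import math


def overlap(s1, s2):
    # Two closed hyperspheres share a point iff the distance between
    # their centers is at most the sum of their radii.
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) <= r1 + r2


def pre_check(sa, sb, sq):
    # Returns False when dominance can be ruled out directly, and None
    # when the dominance-checking step is still needed.
    if overlap(sa, sb):
        return False
    if overlap(sb, sq):
        return False
    return None
```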

Our approach: dominance checking (1)

Step 2: Dominance checking: derive an equivalent condition of the dominance condition and check whether the derived condition is true

Dominance condition: $\forall a \in S_a, \forall b \in S_b, \forall q \in S_q: Dist(a, q) < Dist(b, q)$

Equivalent condition (1):

Proof of the equivalence between Condition (1) and Condition (2): "=>": by contradiction; "<=":

Our approach: dominance checking (2)

Equivalent condition (2):

Equivalent condition (3):

$MaxDist(q, S_a) = Dist(q, c_a) + r_a + 0 = Dist(q, c_a) + r_a$
$MinDist(q, S_b) = Dist(q, c_b) - r_b - 0 = Dist(q, c_b) - r_b$
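The formulas of the equivalent conditions did not survive in this transcript. One plausible reconstruction, assuming Euclidean distance and using only the two identities for $MaxDist(q, S_a)$ and $MinDist(q, S_b)$ given just above, is the chain below. The exact form and numbering used on the original slides are assumptions here.

```latex
\begin{align*}
Dom(S_a, S_b, S_q) = true
  &\iff \forall q \in S_q:\ MaxDist(q, S_a) < MinDist(q, S_b) \\
  &\iff \forall q \in S_q:\ Dist(q, c_a) + r_a < Dist(q, c_b) - r_b \\
  &\iff \forall q \in S_q:\ Dist(q, c_b) - Dist(q, c_a) > r_a + r_b
\end{align*}
```

The second step relies on $S_b$ and $S_q$ not overlapping, which the pre-checking step guarantees, so that $MinDist(q, S_b) = Dist(q, c_b) - r_b$ for every $q \in S_q$.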

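Our approach: dominance checking (3)

Equivalent condition (3):

Space partitioning: Boundary $P$, Region $R_a$, Region $R_b$

Equivalent condition (4): $S_q$ is in Region $R_a$ (every $q \in S_q$ is in Region $R_a$)

[Figure: $S_a$ and $S_b$ with centers $c_a$ and $c_b$, the boundary $P$ between Region $R_a$ and Region $R_b$, and $S_q$ with center $c_q$]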
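The definitions of the boundary and the two regions are missing from this transcript. The sketch below assumes that Region $R_a$ is the set of points $x$ with $Dist(x, c_b) - Dist(x, c_a) > r_a + r_b$, which is one side of a hyperbola with foci $c_a$ and $c_b$; that definition, the function name `in_region_a` and the `(center, radius)` representation are all assumptions.

```python
import math


def in_region_a(x, sa, sb):
    # Assumed definition of Region Ra: points whose distance to cb
    # exceeds their distance to ca by more than ra + rb. Points on the
    # boundary (equality) are not in Ra.
    (ca, ra), (cb, rb) = sa, sb
    return math.dist(x, cb) - math.dist(x, ca) > ra + rb


sa = ((0.0, 0.0), 1.0)
sb = ((10.0, 0.0), 1.0)
```

With these two spheres, the point (3.5, 0) is in Region $R_a$, the point (4, 0) lies on the assumed boundary, and the midpoint (5, 0) is not in $R_a$.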

Our approach: dominance checking (4)

Equivalent condition (4): $S_q$ is in Region $R_a$

Equivalent condition (5): $c_q$ is in Region $R_a$ and $\min_{x \in P} Dist(c_q, x) > r_q$

[Figure: $S_a$ and $S_b$ with centers $c_a$ and $c_b$, the boundary $P$ between Region $R_a$ and Region $R_b$, and $S_q$ with center $c_q$ and radius $r_q$; $\min_{x \in P} Dist(c_q, x)$ is the distance from $c_q$ to the boundary]

Our approach (2)

Equivalent condition (5): $c_q$ is in Region $R_a$ and $\min_{x \in P} Dist(c_q, x) > r_q$

Compute $\min_{x \in P} Dist(c_q, x)$
  constraint: $x \in P$
  objective: minimize $Dist(c_q, x)$

We use the Lagrange Multiplier (LM) method. Details can be found in the paper.

Correct and sound: Condition (3) is equivalent to the dominance condition
Efficient: each condition transformation takes O(d) time and the cost of LM is also O(d)

Space partitioning: Boundary $P$, Region $R_a$, Region $R_b$
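For orientation, the constrained problem can be written out as below. This is only a generic Lagrange-multiplier set-up under the assumption that the boundary is $P = \{x : Dist(x, c_b) - Dist(x, c_a) = r_a + r_b\}$; it is not the derivation from the paper, whose O(d) solution is not reproduced here. Minimizing the squared distance gives the same minimizer as minimizing the distance.

```latex
\begin{align*}
\min_{x}\ & \|x - c_q\|^2
  \quad \text{s.t.} \quad \|x - c_b\| - \|x - c_a\| = r_a + r_b, \\
\mathcal{L}(x, \lambda) &= \|x - c_q\|^2
  + \lambda \bigl( \|x - c_b\| - \|x - c_a\| - (r_a + r_b) \bigr), \\
\nabla_x \mathcal{L} &= 2 (x - c_q)
  + \lambda \Bigl( \frac{x - c_b}{\|x - c_b\|} - \frac{x - c_a}{\|x - c_a\|} \Bigr) = 0 .
\end{align*}
```

A stationary point $x^*$ of this system that attains the minimum gives $\min_{x \in P} Dist(c_q, x) = \|x^* - c_q\|$, which is then compared with $r_q$.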

Empirical study: set-up

Datasets:
  Real datasets: NBA, Color, Texture, and Forest
  Synthetic datasets

Algorithms:
  MinMax, MBR, GP, Trigonometric, Hyperbola (our method)

Measures:
  precision = TP/(TP+FP)
  recall = TP/(TP+FN)
  running time

A correct method always has precision equal to 1
A sound method always has recall equal to 1

Criteria of a method:
  1. Correctness: no false positive (FP)
  2. Soundness: no false negative (FN)
  3. Efficiency: runs in O(d), where d is the number of dimensions
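The two quality measures can be computed as in the sketch below, where `predicted` holds a method's answers and `actual` holds the true dominance results for the same test cases. The function name and the convention of returning 1.0 for an empty denominator are assumptions.

```python
def precision_recall(predicted, actual):
    # Count true positives, false positives and false negatives over
    # paired boolean answers.
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    # Assumed convention: an empty denominator counts as a perfect score.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# A method with one false negative and no false positive:
# precision stays at 1, recall drops below 1.
precision, recall = precision_recall(
    predicted=[True, False, False, True],
    actual=[True, True, False, True],
)
```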

Empirical study: results (precision, NBA)

All algorithms except Trigonometric have precision = 1.


Empirical study: results (recall, NBA)

Only our approach (Hyperbola) and Trigonometric have recall = 1.


Empirical study: results (running time, NBA)

Running time: MinMax < GP < Hyperbola (our method) < MBR < Trigonometric

Conclusion

First solution for the hypersphere dominance problem that is correct, sound and efficient in any number of dimensions

An application study: kNN

Experiments

Q & A

The following slides are for backup use only

Hyperspheres in uncertain databases

Song and Roussopoulos [SSTD'01]
Cheng et al. [TKDE'04]
Chen and Cheng [ICDE'07]
Beskales et al. [PVLDB'08]

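Our approach (1)

Definition 1: Hypersphere dominance
Given three hyperspheres $S_a$, $S_b$ and $S_q$, it decides whether the dominance condition holds.
  Yes: $Dom(S_a, S_b, S_q) = true$
  No: $Dom(S_a, S_b, S_q) = false$

Major idea: derive an equivalent condition of the dominance condition and check whether the derived condition is true

Dominance condition: $\forall a \in S_a, \forall b \in S_b, \forall q \in S_q: Dist(a, q) < Dist(b, q)$

Equivalent condition (1):

Equivalent condition (2):

Equivalent condition (3):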

An application study: kNN query

kNN query: Given a set D of hyperspheres, a query hypersphere $S_q$, and an integer $k$, the query finds the set of hyperspheres in D each of which is not dominated, with respect to $S_q$, by the hypersphere in D with the k-th smallest maximum distance from $S_q$.

Solution:
  A best-first search algorithm based on SS-tree
  Some pruning strategies
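The query definition can be illustrated with a plain linear scan. This is not the best-first SS-tree algorithm mentioned above, and it plugs in the MinMax test as a stand-in dominance check, so it may keep extra candidates (MinMax has false negatives). All names are hypothetical; hyperspheres are `(center, radius)` pairs with Euclidean distance.

```python
import math


def max_dist(s1, s2):
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) + r1 + r2


def min_dist(s1, s2):
    (c1, r1), (c2, r2) = s1, s2
    return max(0.0, math.dist(c1, c2) - r1 - r2)


def minmax_dominates(sa, sb, sq):
    # Stand-in dominance test: never wrongly reports dominance,
    # but can miss true dominance.
    return max_dist(sa, sq) < min_dist(sb, sq)


def knn_candidates(data, sq, k, dominates=minmax_dominates):
    # Hypersphere with the k-th smallest maximum distance from sq.
    kth = sorted(data, key=lambda s: max_dist(s, sq))[k - 1]
    # Keep every hypersphere that kth does not dominate w.r.t. sq.
    return [s for s in data if not dominates(kth, s, sq)]


sq = ((0.0, 0.0), 1.0)
A = ((2.0, 0.0), 0.5)
B = ((4.0, 0.0), 0.5)
C = ((20.0, 0.0), 0.5)
result = knn_candidates([A, B, C], sq, k=1)
```

With `k=1` the reference hypersphere is `A`; `C` is pruned because even its nearest point is farther from `sq` than the farthest point of `A`, while `A` and `B` remain as candidates. Passing an exact dominance test as `dominates` would tighten the result.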

Illustration 1: 2D space, $S_a$ and $S_b$ are two points (i.e., $r_a = 0$, $r_b = 0$)

[Figure: the boundary $P$ between Region $R_a$ and Region $R_b$, the points $S_a$ ($c_a$) and $S_b$ ($c_b$), and $S_q$ with center $c_q$]