SLICE: Reviving Regions-Based Pruning for Reverse k Nearest Neighbors Queries

Shiyu Yang (1), Muhammad Aamir Cheema (2,1), Xuemin Lin (1,3), Ying Zhang (4,1)

1 The University of New South Wales, Australia
2 Monash University, Australia
3 East China Normal University, China
4 University of Technology, Sydney, Australia

Faculty of Engineering, Computer Science and Engineering

Upload: jasmin-francis

Post on 16-Dec-2015



Page 2:

Introduction

• k Nearest Neighbor (kNN) Query
  – Find every facility that is one of the k closest facilities to the query user.

• Reverse k Nearest Neighbor (RkNN) Query
  – Find every user for which the query facility is one of the k closest facilities.

• The RkNNs of a facility are its potential customers.

[Figure: users u1, u2, u3 and facilities f1, f2, f3, with k = 1]
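The two query types can be illustrated with a brute-force sketch. This is only a reference definition in code, not one of the algorithms discussed in the talk. The function names and the representation of points as `(x, y)` tuples are assumptions made for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math

def k_nearest_facilities(user, facilities, k):
    """kNN query: the k facilities closest to `user`."""
    return sorted(facilities, key=lambda f: math.dist(user, f))[:k]

def reverse_k_nearest_users(query, users, facilities, k):
    """RkNN query: every user that has `query` among its k closest facilities.

    `facilities` holds the competing facilities and does not contain `query`.
    """
    result = []
    for u in users:
        d_q = math.dist(u, query)
        # Count the facilities that are strictly closer to u than the query is.
        closer = sum(1 for f in facilities if math.dist(u, f) < d_q)
        if closer < k:
            result.append(u)
    return result
```

Every user is compared against every facility here, which is the cost that the pruning techniques on the following slides try to avoid.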

Page 3:

Related Work

Phases: Pruning, Verification

Pruning strategies:
• Half-space: TPL (VLDB 2004), FINCH (VLDB 2008), InfZone (ICDE 2011)
• Region-based: Six-regions (SIGMOD 2000)

Algorithms in chronological order:
• Six-regions (SIGMOD 2000)
• TPL (VLDB 2004)
• FINCH (VLDB 2008)
• Boost (SIGMOD 2010)
• InfZone (ICDE 2011)

Page 4:

Related Work

• Regions-based Pruning: Six-regions (SIGMOD 2000)
  1. Divide the whole space centred at the query q into six equal regions.
  2. Find the k-th nearest facility of q in each region.
  3. The k-th nearest facility of q in each region defines the area that can be pruned.

The user points that cannot be pruned must be verified by a range query.

[Figure: query q, facilities a, b, c, d and users u1, u2, with k = 2]
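The three steps can be sketched as follows. This is a minimal in-memory illustration that assumes points are `(x, y)` tuples and omits the spatial index a real implementation would use. The names `region_index` and `six_regions_candidates` are hypothetical. Users that survive the filter still need the range-query verification mentioned above.

```python
import math

def region_index(q, p):
    """Index (0..5) of the 60-degree region around q that contains p."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    return min(int(angle / (math.pi / 3)), 5)

def six_regions_candidates(q, users, facilities, k):
    """Return the users that Six-regions pruning cannot discard."""
    # Step 1: assign each facility's distance from q to its region.
    per_region = [[] for _ in range(6)]
    for f in facilities:
        per_region[region_index(q, f)].append(math.dist(q, f))
    # Step 2: distance to the k-th nearest facility per region
    # (infinite when the region holds fewer than k facilities).
    radius = [sorted(d)[k - 1] if len(d) >= k else math.inf for d in per_region]
    # Step 3: a user farther from q than its region's radius is pruned.
    return [u for u in users if math.dist(q, u) <= radius[region_index(q, u)]]
```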

Page 5:

Related Work

• Half-space Pruning: the space that is contained by k half-spaces can be pruned.

• TPL (VLDB 2004)
  1. Find the nearest facility f in the unpruned area.
  2. Draw the bisector between q and f and prune using the resulting half-space.
  3. Iteratively access the nearest facility in the unpruned area.

[Figure: query q and facilities a, b, c, d, with k = 2]
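The half-space test itself is simple to state in code. The sketch below only shows the pruning condition: a point is pruned once it falls in the half-spaces of at least k facilities. It does not reproduce TPL's index traversal or its order of facility access. The function names and the tuple representation of points are assumptions.

```python
import math

def in_pruning_half_space(p, q, f):
    """True when p lies on f's side of the perpendicular bisector of q and f."""
    return math.dist(p, f) < math.dist(p, q)

def is_pruned(p, q, facilities, k):
    """True when p is contained in the half-spaces of at least k facilities."""
    count = 0
    for f in facilities:
        if in_pruning_half_space(p, q, f):
            count += 1
            if count >= k:
                return True
    return False
```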

Page 6:

Related Work

• Half-space Pruning: InfZone (ICDE 2011)
  1. The influence zone corresponds to the unpruned area when the bisectors of all the facilities have been considered for pruning.
  2. A point p is an RkNN of q if and only if p lies inside the unpruned area.
  3. No verification phase.

Half-space pruning is expensive, especially when k is large.

[Figure: query q and facilities a, b, c, d, with k = 2]

Page 7:

Related Work

Regions-based vs. Half-space

                     Regions-based    Half-space    SLICE
Pruning cost         O(m log k)       O(km^2)       O(m log m)
Pruning power        Low              High          High
Verification cost    Range query      O(log m)      O(k)

m is the number of facilities considered for pruning.

Can regions-based pruning do better?

Page 8:

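Notations

• Partition: P
• Subtended angle: ∠a
• Maximum (minimum) subtended angle w.r.t. P: θmax (θmin)
• Upper (lower) arc
  – Center: q
  – Radius: $\frac{dist(f,q)}{2\cos\theta_{max}}$ for the upper arc, $\frac{dist(f,q)}{2\cos\theta_{min}}$ for the lower arc

[Figure: query q, facility f, point p, partition P with angles θmin and θmax and the upper and lower arcs]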
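The notation can be made concrete with a small sketch that computes θmin, θmax and the two arc radii for one facility and one partition. It assumes a partition is given by its start and end directions in radians, measured counterclockwise around q, with a width below 180°. The helper names are hypothetical.

```python
import math

TWO_PI = 2 * math.pi

def angular_distance(a, b):
    """Smallest angle between directions a and b (radians)."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def subtended_angle_range(q, f, start, end):
    """Return (theta_min, theta_max) between f and the partition start..end."""
    phi = math.atan2(f[1] - q[1], f[0] - q[0])
    width = end - start
    d_start, d_end = angular_distance(phi, start), angular_distance(phi, end)
    # theta_min is 0 when the direction of f falls inside the partition.
    theta_min = 0.0 if (phi - start) % TWO_PI <= width else min(d_start, d_end)
    # theta_max is 180 degrees when the direction opposite to f falls inside.
    opposite_inside = (phi + math.pi - start) % TWO_PI <= width
    theta_max = math.pi if opposite_inside else max(d_start, d_end)
    return theta_min, theta_max

def arc_radius(q, f, theta):
    """dist(f, q) / (2 cos(theta)), or None when theta is 90 degrees or more."""
    if theta >= math.pi / 2:
        return None
    return math.dist(f, q) / (2 * math.cos(theta))
```

For a facility at distance 2 on the positive x-axis and the partition from 0° to 60°, this gives θmin = 0 and θmax = 60°, so the lower arc has radius 1 and the upper arc has radius 2 (up to floating-point rounding).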

Page 9:

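Observation -- Pruning

• A facility f prunes every point p ∈ P for which dist(p,q) > $\frac{dist(f,q)}{2\cos\theta_{max}}$ (the radius of the upper arc), provided θmax < 90°.
• We can prove a < b, where a = dist(p,f), b = dist(p,q), c = dist(f,q) and θ ≤ θmax is the angle at q between p and f.
  – $a^2 = b^2 + c^2 - 2bc\cos\theta$
  – $b > \frac{dist(f,q)}{2\cos\theta_{max}} = \frac{c}{2\cos\theta_{max}}$
  – $c^2 - 2bc\cos\theta < c^2 - 2\cdot\frac{c}{2\cos\theta_{max}}\cdot c\cos\theta = c^2\left(1 - \frac{\cos\theta}{\cos\theta_{max}}\right) \le 0$, hence $a^2 < b^2$
• A facility f prunes the area outside its upper arc for every partition P for which θmax < 90°.

[Figure: triangle with vertices q, f, p, sides a, b, c and angle θ at q, together with θmax and the upper arc of partition P]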
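The observation can be sanity-checked numerically. The sketch below samples angles θ ≤ θmax and distances b just beyond the upper-arc radius, then confirms a < b through the law of cosines. It is a spot check under assumed sample values, not a substitute for the proof, and the function name is hypothetical.

```python
import math

def check_upper_arc_pruning(c, theta_max, steps=50):
    """Check a < b for sampled points outside the upper arc.

    c is dist(f, q); theta_max must be below 90 degrees (in radians).
    """
    radius = c / (2 * math.cos(theta_max))
    for i in range(steps + 1):
        theta = theta_max * i / steps          # 0 <= theta <= theta_max
        for scale in (1.001, 1.5, 10.0):       # b = dist(p, q) beyond the arc
            b = radius * scale
            # Law of cosines: a^2 = b^2 + c^2 - 2bc cos(theta).
            a = math.sqrt(b * b + c * c - 2 * b * c * math.cos(theta))
            if not a < b:
                return False
    return True
```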

Page 10:

Comparison with Six-regions

                                Six-regions    SLICE
Partitions pruned               One            Any with θmax < 90°
No. of partitions               6              Any
Area pruned (outside radius)    dist(f,q)      $\frac{dist(f,q)}{2\cos\theta}$

[Figure: query q and facility f, Six-regions vs. SLICE]

Page 11:

Pruning Algorithm

• Divide the space into t partitions.
• Compute the upper arc of each facility for each partition.
• The area outside the k-th smallest upper arc (radius rB) in each partition can be pruned.
• Users in the pruned area can be pruned.
• Users in the unpruned area are verified by accessing significant facilities.

[Figure: query q, facilities f1, f2 and users u1, u2, with k = 2]
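The pruning steps above can be sketched as follows. This is a simplified in-memory version that assumes t ≥ 3 equal angular partitions, points as `(x, y)` tuples, and a scan over all facilities. The talk's algorithm works on indexed data and limits which facilities it accesses, which is not reproduced here. All function names are hypothetical, and the code needs Python 3.8 or later.

```python
import heapq
import math

TWO_PI = 2 * math.pi

def angle_of(q, p):
    return math.atan2(p[1] - q[1], p[0] - q[0]) % TWO_PI

def angular_distance(a, b):
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def upper_arc_radius(q, f, start, end):
    """Upper-arc radius of f for partition start..end, or None if theta_max >= 90 degrees."""
    phi = angle_of(q, f)
    theta_max = max(angular_distance(phi, start), angular_distance(phi, end))
    if theta_max >= math.pi / 2:
        return None
    return math.dist(f, q) / (2 * math.cos(theta_max))

def bounding_radii(q, facilities, k, t):
    """k-th smallest upper-arc radius (rB) of every partition."""
    width = TWO_PI / t
    radii = []
    for i in range(t):
        start, end = i * width, (i + 1) * width
        arcs = [r for f in facilities
                if (r := upper_arc_radius(q, f, start, end)) is not None]
        radii.append(heapq.nsmallest(k, arcs)[-1] if len(arcs) >= k else math.inf)
    return radii

def candidate_users(q, users, facilities, k, t=12):
    """Users that are not pruned and therefore still need verification."""
    radii = bounding_radii(q, facilities, k, t)
    width = TWO_PI / t
    keep = []
    for u in users:
        i = min(int(angle_of(q, u) / width), t - 1)
        if math.dist(u, q) <= radii[i]:
            keep.append(u)
    return keep
```

The default of 12 partitions mirrors the setting reported in the experiments. The returned users are candidates only and must still be verified.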

Page 12:

Significant Facility Verification

• Significant facility:
  – A facility f that prunes at least one point p ∈ P lying inside the bounding arc of P.
  – A significant facility cannot be in the red area.

[Figure: query q, partition P with its bounding arc of radius rB, points M and N, and the red area]

• Verification for a candidate

                 Regions-based                              SLICE
Method           Issuing a range query for each candidate   Accessing significant facilities (O(k))
I/O cost         High I/O cost                              No additional I/O cost

Page 13:

Theoretical Analyses

• Number of significant facilities
  – 2.34k (θ → 0)
  – 9k (θ = 60°)

• I/O cost
  – Pruning phase: same as a circular range query centered at q with radius 2rB
  – Verification phase: same as a circular range query centered at q with radius rB

• More analyses can be found in the paper.
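A short triangle-inequality argument suggests why facilities farther than $2r_B$ from $q$ do not matter for points inside the bounding arc. It is given as a sketch, not as the paper's proof. Here $r_B$ is the bounding-arc radius of a partition, $p$ is any point with $dist(p,q) \le r_B$, and $f$ is a facility with $dist(f,q) > 2r_B$.

```latex
\begin{align*}
dist(p,q) &\le r_B < \tfrac{1}{2}\,dist(f,q) \\
dist(p,f) &\ge dist(f,q) - dist(p,q) > 2\,dist(p,q) - dist(p,q) = dist(p,q)
\end{align*}
```

So $p$ is closer to $q$ than to $f$, and such a facility cannot prune any point inside the bounding arc.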

Page 14:

Experiments

• Data sets
  – Synthetic data
    · Size: 50,000, 100,000, 150,000 or 200,000
    · Distribution: uniform or normal
  – Real data: 175,812 points in North America

• Algorithms: Six-regions, InfZone and SLICE
  – Page size is 4KB and the number of buffers for Six-regions is 10
  – Number of partitions for SLICE is 12

Page 15:

Experiments

• Effect of different values of k

[Figures: I/O cost and CPU cost]

Page 16:

Experiments

• Effect of data distribution
• Effect of % users

Page 17:

Experiments

• Effect of partitions [Figure: x-axis is the number of partitions]
• Number of significant facilities [Figure: x-axis is the value of k]

Page 18:

Thanks!

Q&A