Monitoring k-NN Queries over Moving Objects

Xiaohui Yu

University of Toronto

xhyu@cs.toronto.edu

Joint work with Ken Pu and Nick Koudas

k-Nearest Neighbors

• k-NN search: Given a set of points, find the k points that are closest to the query point.

• We focus on k-NN for spatio-temporal data
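As a baseline for the definition above, here is a minimal brute-force sketch of k-NN search in Python. The function name `knn_brute_force` and the dictionary layout (object id mapped to an `(x, y)` tuple) are illustrative assumptions, not part of the talk.

```python
import heapq
import math

def knn_brute_force(points, query, k):
    """Return the ids of the k points closest to `query` (Euclidean distance).

    `points` maps an object id to its (x, y) position (assumed layout).
    """
    qx, qy = query
    return heapq.nsmallest(
        k,
        points,
        key=lambda oid: math.hypot(points[oid][0] - qx, points[oid][1] - qy),
    )

objects = {1: (0.1, 0.1), 2: (0.5, 0.5), 3: (0.9, 0.2), 4: (0.45, 0.6)}
print(knn_brute_force(objects, (0.5, 0.55), 2))  # → [2, 4]
```

This scans every object for every query, which is the cost that the index structures in the rest of the talk aim to avoid.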

Problem

• A set of moving objects P on a 2D plane

• Monitoring the nearest neighbors of query points (Q) in a specified region over time

[Figure: two snapshots, at times t1 and t2, showing moving objects 1 to 8 around the query point q]

Applications

• Location-based advertising

– E-flyers distribution – identifying the customers closest to the store

• Location-based mixed reality games

Outline

• Related work

• Object-Indexing

– Overhaul algorithm

– Incremental index update/query answering

• Query-Indexing

• Hierarchical Object-Indexing

• Experiments

• Conclusions

Previous Work

• Most work has focused on predictive queries

– “Who will be my NNs five minutes from now?”

• Assumption: the trajectories of the objects are fully predictable

– linear/non-linear/autoregressive functions

– very frequent updates/re-evaluations when the assumption does not hold [Sun et al. 2004]

Previous Work

• The assumption is often violated in real applications, where the objects’ movements are unpredictable.

• No assumptions on the motion of objects: arbitrary speeds/directions

Our approach

[Figure: system overview with a receive buffer for incoming updates (x_i, y_i), a snapshot buffer holding the positions (x_i^t, y_i^t) at time t, and one k-NN list per query (q_1, q_2, ..., q_j) forming the result]

Grid-based index structures

• Residing in main memory

• Partition the space into an N×N grid of cells

• Easy to maintain, supporting fast query processing
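A minimal sketch of such a main-memory grid index, assuming the space is the unit square and the index is a dictionary from cell coordinates to sets of object ids. The names `cell_of` and `build_grid_index` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def cell_of(x, y, n):
    """Map a point in the unit square to its (column, row) cell on an n x n grid."""
    # min() clamps points lying exactly on the upper boundary into the last cell.
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def build_grid_index(positions, n):
    """Build an n x n object index: cell -> set of object ids."""
    grid = defaultdict(set)
    for oid, (x, y) in positions.items():
        grid[cell_of(x, y, n)].add(oid)
    return grid

positions = {1: (0.05, 0.05), 2: (0.55, 0.52), 3: (0.58, 0.51)}
grid = build_grid_index(positions, n=10)
print(grid[(5, 5)])  # objects 2 and 3 share this cell
```

Locating the cell of a point takes constant time, which is what makes the structure cheap to maintain as objects move.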

Object Indexing

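[Figure: each grid cell points to an object-id list (oid1, oid2, oid3, ...); object positions (x, y) are stored in the snapshot buffer; each query q keeps a k-NN list (nn1, nn2, ...)]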

Example: Finding 3-NN

[Figure: objects 1 to 6 on the grid around query q, with the search distance d]

• Current result: 3

• Current result: 3, 2

• Current result: 3, 2, 1

• Final result: 3, 2, 4

The algorithm

1. Initial computation

– Progressively enlarge the search region

– Until k neighbors are found

2. Calculate the critical region Rcrit (guaranteed to contain the query’s k-NNs)

3. Search in the region for the k-NNs.
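The three steps can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes a unit-square space, a dictionary-based grid, a search region that grows as a square of cells around the query's cell, and a critical region taken as the cells overlapping the square of half-side d around q, where d is the distance to the k-th closest object found in step 1. All function names are hypothetical.

```python
import heapq
import math
from collections import defaultdict

def cell_of(x, y, n):
    """Map a point in the unit square to its (column, row) cell on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def build_grid_index(positions, n):
    grid = defaultdict(set)
    for oid, (x, y) in positions.items():
        grid[cell_of(x, y, n)].add(oid)
    return grid

def objects_in(grid, c0, r0, c1, r1):
    """Yield the ids stored in the cell rectangle [c0..c1] x [r0..r1]."""
    for c in range(c0, c1 + 1):
        for r in range(r0, r1 + 1):
            yield from grid.get((c, r), ())

def knn_overhaul(grid, positions, query, k, n):
    def dist(oid):
        return math.dist(positions[oid], query)

    qc, qr = cell_of(*query, n)
    # Step 1: enlarge a square of cells around the query's cell until it holds k objects.
    level = 0
    while True:
        found = list(objects_in(grid, max(qc - level, 0), max(qr - level, 0),
                                min(qc + level, n - 1), min(qr + level, n - 1)))
        if len(found) >= k or level >= n:
            break
        level += 1
    if len(found) < k:  # fewer than k objects exist in total
        return sorted(found, key=dist)
    # Step 2: the critical region is bounded by d, the distance to the k-th
    # closest object found so far; no true k-NN can lie farther than d from q.
    d = dist(heapq.nsmallest(k, found, key=dist)[-1])
    qx, qy = query
    c0, r0 = cell_of(max(qx - d, 0.0), max(qy - d, 0.0), n)
    c1, r1 = cell_of(min(qx + d, 1.0), min(qy + d, 1.0), n)
    # Step 3: search every cell of the critical region for the final k-NN.
    return heapq.nsmallest(k, objects_in(grid, c0, r0, c1, r1), key=dist)

positions = {1: (0.1, 0.1), 2: (0.5, 0.5), 3: (0.9, 0.2), 4: (0.45, 0.6)}
print(knn_overhaul(build_grid_index(positions, 4), positions, (0.5, 0.55), 2, 4))
```

Step 1 alone is not sufficient, because an object in a cell just outside the enlarged square can be closer than one found inside it; steps 2 and 3 close that gap.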

The overhaul algorithm

Overhaul algorithm: Analysis

• Notation

– NP: number of objects

– NQ: number of queries

• Running time breakdown

– Tindex: a0 × NP

– Tquery: a1 × NQ × (# of cells in Rcrit) + a2 × NQ × (# of objects in Rcrit)

• Optimal cell size to minimize Tquery (assuming uniformity):

– Proportional to 1/√NP

Analysis – non-uniform data

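• Reasonably skewed distributions: Tquery [expression not recoverable from the transcript]

• Highly skewed: Tquery [expression not recoverable from the transcript]

• Tquery is given as a function of NP and NQ with constant coefficients b0, b1, b2 [exact form not recoverable from the transcript]

• Measure of non-uniformity, defined from two distances:

– l_crit: the distance from the query to the farthest neighbor in the initial step

– l_kNN: the average distance from the query to the k-th NN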

From overhaul to incremental…

• Incremental update of the object-index

– Check if the new position falls in the same cell as in the previous cycle

• Yes: do nothing

• No: remove it from the old cell, insert it into the new cell

• Incremental query answering

– Compute the critical region based on the query's previous k-NN

– Search the critical region for the current k-NN
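A minimal sketch of both incremental steps, under stated assumptions: the space is the unit square, the grid is a dictionary from cells to sets of ids, and the critical region is bounded by the distance from the query to the farthest of its previous k-NN at their new positions (those k objects still exist, so the current k-NN cannot be farther away). The function names are hypothetical.

```python
import heapq
import math

def cell_of(x, y, n):
    """Map a point in the unit square to its (column, row) cell on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def update_object(grid, positions, oid, new_pos, n):
    """Record the new position; touch the index only if the cell changed.

    Returns True when the object moved to a different cell.
    """
    old_cell = cell_of(*positions[oid], n)
    new_cell = cell_of(*new_pos, n)
    positions[oid] = new_pos
    if new_cell == old_cell:
        return False  # same cell as in the previous cycle: do nothing
    grid[old_cell].discard(oid)
    grid.setdefault(new_cell, set()).add(oid)
    return True

def incremental_knn(grid, positions, query, previous_knn, n):
    """Recompute the k-NN of `query`, using its previous k-NN to bound the search."""
    k = len(previous_knn)
    qx, qy = query
    # Radius of the critical region: farthest previous neighbor at its new position.
    d = max(math.dist(positions[oid], query) for oid in previous_knn)
    c0, r0 = cell_of(max(qx - d, 0.0), max(qy - d, 0.0), n)
    c1, r1 = cell_of(min(qx + d, 1.0), min(qy + d, 1.0), n)
    candidates = (oid
                  for c in range(c0, c1 + 1)
                  for r in range(r0, r1 + 1)
                  for oid in grid.get((c, r), ()))
    return heapq.nsmallest(k, candidates, key=lambda oid: math.dist(positions[oid], query))
```

When objects move little between cycles, the previous neighbors stay close to the query and the critical region stays small; when they move a lot, the radius d can grow until the region covers most of the grid.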

Which one is better?

• Mobility is the key

• Index maintenance

– The probability of exiting the current cell is crucial

• Incremental query answering

– When mobility is low, the cost of query answering is [expression in NP garbled in the transcript]

– Worst case: O(NP)

Query-Indexing

• The cost of Object-Indexing is dominated by indexing time when NQ is small

• Indexing queries instead of objects

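[Figure: Query-Index. Grid cells covered by the critical regions Rcrit(q1) and Rcrit(q2) point to a query list (q1, q2); each query keeps its k-NN list; object positions (x, y) are read from the snapshot buffer]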

Constructing a Query Index

[Figure: the critical region Rcrit(q) of query q covers cell1, cell2, cell3, cell4, ...; q is added to the query list of each of these cells]

Query answering

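[Figure: object positions (x, y) from the snapshot buffer are matched against the query lists of the cells covered by Rcrit(q1) and Rcrit(q2) to update the k-NN lists of q1 and q2]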

Query-Indexing: algorithm

• Index-building

– Compute the critical region

– Insert a query into cells contained in its critical region

• Query-answering

– For each object

• Determine the cell it belongs to

• For each query registered with the cell, update its k-NN if necessary.
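The two phases above can be sketched as follows. Assumptions in this sketch: the space is the unit square, each query's critical region is given as a radius in the hypothetical `radii` argument (for example, derived from its previous k-NN), and a query is registered with every cell that overlaps the square of that half-side around it. The function names are illustrative only.

```python
import heapq
import math
from collections import defaultdict

def cell_of(x, y, n):
    """Map a point in the unit square to its (column, row) cell on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def build_query_index(queries, radii, n):
    """Index-building: cell -> set of query ids whose critical region overlaps it."""
    index = defaultdict(set)
    for qid, (qx, qy) in queries.items():
        d = radii[qid]
        c0, r0 = cell_of(max(qx - d, 0.0), max(qy - d, 0.0), n)
        c1, r1 = cell_of(min(qx + d, 1.0), min(qy + d, 1.0), n)
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                index[(c, r)].add(qid)
    return index

def answer_queries(index, queries, positions, k, n):
    """Query-answering: one pass over the objects, updating registered queries only."""
    heaps = {qid: [] for qid in queries}  # max-heaps of (-distance, object id)
    for oid, pos in positions.items():
        for qid in index.get(cell_of(*pos, n), ()):
            d = math.dist(pos, queries[qid])
            heap = heaps[qid]
            if len(heap) < k:
                heapq.heappush(heap, (-d, oid))
            elif d < -heap[0][0]:
                # Closer than the current k-th neighbor: replace it.
                heapq.heapreplace(heap, (-d, oid))
    # Report each k-NN list from nearest to farthest.
    return {qid: [oid for _, oid in sorted(h, reverse=True)] for qid, h in heaps.items()}
```

Note that the objects are not indexed at all here: every object is visited once per cycle, and the work per object depends on how many critical regions cover its cell.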

Analysis

• Indexing time + query answering time

• Theoretically, Tquery (query index) vs. Tquery (object index) [relation symbol lost in the transcript]

– QI suffers from less localized access to objects

[Figure: Query-Index with the critical regions Rcrit(q1) and Rcrit(q2), the query list (q1, q2), the k-NN lists, and the snapshot buffer]

• QI is preferable when NQ is small

Problem with one-level object indexing

Hierarchical Object-Indexing

• Split the overly crowded cells into smaller cells
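One way to picture the split is the two-level sketch below. The talk does not give the splitting rule, so the threshold `capacity` and the fan-out `m` (each crowded cell becomes an m x m block of sub-cells) are assumptions made for illustration, as are the function names and the unit-square space.

```python
from collections import defaultdict

def cell_of(x, y, n):
    """Map a point in the unit square to its (column, row) cell on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def build_two_level_index(positions, n, capacity, m=2):
    """Return cell -> set of ids, or cell -> {sub-cell -> set of ids} if crowded."""
    grid = defaultdict(set)
    for oid, (x, y) in positions.items():
        grid[cell_of(x, y, n)].add(oid)

    index = {}
    for (col, row), oids in grid.items():
        if len(oids) <= capacity:
            index[(col, row)] = oids  # sparse cell: keep a flat id set
            continue
        sub = defaultdict(set)
        for oid in oids:
            x, y = positions[oid]
            # Local coordinates of the object inside its cell, each in [0, 1].
            lx, ly = x * n - col, y * n - row
            sub[(min(int(lx * m), m - 1), min(int(ly * m), m - 1))].add(oid)
        index[(col, row)] = dict(sub)  # crowded cell: second-level grid
    return index

positions = {1: (0.1, 0.1), 2: (0.4, 0.1), 3: (0.1, 0.4),
             4: (0.4, 0.4), 5: (0.2, 0.2), 6: (0.9, 0.9)}
index = build_two_level_index(positions, n=2, capacity=3)
print(index[(0, 0)])  # crowded cell, split into sub-cells
print(index[(1, 1)])  # sparse cell, left as a flat set
```

With smaller sub-cells in dense areas, a search can skip most of a crowded cell instead of scanning all of its objects.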

Refined approximation

[Figure: query q on the hierarchical grid]

Experiments

• Verify the analytical results

– NP, NQ, k, velocity, cell-size

• Compare the performance of the proposed structures with that of R-tree-based methods

– NP, NQ, k, velocity, skew

Highlights

1. Tindex and Tquery of Object-Indexing

2. Optimal cell size

3. Overhaul vs. incremental computation as velocities of objects vary

4. Object-Indexing vs. Query-Indexing

5. Comparison of grid-based algorithms with R-tree-based algorithms

Performance of overhaul w.r.t. NP

Effect of cell-size on performance

Overhaul vs. incremental index maintenance

Object-Indexing vs. Query-Indexing

Comparison with R-trees – datasets

Simulation using the Illinois road network (600 km × 600 km)

[Figure: three datasets: uniform, skewed, highly skewed]

Comparison with R-trees

NP = 100,000, NQ = 5,000, k = 10

Conclusions

• We proposed two solutions to monitor k-NN

– Object-Indexing

– Query-Indexing

• Extensions to handle skewed data

• Outperform R-tree-based solutions
