Monitoring k-NN Queries over Moving Objects
Xiaohui Yu
University of Toronto
Joint work with Ken Pu and Nick Koudas
k-Nearest Neighbors
• k-NN search: Given a set of points, find the k points that are closest to the query point.
• We focus on k-NN for spatio-temporal data
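As a baseline for the definition above, a minimal brute-force sketch is shown here. The function name `knn_brute_force`, the dictionary layout, and the sample coordinates are illustrative assumptions, not part of the talk.

```python
import heapq
import math

def knn_brute_force(points, q, k):
    """Return the ids of the k points closest to q (ties broken by id)."""
    # points: dict mapping object id -> (x, y); q: (x, y) query point.
    return heapq.nsmallest(
        k, points, key=lambda oid: (math.dist(points[oid], q), oid)
    )

# Hypothetical sample data.
points = {1: (1.0, 1.0), 2: (2.0, 2.0), 3: (0.5, 0.0), 4: (5.0, 5.0)}
print(knn_brute_force(points, (0.0, 0.0), 2))
```

This scans every object for every query, which is the cost the index structures in the rest of the talk aim to avoid.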
Problem
• A set of moving objects P on a 2D plane
• Monitoring the nearest neighbors of query points (Q) in a specified region over time
[Figure: query point q and moving objects 1-8 at two time instants, t1 and t2]
Applications
• Location-based advertising
– E-flyer distribution: identifying the customers closest to the store
• Location-based mixed reality games
Outline
• Related work
• Object-Indexing
– Overhaul algorithm
– Incremental index update/query answering
• Query-Indexing
• Hierarchical Object-Indexing
• Experiments
• Conclusions
Previous Work
• Most work has focused on predictive queries
– “Who will be my NNs five minutes from now?”
• Assumption: the trajectories of the objects are fully predictable
– linear/non-linear/autoregressive functions
– very frequent updates/re-evaluations when the assumption does not hold [Sun et al. 2004]
Previous Work
• The assumption is often violated in real applications where the objects’ movements are non-predictable.
• No assumptions on the motion of objects: arbitrary speeds/directions
[Figure: system overview. Labels: updates (x_i, y_i); receive buffer; snapshot buffer (x_i^t, y_i^t) at time t; result: kNN lists for queries q_1, q_2, ..., q_j]
Our approach
Grid-based index structures
• Residing in main memory
• Partition the space into an N×N grid of cells
• Easy to maintain, supporting fast query processing
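A minimal sketch of such a main-memory grid is given below. It assumes the space is normalized to the unit square and that each cell holds a set of object ids; `cell_of` and `build_object_index` are hypothetical names chosen for illustration.

```python
from collections import defaultdict

def cell_of(x, y, n):
    """Map a point in the unit square [0, 1] x [0, 1] to its (col, row) cell."""
    # Clamp so that points on the upper boundary fall in the last cell.
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def build_object_index(positions, n):
    """Build a mapping cell -> set of object ids from a snapshot of positions."""
    index = defaultdict(set)
    for oid, (x, y) in positions.items():
        index[cell_of(x, y, n)].add(oid)
    return index

# Hypothetical snapshot: object id -> (x, y).
snapshot = {1: (0.10, 0.10), 2: (0.15, 0.12), 3: (0.80, 0.55)}
index = build_object_index(snapshot, n=4)
print(dict(index))
```

Locating the cell of a point is a constant-time arithmetic operation, which is what makes the structure cheap to maintain.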
Object Indexing
[Figure: Object-Index. Labels: object-id list (oid_1, oid_2, oid_3); snapshot buffer holding (x,y); kNN list (nn_1, nn_2, ...) for query q]
Example: Finding 3-NN
[Figure: grid with query q, objects 1-6, and distance d]
• Current result: 3
• Current result: 3,2
• Current result: 3,2,1
• Final result: 3,2,4
The algorithm
1. Initial computation
– Progressively enlarge the search region
– Until k neighbors are found
2. Calculate the critical region Rcrit (guaranteed to contain the query’s k-NNs)
3. Search in the region for the k-NNs.
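The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the space is the unit square, the search region grows as a square of cells around the query's cell, and Rcrit is taken to be the set of cells overlapping the square of half-width l_crit centred at q. All function names are hypothetical.

```python
import math
from collections import defaultdict

def cell_of(x, y, n):
    """Cell (col, row) of a point in the unit square on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def cells_in_square(q, half_width, n):
    """All cells overlapping the square of the given half-width centred at q."""
    c_lo, r_lo = cell_of(max(q[0] - half_width, 0.0), max(q[1] - half_width, 0.0), n)
    c_hi, r_hi = cell_of(min(q[0] + half_width, 1.0), min(q[1] + half_width, 1.0), n)
    return [(c, r) for c in range(c_lo, c_hi + 1) for r in range(r_lo, r_hi + 1)]

def knn_overhaul(index, positions, q, k, n):
    qc, qr = cell_of(q[0], q[1], n)
    # Step 1: enlarge a square of cells around q's cell until it holds k objects.
    found, ring = [], 0
    while len(found) < k and ring < n:
        found = [oid
                 for c in range(max(0, qc - ring), min(n, qc + ring + 1))
                 for r in range(max(0, qr - ring), min(n, qr + ring + 1))
                 for oid in index.get((c, r), ())]
        ring += 1
    if not found:
        return []
    # Step 2: l_crit is the k-th smallest distance among the objects found,
    # so at least k objects lie within l_crit of q.
    dists = sorted(math.dist(positions[oid], q) for oid in found)
    l_crit = dists[min(k, len(dists)) - 1]
    # Step 3: search every cell of the critical region for the true k-NNs.
    candidates = [oid for cell in cells_in_square(q, l_crit, n)
                  for oid in index.get(cell, ())]
    return sorted(candidates, key=lambda oid: (math.dist(positions[oid], q), oid))[:k]

# Hypothetical data: object 3 is nearest to q but lies in a neighbouring cell.
n = 4
positions = {1: (0.26, 0.30), 2: (0.27, 0.45), 3: (0.51, 0.30), 4: (0.90, 0.90)}
index = defaultdict(set)
for oid, (x, y) in positions.items():
    index[cell_of(x, y, n)].add(oid)
print(knn_overhaul(index, positions, (0.49, 0.30), k=2, n=n))
```

In this example the initial step finds objects 1 and 2 in the query's own cell; only the search of the critical region brings in object 3, which mirrors how the result changes from 3,2,1 to 3,2,4 in the 3-NN example.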
The overhaul algorithm
Overhaul algorithm: Analysis
• Notation
– NP: number of objects
– NQ: number of queries
• Running time breakdown
– Tindex: a0×NP
– Tquery: a1×NQ×(# of cells in Rcrit) + a2×NQ×(# of objects in Rcrit)
• Optimal cell size to minimize Tquery (assuming uniformity):
– Proportional to 1/√NP
Analysis – non-uniform data
• Reasonably skewed distributions: Tquery
• Highly skewed: Tquery
• Two lcrit quantities are used: the distance from the query to the farthest neighbor in the initial step, and the average distance from the query to the k-th NN
[Equations garbled in the transcript: expressions for Tquery in terms of NP and NQ with constants b0, b1, b2, and a measure of non-uniformity]
From overhaul to incremental…
• Incremental update of the object-index
– Check if the new position falls in the same cell as in the previous cycle
• Yes – do nothing
• No – remove it from the old cell, insert it into the new cell
• Incremental query answering
– Compute the critical region based on the query's previous k-NN
– Search the critical region for the current k-NN
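A minimal sketch of both incremental steps follows. It assumes a unit-square grid, and it assumes that the critical region is the set of cells overlapping the square whose half-width is the largest current distance from q to any member of its previous k-NN list. The names `update_index` and `knn_incremental` are hypothetical.

```python
import math
from collections import defaultdict

def cell_of(x, y, n):
    """Cell (col, row) of a point in the unit square on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def update_index(index, oid, old_pos, new_pos, n):
    """Touch the index only when the object has crossed a cell boundary."""
    old_cell, new_cell = cell_of(*old_pos, n), cell_of(*new_pos, n)
    if old_cell == new_cell:
        return False                      # same cell: do nothing
    index[old_cell].discard(oid)          # remove from the old cell
    index[new_cell].add(oid)              # insert into the new cell
    return True

def knn_incremental(index, positions, q, prev_knn, n):
    """Recompute the k-NN of q, starting from its previous k-NN list."""
    k = len(prev_knn)
    # The previous neighbours are still k objects, so their farthest current
    # distance bounds the distance of the current k-th neighbour.
    l_crit = max(math.dist(positions[oid], q) for oid in prev_knn)
    c_lo, r_lo = cell_of(max(q[0] - l_crit, 0.0), max(q[1] - l_crit, 0.0), n)
    c_hi, r_hi = cell_of(min(q[0] + l_crit, 1.0), min(q[1] + l_crit, 1.0), n)
    candidates = [oid
                  for c in range(c_lo, c_hi + 1)
                  for r in range(r_lo, r_hi + 1)
                  for oid in index.get((c, r), ())]
    return sorted(candidates, key=lambda oid: (math.dist(positions[oid], q), oid))[:k]

# Hypothetical data: object 3 was a neighbour of q, then moves away.
n, q = 4, (0.49, 0.30)
positions = {1: (0.26, 0.30), 2: (0.27, 0.45), 3: (0.51, 0.30), 4: (0.90, 0.90)}
index = defaultdict(set)
for oid, (x, y) in positions.items():
    index[cell_of(x, y, n)].add(oid)

moved = update_index(index, 3, positions[3], (0.80, 0.30), n)
positions[3] = (0.80, 0.30)
print(moved, knn_incremental(index, positions, q, prev_knn=[3, 1], n=n))
```

The initial "progressively enlarge" step of the overhaul algorithm is skipped entirely here, because the previous result already supplies k candidate objects.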
Which one is better?
• Mobility is the key
• Index maintenance
– The probability of exiting the current cell is crucial
• Incremental query answering
– When mobility is low, the cost of query answering is [bound in NP; exact form not recoverable from the transcript]
– Worst case: O(NP)
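To make the role of the exit probability concrete, here is a small sketch under a simplifying assumption that is not from the talk: the object's position is uniformly distributed within its square cell of side `delta`, and it moves by a displacement (dx, dy) in one cycle. Under that assumption the object stays in its cell with probability (1 - |dx|/delta)(1 - |dy|/delta) when both displacements are smaller than the cell side, and never otherwise.

```python
def exit_probability(dx, dy, delta):
    """Probability that a uniformly placed object leaves its cell of side delta
    after moving by (dx, dy). Hypothetical helper for illustration only."""
    stay_x = max(0.0, 1.0 - abs(dx) / delta)
    stay_y = max(0.0, 1.0 - abs(dy) / delta)
    return 1.0 - stay_x * stay_y

# Slow objects rarely change cells, so incremental maintenance touches few entries.
print(exit_probability(0.05, 0.0, 0.25))
# A displacement of a full cell side always leaves the cell.
print(exit_probability(0.25, 0.0, 0.25))
```

When this probability is small, incremental maintenance does little work per cycle; when it approaches 1, almost every object is removed and reinserted, and rebuilding the index from scratch may be the cheaper option.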
Query-Indexing
• Cost of Object-Indexing dominated by indexing time when NQ is small
• Indexing queries instead of objects
[Figure: Query-Index. Labels: critical regions Rcrit(q1) and Rcrit(q2); queries q1, q2; Query List with the k-NN of q1 and q2; snapshot buffer holding (x,y)]
Constructing a Query Index
[Figure: constructing a Query Index. Labels: query q; critical region Rcrit(q); Query List; cell1, cell2, cell3, cell4, ...]
Query answering
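[Figure: query answering with the Query-Index. Labels: critical regions Rcrit(q1) and Rcrit(q2); queries q1, q2; Query List with the k-NN of q1 and q2; snapshot buffer holding (x,y)]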
Query-Indexing: algorithm
• Index-building
– Compute the critical region
– Insert a query into cells contained in its critical region
• Query-answering
– For each object
• Determine the cell it belongs to
• For each query registered with the cell, update its k-NN if necessary.
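The two phases above can be sketched as follows. The sketch assumes a unit-square grid and assumes that a critical half-width `l_crit` is already known for each query (for example, derived from its previous k-NN list); the function names and sample data are hypothetical.

```python
import heapq
import math
from collections import defaultdict

def cell_of(x, y, n):
    """Cell (col, row) of a point in the unit square on an n x n grid."""
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def build_query_index(queries, l_crit, n):
    """Register each query with every cell overlapping its critical region."""
    qindex = defaultdict(list)
    for qid, (qx, qy) in queries.items():
        half = l_crit[qid]
        c_lo, r_lo = cell_of(max(qx - half, 0.0), max(qy - half, 0.0), n)
        c_hi, r_hi = cell_of(min(qx + half, 1.0), min(qy + half, 1.0), n)
        for c in range(c_lo, c_hi + 1):
            for r in range(r_lo, r_hi + 1):
                qindex[(c, r)].append(qid)
    return qindex

def answer_queries(qindex, queries, positions, k, n):
    """Scan the objects once; each object only updates queries registered in its cell."""
    heaps = {qid: [] for qid in queries}      # max-heaps via negated distance
    for oid, pos in positions.items():
        for qid in qindex.get(cell_of(*pos, n), ()):
            d = math.dist(pos, queries[qid])
            heap = heaps[qid]
            if len(heap) < k:
                heapq.heappush(heap, (-d, oid))
            elif -d > heap[0][0]:             # closer than the current k-th NN
                heapq.heapreplace(heap, (-d, oid))
    # Order each result from nearest to farthest.
    return {qid: [oid for _, oid in sorted(heap, reverse=True)]
            for qid, heap in heaps.items()}

# Hypothetical data.
n = 4
queries = {"q1": (0.49, 0.30)}
l_crit = {"q1": 0.31}
positions = {1: (0.26, 0.30), 2: (0.27, 0.45), 3: (0.80, 0.30), 4: (0.90, 0.90)}
qindex = build_query_index(queries, l_crit, n)
print(answer_queries(qindex, queries, positions, k=2, n=n))
```

No object index is built at all: the per-cycle cost is one pass over the objects, which is why this layout is attractive when NQ is small.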
Analysis
• indexing time + query answering time
• Theoretically, Tquery (query index) is compared with Tquery (object index) [relation symbol lost in the transcript]
– QI suffers from less localized access to objects
[Figure: Query-Index structure with Rcrit(q1), Rcrit(q2), the Query List, and the snapshot buffer]
• QI is preferable when NQ is small
Problem with one-level object indexing
Hierarchical Object-Indexing
• Split the overly crowded cells into smaller cells
[Figure: refined approximation around query q]
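One way to picture the splitting step is a two-level grid, sketched below. The talk does not specify the split rule, so the threshold `max_per_cell`, the `fanout` of the sub-grid, and the single extra level are all assumptions made for illustration.

```python
from collections import defaultdict

def build_two_level_index(positions, n, max_per_cell=4, fanout=2):
    """Top-level n x n grid over the unit square; any cell holding more than
    max_per_cell objects is split into a fanout x fanout sub-grid."""
    top = defaultdict(list)
    for oid, (x, y) in positions.items():
        top[(min(int(x * n), n - 1), min(int(y * n), n - 1))].append(oid)

    index = {}
    for (c, r), oids in top.items():
        if len(oids) <= max_per_cell:
            index[(c, r)] = oids              # leaf cell: plain object-id list
            continue
        sub = defaultdict(list)               # crowded cell: split it
        for oid in oids:
            x, y = positions[oid]
            # Local coordinates inside the cell, scaled to [0, 1].
            fx, fy = x * n - c, y * n - r
            sub_cell = (min(int(fx * fanout), fanout - 1),
                        min(int(fy * fanout), fanout - 1))
            sub[sub_cell].append(oid)
        index[(c, r)] = dict(sub)             # sub-cell -> object-id list
    return index

# Hypothetical data: five objects crowd cell (0, 0) of a 2 x 2 grid.
positions = {1: (0.10, 0.10), 2: (0.20, 0.10), 3: (0.30, 0.40),
             4: (0.40, 0.30), 5: (0.10, 0.45), 6: (0.90, 0.90)}
print(build_two_level_index(positions, n=2))
```

A query whose critical region only partly overlaps a crowded cell can then skip the sub-cells it does not reach, instead of scanning every object in the cell.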
Experiments
• Verify the analytical results
– NP, NQ, k, velocity, cell-size
• Compare the performance of the proposed structures with that of R-tree-based methods
– NP, NQ, k, velocity, skew
Highlights
1. Tindex and Tquery of Object-Indexing
2. Optimal cell size
3. Overhaul vs. incremental computation as velocities of objects vary
4. Object-Indexing vs. Query-Indexing
5. Comparison of grid-based algorithms with R-tree-based algorithms
Performance of overhaul w.r.t. NP
Effect of cell-size on performance
Overhaul vs. incremental index maintenance
Object-Indexing vs. Query-Indexing
Comparison with R-trees – datasets
Simulation using Illinois road network (600km×600km)
[Figure panels: uniform, skewed, and hi-skewed datasets]
Comparison with R-trees
NP = 100,000, NQ = 5,000, k = 10
Conclusions
• We proposed two solutions to monitor k-NN
– Object-Indexing
– Query-Indexing
• Extensions to handle skewed data
• Outperform R-tree-based solutions