The Optimal-Location Query
Donghui Zhang, Northeastern University
Coauthors: Yang Du, Tian Xia
Motivation
• “What is the optimal location in Boston area to build a new McDonald’s store?”
• Optimality: maximize the number of customers who think the new store is closer to them.
Formal Definition
• Given a set S of sites, a set O of weighted objects, and a query range Q ,
• Find a location $l \in Q$ which maximizes
$\sum_{o \in O} o.weight$ s.t. $\forall s \in S,\ d(o, l) < d(o, s)$.
• We consider the L1 distance:
|x1 - x2|+|y1 - y2|
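The definition above can be sketched in Python as follows. This is only an illustration: it assumes an object counts when the new location is strictly closer than every existing site, and the coordinates are hypothetical (the slides give none), chosen so that the weights match the running example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def influence(loc: Point, sites: List[Point],
              objects: List[Tuple[Point, float]]) -> float:
    """Total weight of the objects that are closer to loc than to any site."""
    total = 0.0
    for pos, weight in objects:
        if all(l1(pos, loc) < l1(pos, s) for s in sites):
            total += weight
    return total


# Hypothetical layout: two sites and four weighted objects (weights 4, 3, 5, 6).
sites = [(0.0, 0.0), (10.0, 0.0)]
objects = [((1.0, 1.0), 4.0), ((9.0, 1.0), 3.0),
           ((4.0, 6.0), 5.0), ((6.0, 6.0), 6.0)]

print(influence((5.0, 5.0), sites, objects))   # attracts the objects of weight 5 and 6
print(influence((0.0, 10.0), sites, objects))  # attracts only the object of weight 5
```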
Example
[Figure: objects o1:4, o2:3, o3:5 and o4:6 (name:weight), existing sites s1 and s2, the query range Q, and two candidate locations l1 and l2.]
The “influence” of l1 is 5+6=11.
The influence of l2 is 5.
Content
• Problem Definition
• Straightforward Solution
• Problem Transformation
• The R-tree-based Solution
• The OL-tree
• The VOL-tree
• Performance
Using the RNN Algorithm…
[Figure: the example setting with the candidate location l1.]
The RNNs (reverse nearest neighbors) of l1 are O3 and O4.
Straightforward Solution
[Figure: the example setting with objects o1:4, o2:3, o3:5, o4:6 and sites s1, s2.]
Compute the influence for every location in Q.
Problematic: infinite number of candidates!
nn_buffer of an Object
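• A site built at any location within an object's nn_buffer would be closer to that object than its current nearest site.
• Under the L1 distance, the nn_buffer is a diamond.
[Figure: the nn_buffer of O4, shown with objects O1:4, O2:3, O3:5, O4:6 and sites S1, S2.]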
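As a rough sketch of the nn_buffer idea: the buffer of an object is the L1 ball centered at the object whose radius is the distance to its nearest existing site. The function names and coordinates below are hypothetical, and the strict inequality is an assumption about how ties are handled.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nn_buffer_radius(obj: Point, sites: List[Point]) -> float:
    """Distance from the object to its nearest existing site."""
    return min(l1(obj, s) for s in sites)


def in_nn_buffer(loc: Point, obj: Point, sites: List[Point]) -> bool:
    """True if a site built at loc would be strictly closer to obj."""
    return l1(loc, obj) < nn_buffer_radius(obj, sites)


def diamond_corners(obj: Point, sites: List[Point]) -> List[Point]:
    """The four corners of the diamond-shaped nn_buffer (left, top, right, bottom)."""
    r = nn_buffer_radius(obj, sites)
    x, y = obj
    return [(x - r, y), (x, y + r), (x + r, y), (x, y - r)]


# Hypothetical coordinates for two sites and one object.
sites = [(0.0, 0.0), (10.0, 0.0)]
obj = (6.0, 6.0)
print(nn_buffer_radius(obj, sites))
print(in_nn_buffer((5.0, 5.0), obj, sites))
print(diamond_corners(obj, sites))
```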
Problem Transformation
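• Find a location in Q where the total weight of the overlapping nn_buffers is maximum.
[Figure: the nn_buffers of O1:4, O2:3, O3:5 and O4:6, sites S1 and S2, and Q. Any location in the region of maximum overlap inside Q is an optimal location!]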
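The Rotated Coordinate System
• Rotate the coordinate system by 45°.
• All nn_buffers become axis-parallel squares.
• Focus on the rotated coordinate system.
[Figure: axes X, Y rotated by 45° into X', Y'; a point o with coordinates (x, y) has rotated coordinates (x', y').]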
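A small sketch of why the rotation helps, assuming the usual counterclockwise rotation of the axes by 45°: in the rotated system the L1 distance equals √2 times the larger of the two coordinate differences, so an L1 diamond of radius r becomes an axis-parallel square with half-side r/√2. The helper names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi)


def rotate45(p: Point) -> Point:
    """Coordinates of p after rotating the axes counterclockwise by 45 degrees."""
    x, y = p
    c = math.sqrt(0.5)  # cos(45 deg) = sin(45 deg)
    return ((x + y) * c, (y - x) * c)


def nn_buffer_square(obj: Point, radius: float) -> Rect:
    """Axis-parallel square that the diamond nn_buffer becomes after rotation."""
    cx, cy = rotate45(obj)
    half = radius / math.sqrt(2.0)
    return (cx - half, cy - half, cx + half, cy + half)


def inside(p: Point, r: Rect) -> bool:
    """Strict point-in-rectangle test."""
    return r[0] < p[0] < r[2] and r[1] < p[1] < r[3]


# Check the distance identity on two arbitrary points.
a, b = (1.0, 2.0), (4.0, -3.0)
ra, rb = rotate45(a), rotate45(b)
l1_dist = abs(a[0] - b[0]) + abs(a[1] - b[1])
linf_rotated = max(abs(ra[0] - rb[0]), abs(ra[1] - rb[1]))
print(l1_dist, math.sqrt(2.0) * linf_rotated)
```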
The R-tree-based Solution
• Store the objects in an R-tree.
• Retrieve the objects whose nn_buffers intersect Q.
• Plane sweep to find a region which has maximum overlap.
Two Contributions
1. Object retrieval:
  – Store point objects,
  – but retrieve nn_buffers in increasing order of lower X.
2. Plane sweep:
  – Straightforwardly: O(n²).
  – Our method: O(n log n).
Best-first Retrieval
• Keep a heap of index entries + objects.
• Sorted in increasing order of nn_buffer’s lower X.
• While heap is not empty, pop an entry.
• If an object is popped, send it to plane sweep.
• If an index entry is popped, push its children (those intersecting Q).
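The retrieval loop above might look roughly like this in Python. It is a sketch under a simplifying assumption: every index entry carries a box that encloses all nn_buffers beneath it, so the box's lower X is a valid lower bound for the heap key. The `Entry` class and the sample tree are hypothetical and stand in for the R-tree used in the talk.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import Iterator, List, Tuple

Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


@dataclass
class Entry:
    # Object: its nn_buffer. Index entry: a box enclosing all nn_buffers below it.
    bounds: Rect
    weight: float = 0.0
    children: List["Entry"] = field(default_factory=list)  # empty for objects


def best_first(root: Entry, q: Rect) -> Iterator[Entry]:
    """Yield objects whose nn_buffer intersects q, in increasing order of lower X."""
    tie = count()  # tie-breaker so the heap never compares Entry objects
    heap = []
    if intersects(root.bounds, q):
        heap.append((root.bounds[0], next(tie), root))
    while heap:
        _, _, entry = heapq.heappop(heap)
        if not entry.children:
            yield entry  # an object: hand it to the plane sweep
            continue
        for child in entry.children:
            if intersects(child.bounds, q):
                heapq.heappush(heap, (child.bounds[0], next(tie), child))


# Hypothetical two-level tree with four objects.
a = Entry((5, 0, 7, 2), 4)
b = Entry((1, 1, 3, 3), 3)
c = Entry((2, 5, 4, 7), 5)
d = Entry((20, 20, 22, 22), 6)
root = Entry((1, 0, 22, 22), children=[
    Entry((1, 0, 7, 3), children=[a, b]),
    Entry((2, 5, 22, 22), children=[c, d]),
])
order = [e.weight for e in best_first(root, (0, 0, 10, 10))]
print(order)
```

Because an index entry's key never exceeds the key of anything below it, objects leave the heap in nondecreasing order of lower X even though the tree is only expanded lazily.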
Naïve Plane Sweep
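[Figure: the nn_buffers of O1:4, O2:3, O3:5 and O4:6 in the X-Y plane, with Y-axis ticks at 2, 5, 8, 9 and 12.]
Y boundaries: -∞ 2 5 8 9 12 +∞
Overlap per Y interval: 0 5 12 7 3 0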
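Not Efficient! O(n²)
Before: Y boundaries -∞ 2 5 8 9 12 +∞; overlap per interval 0 5 12 7 3 0
Suppose next insertion: add 2 to the Y-range [2,11].
After: Y boundaries -∞ 2 5 8 9 11 12 +∞; overlap per interval 0 7 14 9 5 3 0
Every interval inside [2,11] must be updated, so one insertion can touch O(n) intervals.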
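The naive sweep status can be sketched as a sorted list of Y boundaries with one overlap value per interval. The class name is hypothetical, and the three initial Y-ranges are just one assignment of weights that reproduces the numbers on the slide; the slide itself does not state them.

```python
import bisect
from typing import List


class NaiveSweepStatus:
    """Flat list of Y intervals; each insertion may touch O(n) of them."""

    def __init__(self) -> None:
        self.bounds: List[float] = []      # sorted Y boundaries
        self.values: List[float] = [0.0]   # values[k] covers [bounds[k-1], bounds[k])

    def _split(self, y: float) -> int:
        """Make y a boundary (if it is not one already) and return its index."""
        i = bisect.bisect_left(self.bounds, y)
        if i == len(self.bounds) or self.bounds[i] != y:
            self.bounds.insert(i, y)
            self.values.insert(i + 1, self.values[i])  # new piece inherits the value
        return i

    def add(self, lo: float, hi: float, weight: float) -> None:
        """Add weight to every interval inside [lo, hi)."""
        i = self._split(lo)
        j = self._split(hi)
        for k in range(i + 1, j + 1):  # linear scan: the O(n) step
            self.values[k] += weight

    def max_overlap(self) -> float:
        return max(self.values)


status = NaiveSweepStatus()
status.add(2, 8, 5)    # hypothetical Y-range of weight 5
status.add(5, 9, 4)    # hypothetical Y-range of weight 4
status.add(5, 12, 3)   # hypothetical Y-range of weight 3
before = list(status.values)
status.add(2, 11, 2)   # the insertion discussed on the slide
print(before, status.bounds, status.values)
```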
The aSB-tree
Extended from the SB-tree [YW01]:
• keeps max overlap information at index entries.
• handles a query range Q.
[Figure: a two-level tree over the Y axis.]
Root: boundaries -∞ 5 9 +∞; values 0 0 0
Leaves: boundaries -∞ 2 5 8 9 12 +∞; values 0 5 12 7 3 0
Suppose next insertion: add 2 to the Y range [2,11].
• The middle leaf [5,9) is fully covered, so +2 is recorded once at its root entry: root values become 0 2 0.
• The first and last leaves are only partly covered, so +2 is pushed down into them.
After the insertion: root values 0 2 0; leaf boundaries -∞ 2 5 8 9 11 12 +∞; leaf values 0 7 12 7 5 3 0.
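The same idea (record a weight once at the highest node whose span is fully covered, and keep the best overlap per subtree) can be illustrated with an in-memory segment tree. This is a swapped-in stand-in, not the authors' disk-based aSB-tree, and it assumes all Y boundaries are known in advance; the class and method names are hypothetical.

```python
import bisect
from typing import List


class MaxAddTree:
    """Segment tree over elementary Y intervals: range add and range max in O(log n)."""

    def __init__(self, bounds: List[float]) -> None:
        self.bounds = sorted(set(bounds))
        self.n = max(1, len(self.bounds) - 1)   # number of elementary intervals
        self.add_here = [0.0] * (4 * self.n)    # weight covering a node's whole span
        self.best = [0.0] * (4 * self.n)        # max overlap within the node's span

    def add(self, lo: float, hi: float, w: float) -> None:
        """Add w to [lo, hi); lo and hi must be known boundaries."""
        i = bisect.bisect_left(self.bounds, lo)
        j = bisect.bisect_left(self.bounds, hi) - 1
        if i <= j:
            self._add(1, 0, self.n - 1, i, j, w)

    def _add(self, node: int, left: int, right: int, i: int, j: int, w: float) -> None:
        if i <= left and right <= j:            # fully covered: stop here
            self.add_here[node] += w
            self.best[node] += w
            return
        mid = (left + right) // 2
        if i <= mid:
            self._add(2 * node, left, mid, i, j, w)
        if j > mid:
            self._add(2 * node + 1, mid + 1, right, i, j, w)
        self.best[node] = self.add_here[node] + max(self.best[2 * node],
                                                    self.best[2 * node + 1])

    def max_in(self, lo: float, hi: float) -> float:
        """Max overlap restricted to [lo, hi), e.g. the Y extent of Q."""
        i = bisect.bisect_left(self.bounds, lo)
        j = bisect.bisect_left(self.bounds, hi) - 1
        return self._max(1, 0, self.n - 1, i, j)

    def _max(self, node: int, left: int, right: int, i: int, j: int) -> float:
        if i <= left and right <= j:
            return self.best[node]
        mid = (left + right) // 2
        res = float("-inf")
        if i <= mid:
            res = max(res, self._max(2 * node, left, mid, i, j))
        if j > mid:
            res = max(res, self._max(2 * node + 1, mid + 1, right, i, j))
        return self.add_here[node] + res


tree = MaxAddTree([2, 5, 8, 9, 11, 12])
for lo, hi, w in [(2, 8, 5), (5, 9, 4), (5, 12, 3)]:  # hypothetical Y-ranges
    tree.add(lo, hi, w)
before = tree.max_in(2, 12)
tree.add(2, 11, 2)                                     # the slide's insertion
print(before, tree.max_in(2, 12), tree.max_in(8, 12))
```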
The OL-tree
• Idea: partition the space, and keep the max-overlap region for each partition!
• Like a k-d-B-tree.
• Stores nn_buffers.
• An nn_buffer may have multiple copies.
[Figure: an nn_buffer overlapping four partitions, labeled 1 to 4.]
• Partition 1: add to fullcover.
• Partitions 2, 3, 4: recursively insert.
Stored Information
• Index entry has, besides range:
  – fullcover: total weight of nn_buffers fully covering the whole area;
  – localmax: among the nn_buffers inserted into the sub-tree, max overlap;
  – maxrange: the region where localmax occurred.
• Leaf entry:
  – A rectangle and its weight.
[Figure: an example OL-tree; each index entry is shown as (range, fullcover, localmax); sub-trees omitted.]
Root entry: (rroot, 0, 9)
Children of the root: (r1, 0, 4), (r2, 1, 4), (r3, 2, 7)
Children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2)
For r3:
• fullcover: 2 nn_buffers fully cover r3
• localmax: among those inserted, max overlap is 7
• maxrange: where localmax occurred
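The numbers in the example tree can be cross-checked with a few lines of Python. The relation used here (a parent's localmax is the largest fullcover + localmax among its children) is a reading that agrees with the slide's numbers rather than a rule stated in the talk; the `range` and `maxrange` fields are left out because their geometry is not given.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexEntry:
    name: str
    fullcover: float   # weight of nn_buffers covering the entry's whole range
    localmax: float    # max overlap among nn_buffers inserted into the sub-tree
    children: List["IndexEntry"] = field(default_factory=list)


def localmax_from_children(entry: IndexEntry) -> float:
    """Best overlap reachable through any child, seen from the parent."""
    return max(c.fullcover + c.localmax for c in entry.children)


r3 = IndexEntry("r3", 2, 7, [
    IndexEntry("r31", 4, 3),
    IndexEntry("r32", 2, 3),
    IndexEntry("r33", 1, 2),
])
root = IndexEntry("rroot", 0, 9, [
    IndexEntry("r1", 0, 4),
    IndexEntry("r2", 1, 4),
    r3,
])

print(localmax_from_children(r3), localmax_from_children(root))
```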
Query Processing
• Start with root, insert index entries into heap.
• Sorting key: upper bound of real max overlap in the sub-tree.
  – localmax + fullcovers of the entry and its ancestor entries.
  – Accurate if Q intersects with maxrange.
[Figure: the example OL-tree, with the path from the root through r3 to r33.]
For r33: real max overlap = 0+2+1+localmax = 0+2+1+2 = 5.
• Keep a running value: max overlap M.
• Pruning 1: Q intersects with maxrange (the bound is exact, so the sub-tree need not be examined).
• Pruning 2: upper bound of max overlap < M.
[Figure: the example OL-tree with a query range Q.]
• r2 is pruned since Q intersects r2.maxrange. M = 0+1+4 = 5.
• r1 is pruned since the upper bound of overlap = 4 < M.
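The two pruning rules can be traced on the example numbers with a short sketch. Geometry is replaced by two hypothetical sets of entry names (entries whose range intersects Q, and entries whose maxrange intersects Q), since the slides do not give coordinates; examining a leaf by plane sweep is left out.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import List, Set


@dataclass
class IndexEntry:
    name: str
    fullcover: float
    localmax: float
    children: List["IndexEntry"] = field(default_factory=list)


def ol_query(root: IndexEntry, q_ranges: Set[str], q_maxranges: Set[str]) -> float:
    """Best-first search on the upper bound; returns the running max overlap M."""
    tie = count()
    best = 0.0  # M
    # Heap items: (-upper_bound, tie, fullcover summed along the path, entry)
    heap = [(-(root.fullcover + root.localmax), next(tie), root.fullcover, root)]
    while heap:
        neg_bound, _, covered, entry = heapq.heappop(heap)
        bound = -neg_bound
        if bound < best:                  # pruning 2: nothing left can beat M
            break
        if entry.name in q_maxranges:     # pruning 1: the bound is exact
            best = max(best, bound)
            continue
        for child in entry.children:
            if child.name in q_ranges:
                c_cov = covered + child.fullcover
                heapq.heappush(heap, (-(c_cov + child.localmax), next(tie), c_cov, child))
        # An entry without children here would be a leaf to plane-sweep (omitted).
    return best


root = IndexEntry("rroot", 0, 9, [
    IndexEntry("r1", 0, 4),
    IndexEntry("r2", 1, 4),
    IndexEntry("r3", 2, 7, [
        IndexEntry("r31", 4, 3), IndexEntry("r32", 2, 3), IndexEntry("r33", 1, 2),
    ]),
])

# Assumed scenario: Q overlaps r1 and r2 only, and hits r2.maxrange.
m = ol_query(root, q_ranges={"rroot", "r1", "r2"}, q_maxranges={"r2"})
print(m)
```

In this scenario r2 is popped first with bound 0+1+4, which sets M, and r1 is then discarded because its bound of 4 is below M.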
[Figure: the example OL-tree.]
Sometimes, we need to examine a leaf node. Plane sweep it!
OL-tree → VOL-tree
• OL-tree is not practical
  – worst-case space complexity O(n²)
  – complex re-organization
• How to improve?
  – Only keep a few top levels of the OL-tree.
==> Virtual OL-tree!
VOL-tree
Example
If Q is here, perform range search on the R-tree.
Comparison with R-tree Approach
• The R-tree approach examines all nn_buffers intersecting with Q.
• By using a small, in-memory VOL-tree, the new approach can prune the search space.
Challenge
• With dynamic updates, maintaining localmax and maxrange is expensive.
[Figure: an nn_buffer inserted away from the current maxrange. To insert an nn_buffer here, recompute!]
Solution
• Index entry (range, fullcover, maxrange, lowermax, uppermax): localmax is replaced by lowermax and uppermax.
• lowermax ≤ localmax ≤ uppermax
• Any location in maxrange has overlap = lowermax.
• At a location outside maxrange, the overlap can be more than lowermax, but < uppermax.
Update
• Case 1: the new nn_buffer does not intersect with maxrange: increase uppermax.
• Case 2: it intersects with maxrange: increase uppermax and lowermax.
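The two update cases can be sketched as below. One step is an assumption not stated on the slide: in case 2 the sketch shrinks maxrange to its intersection with the new nn_buffer, so that every location left in maxrange still has overlap equal to lowermax. The `VolEntry` class and the rectangles are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi)


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Common area of two rectangles, or None if they do not overlap."""
    x_lo, y_lo = max(a[0], b[0]), max(a[1], b[1])
    x_hi, y_hi = min(a[2], b[2]), min(a[3], b[3])
    if x_lo < x_hi and y_lo < y_hi:
        return (x_lo, y_lo, x_hi, y_hi)
    return None


@dataclass
class VolEntry:
    maxrange: Rect
    lowermax: float
    uppermax: float

    def insert(self, buf: Rect, weight: float) -> None:
        """Account for a new nn_buffer that partly overlaps this entry's range."""
        self.uppermax += weight            # both cases: the bound can only grow
        common = intersection(self.maxrange, buf)
        if common is not None:             # case 2: maxrange is hit
            self.lowermax += weight
            self.maxrange = common         # assumption: keep only the covered part


entry = VolEntry(maxrange=(0, 0, 4, 4), lowermax=7, uppermax=7)
entry.insert((10, 10, 12, 12), 3)   # case 1: misses maxrange
after_case1 = (entry.maxrange, entry.lowermax, entry.uppermax)
entry.insert((2, 2, 6, 6), 2)       # case 2: overlaps maxrange
print(after_case1, entry)
```

Neither case requires a recomputation over the sub-tree, which is the point of trading the exact localmax for a pair of bounds.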
Query
• Similar to the OL-tree.
• To compute upper bound of max overlap, use uppermax.
• When Q intersects maxrange, may or may not prune.
Setup
• Digital Chart from the R-tree Portal.
  – O: 24,493 populated places.
  – S: 9,203 cultural landmarks.
• Page size: 1 KB. Buffer size: 256 pages.
• Object R-tree: 753 pages.
• Pentium IV Dell PC, 3.2 GHz.
• Java.
• Measure total I/O of 100 random queries.
[Chart: Size of the VOL-tree]
[Chart: Small Query Area]
[Chart: Large Query Area]
[Chart: Varying Buffer Size]
[Chart: Effect of Update]
Conclusions
• Introduced the optimal-location query.
• Proposed three solutions.
• The VOL-tree approach is the best.
• More improvement with larger query area. (5% query area = 6 times improvement.)
• More updates decrease the improvement. (50% updates = no improvement.) But can bulk-load.