Cost Modeling of Spatial Query Operators Using Nonparametric Regression
Songtao Jiang
Department of Computer Science
University of Vermont
October 10, 2003
Three Commonly used Spatial Operators
Range query: Range (reference object, range)
K nearest neighbor: KNN (reference object, number of neighbors)
Window query: Window (a rectangle)
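To make the three operator signatures concrete, here is a minimal brute-force sketch. It assumes spatial objects are 2D points and uses Euclidean distance; the function names are hypothetical, and a real system would typically evaluate these operators through a spatial index rather than a linear scan.

```python
import math

def range_query(points, ref, radius):
    """Return all points within `radius` of the reference point `ref`."""
    return [p for p in points if math.dist(p, ref) <= radius]

def knn_query(points, ref, k):
    """Return the k points closest to `ref`, nearest first."""
    return sorted(points, key=lambda p: math.dist(p, ref))[:k]

def window_query(points, x_left, y_bottom, x_right, y_top):
    """Return all points inside the axis-aligned rectangle (borders included)."""
    return [p for p in points
            if x_left <= p[0] <= x_right and y_bottom <= p[1] <= y_top]

# Illustrative data only (not from the presentation).
points = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0), (6.0, 8.0)]
print(range_query(points, (0.0, 0.0), 5.0))      # [(0.0, 0.0), (3.0, 4.0)]
print(knn_query(points, (0.0, 0.0), 2))          # [(0.0, 0.0), (3.0, 4.0)]
print(window_query(points, 2.0, 3.0, 7.0, 9.0))  # [(3.0, 4.0), (6.0, 8.0)]
```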
Our Approach
Training process
Building model
Cost variables
Range query: <x, y, distance>
Window query: <x_left, y_bottom, x_right, y_top>, where (x_left, y_bottom) is the lower-left corner and (x_right, y_top) is the upper-right corner
KNN: <x, y, number>
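One way to represent these cost variables in code is as small typed records that convert to plain numeric vectors for a regression model. This is only an illustrative sketch: the class names and the `as_vector` helper are hypothetical, and the field names simply mirror the tuples listed above.

```python
from typing import NamedTuple

class RangeVars(NamedTuple):
    x: float
    y: float
    distance: float

class WindowVars(NamedTuple):
    x_left: float
    y_bottom: float
    x_right: float
    y_top: float

class KnnVars(NamedTuple):
    x: float
    y: float
    number: int

def as_vector(cost_vars):
    """Flatten any of the records above into a list of floats."""
    return [float(c) for c in cost_vars]

def is_valid_window(w):
    """The lower-left corner must not lie above or right of the upper-right corner."""
    return w.x_left <= w.x_right and w.y_bottom <= w.y_top

print(as_vector(KnnVars(x=100.0, y=250.0, number=5)))   # [100.0, 250.0, 5.0]
print(is_valid_window(WindowVars(0.0, 0.0, 50.0, 20.0)))  # True
```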
Data sets
Real data set: a 500,000-meter by 300,000-meter two-dimensional space containing 15,000 spatial objects whose distribution is unknown (Urban Areas of Counties in the Pennsylvania State. URL: http://www.psu.edu/access/urban.shtml)
Synthetic data set: a 10,000-meter by 10,000-meter two-dimensional space containing 1,000 or 10,000 objects whose distribution is uniform or Gaussian.
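A synthetic data set of this shape could be generated roughly as follows. The space size and object counts come from the slide; everything else is an assumption made for illustration: objects are treated as points, the Gaussian is centered in the space with a standard deviation of one sixth of the side length, and samples falling outside the space are redrawn.

```python
import random

SIDE = 10_000.0  # side length of the square space in meters (from the slide)

def uniform_points(n, side=SIDE, seed=0):
    """Draw n points uniformly over the side x side square."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]

def gaussian_points(n, side=SIDE, seed=0):
    """Draw n points from a Gaussian centered in the square.

    Assumed parameters: mean = side / 2, standard deviation = side / 6.
    Points outside the square are rejected and redrawn.
    """
    rng = random.Random(seed)
    mu, sigma = side / 2.0, side / 6.0
    points = []
    while len(points) < n:
        x, y = rng.gauss(mu, sigma), rng.gauss(mu, sigma)
        if 0.0 <= x <= side and 0.0 <= y <= side:
            points.append((x, y))
    return points

sparse_uniform = uniform_points(1_000)
dense_gaussian = gaussian_points(10_000)
print(len(sparse_uniform), len(dense_gaussian))  # 1000 10000
```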
Urban area of Adams County, Pennsylvania
Statistical Model (an example)
Range query, Distance = 1000 meters
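The slides do not say which nonparametric estimator is used. As one common choice, the sketch below uses a Nadaraya-Watson kernel estimator with a Gaussian kernel: the predicted cost at a query location is a weighted average of the costs observed at training locations, with closer training points weighted more heavily. With the distance fixed (as in this example), the model input reduces to the reference location (x, y). The function name, bandwidth, and training values are all hypothetical.

```python
import math

def kernel_estimate(train_inputs, train_costs, query, bandwidth):
    """Nadaraya-Watson estimate of the cost at `query`.

    train_inputs: list of cost-variable vectors, e.g. (x, y)
    train_costs:  observed cost for each training vector
    bandwidth:    Gaussian kernel width, in the same units as the inputs
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, cost in zip(train_inputs, train_costs):
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi, query))
        weight = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        weighted_sum += weight * cost
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("query is too far from all training points")
    return weighted_sum / weight_total

# Made-up training samples: (x, y) of the reference object -> measured cost.
train_inputs = [(0.0, 0.0), (2000.0, 0.0)]
train_costs = [10.0, 20.0]
estimate = kernel_estimate(train_inputs, train_costs, (1000.0, 0.0), bandwidth=1000.0)
print(round(estimate, 6))  # 15.0, since the query is equidistant from both samples
```

The bandwidth controls how local the estimate is: a small value follows nearby training points closely, while a large value smooths toward the overall average cost.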
Results (1) Varying spatial operator
[Figure: two bar charts for the Gaussian data set comparing the Range, KNN, and Window operators. x-axis: relative CPU error (one chart) and relative IO error (the other), binned as <10%, 10%-20%, 20%-30%, 30%-40%, >40%; y-axis: percentage of testing points.]
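The result charts group testing points by relative error. A minimal sketch of that bookkeeping is shown below, assuming relative error is defined as |estimated - actual| / actual and that each bin includes its lower bound; neither detail is stated on the slides, and the function names and sample values are hypothetical.

```python
BIN_LABELS = ["<10%", "10%-20%", "20%-30%", "30%-40%", ">40%"]

def relative_error(estimated, actual):
    """Assumed definition: absolute error divided by the actual cost."""
    return abs(estimated - actual) / abs(actual)

def error_histogram(estimates, actuals):
    """Return the percentage of testing points falling into each error bin."""
    counts = [0] * len(BIN_LABELS)
    for est, act in zip(estimates, actuals):
        # Each bin is 10 percentage points wide; everything past 40% shares the last bin.
        index = min(int(relative_error(est, act) * 10), len(BIN_LABELS) - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no testing points supplied")
    return {label: 100.0 * c / total for label, c in zip(BIN_LABELS, counts)}

# Made-up estimated vs. measured costs for four testing points.
hist = error_histogram([105.0, 115.0, 150.0, 95.0], [100.0, 100.0, 100.0, 100.0])
print(hist)
```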
Results (2) Varying spatial data set density
[Figure: two bar charts for the range query operator comparing the denser and sparser data sets. x-axis: relative CPU error (one chart) and relative IO error (the other), binned as <10%, 10%-20%, 20%-30%, 30%-40%, >40%; y-axis: percentage of testing points.]
Results (3) Varying training data set size
[Figure: two bar charts for the range query operator comparing the large and small training data sets. x-axis: relative CPU error (one chart) and relative IO error (the other), binned as <10%, 10%-20%, 20%-30%, 30%-40%, >40%; y-axis: percentage of testing points.]
Conclusion
Accuracy
Easy to use
Time tolerance: training overhead is small