[vldb 2013] skyline operator on anti correlated distributions

Skyline Operator on Anti-correlated Distribution. Proceedings of the VLDB Endowment, Vol. 6, No. 9 (2013). Haichuan Shang, Masaru Kitsuregawa. Presenter: WooSung Choi ([email protected]), DataKnow. Lab, Korea Univ.

Upload: woosung-choi

Post on 13-Apr-2017


TRANSCRIPT

Page 1: [Vldb 2013] skyline operator on anti correlated distributions

Skyline Operator on Anti-correlated Distribution

Proceedings of the VLDB Endowment, Vol. 6, No. 9 (2013)
Haichuan Shang, Masaru Kitsuregawa

Presenter: WooSung Choi ([email protected])

DataKnow. Lab, Korea Univ.

Page 2: [Vldb 2013] skyline operator on anti correlated distributions

Background

Related work

Page 3: [Vldb 2013] skyline operator on anti correlated distributions

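Preliminaries
• Formal definition of Dominates ($\prec$)

Given a set of d-dimensional points

We say that a point $p$ DOMINATES another point $q$ if and only if $p$ is no worse than $q$ in every dimension and strictly better in at least one. Taking smaller values as preferred:

$\forall i \in \{1, \dots, d\}: p_i \le q_i$ and $\exists j \in \{1, \dots, d\}: p_j < q_j$

Denoted by $p \prec q$ (simply put, $p$ is trivially preferred over $q$)

Definition from http://www.comp.nus.edu.sg/~atung/publication/k_dominant.pdf

Note that the meaning of ‘dominates’ may differ according to the type of application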

www.caranddriver.com
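The dominance test on this slide can be sketched in a few lines. This is only an illustration: it assumes smaller values are preferred in every dimension (flip the comparisons for the opposite convention), and the function name `dominates` is hypothetical rather than taken from the paper.

```python
from typing import Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q, assuming smaller is better in every dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # p must be no worse than q everywhere ...
    no_worse = all(a <= b for a, b in zip(p, q))
    # ... and strictly better in at least one dimension.
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better
```

Two identical points do not dominate each other, and two points that each win in a different dimension are incomparable.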

Page 4: [Vldb 2013] skyline operator on anti correlated distributions

Formal Definition (Skyline)
• The Skyline operator

Input - Given a set of objects {

[Figure: objects A, B, C, D, E, F, G plotted against the x axis and y axis, with Dominating Area(B) marked]

Common misconceptions: “ ”, wrong; “, ”, correct

Page 5: [Vldb 2013] skyline operator on anti correlated distributions

Algorithm - Naïve 1

Suppose there are n objects in the given set

Naïve approach: Nested Loop Structure

Computational Cost - $O(n^2)$ dominance tests
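A minimal sketch of the nested-loop approach named on this slide, assuming smaller values are preferred in every dimension. The names `dominates` and `naive_skyline` are illustrative and not from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p dominates q, assuming smaller values are preferred."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def naive_skyline(points: Sequence[Point]) -> List[Point]:
    """Compare every object with every other object: up to n * (n - 1) dominance tests."""
    skyline: List[Point] = []
    for i, p in enumerate(points):
        dominated = False
        for j, q in enumerate(points):
            if i != j and dominates(q, p):
                dominated = True
                break  # one dominator is enough to discard p
        if not dominated:
            skyline.append(p)
    return skyline
```

For example, among `(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)` the last two points are dominated (by `(2, 3)`), so the first three form the skyline.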

Page 6: [Vldb 2013] skyline operator on anti correlated distributions

Motivation

Data Distribution

Page 7: [Vldb 2013] skyline operator on anti correlated distributions

Data Distribution?

Page 8: [Vldb 2013] skyline operator on anti correlated distributions

Related Work: Summary
• Worst-case Analysis (2.1)
  Worst-case complexity on arbitrary data distributions [16], [12]
• Elimination Category (2.2)
  Average complexity with dimensional independence
  Idea: Eliminate non-skyline objects quickly!
  BNL [7], SFS [9], LESS [12], …
  Bound in [20] is expressed in terms of the skyline cardinality
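The elimination idea can be illustrated with a simplified sketch loosely modeled on the presorting idea behind sort-then-filter algorithms such as SFS. It is not the cited authors' implementation: it assumes smaller values are preferred, uses the coordinate sum as the monotone sort key, and the name `presorted_skyline` is hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p dominates q, assuming smaller values are preferred."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def presorted_skyline(points: Sequence[Point]) -> List[Point]:
    """Sort so that every dominator precedes the points it dominates, then filter.

    After sorting by (coordinate sum, point), a point can only be dominated by
    something visited earlier, so each point is tested against the current
    skyline only and an accepted point is never removed again.
    """
    skyline: List[Point] = []
    for p in sorted(points, key=lambda t: (sum(t), t)):
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline
```

Each point costs at most one pass over the current skyline, so the work grows with the skyline cardinality: cheap when few points survive, and close to the nested-loop cost when most of them do.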

Page 9: [Vldb 2013] skyline operator on anti correlated distributions

Why is anti-correlation important?

Page 10: [Vldb 2013] skyline operator on anti correlated distributions

Anti-Correlated (2)
• A relationship in which the value in one dimension increases as the values in the other dimensions decrease
• Skyline queries are used to find a set of non-dominated data points for Multi-Criteria Decision Making
• Data in the real world is more likely to be anti-correlated

Page 11: [Vldb 2013] skyline operator on anti correlated distributions

Anti-Correlated (3)
• The anti-correlation significantly limits the practical usage of the existing algorithms
• and yields the demand for effective mathematical models and efficient algorithms on anti-correlated data

The skyline cardinality, which the bound in [20] depends on, tends to increase on anti-correlated distributions

These existing algorithms fall back to

Page 12: [Vldb 2013] skyline operator on anti correlated distributions

What does this research set out to do?

Contributions

Page 13: [Vldb 2013] skyline operator on anti correlated distributions

Contribution
• 1) General model for the anti-correlated distribution
• 2) Polynomial estimation of the lower bound of the expected value of skyline cardinality
• 3) A “Determination and Elimination Framework” for efficient computation of skyline on anti-correlated distribution

Page 14: [Vldb 2013] skyline operator on anti correlated distributions

3. PRELIMINARIES

Definition & Expectation of Skyline Cardinality

Page 15: [Vldb 2013] skyline operator on anti correlated distributions

Model: Anti-Correlated Distribution

[Figure: three scatter plots with panels titled Uniform, Anti c=1, and Anti c=0.1]

1) General model for the anti-correlated distribution

Page 16: [Vldb 2013] skyline operator on anti correlated distributions

1K Tuples

[Figure: three scatter plots with panels titled Uniform, Anti c=1, and Anti c=0.1]

Skyline cardinality: 12 (Uniform), 57 (Anti c=1), 116 (Anti c=0.1)

1) General model for the anti-correlated distribution
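The effect shown on this slide, a larger skyline on anti-correlated data than on uniform data, can be explored with a small experiment. The slides do not spell out the paper's model, so the generator below is an assumption: it draws points uniformly from the band 1 <= x + y <= 1 + width inside the unit square, and `width` is a hypothetical stand-in that is not claimed to equal the paper's c. Smaller values are assumed to be preferred, and the counts will not reproduce the slide's numbers exactly.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def dominates(p: Point, q: Point) -> bool:
    """True if p dominates q, assuming smaller values are preferred."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Presort so that any dominator of a point is visited before it, then filter."""
    result: List[Point] = []
    for p in sorted(points, key=lambda t: (sum(t), t)):
        if not any(dominates(s, p) for s in result):
            result.append(p)
    return result


def sample_uniform(n: int, rng: random.Random) -> List[Point]:
    """n points drawn uniformly from the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def sample_band(n: int, width: float, rng: random.Random) -> List[Point]:
    """Rejection-sample n points with 1 <= x + y <= 1 + width (assumed model)."""
    if not 0.0 < width <= 1.0:
        raise ValueError("width must be in (0, 1]")
    points: List[Point] = []
    while len(points) < n:
        x, y = rng.random(), rng.random()
        if 1.0 <= x + y <= 1.0 + width:
            points.append((x, y))
    return points


if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed so repeated runs give the same counts
    n = 1000
    print("uniform   :", len(skyline(sample_uniform(n, rng))))
    print("band w=1  :", len(skyline(sample_band(n, 1.0, rng))))
    print("band w=0.1:", len(skyline(sample_band(n, 0.1, rng))))
```

A narrower band concentrates the points near the line x + y = 1, where few of them dominate each other, so the skyline is expected to grow as `width` shrinks.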

Page 17: [Vldb 2013] skyline operator on anti correlated distributions

1K Tuples

[Figure: scatter plot with panel titled Anti c=1; observed skyline cardinality: 57]

Estimated skyline cardinality (d = 2, n = 1000, c = 1): $\sqrt{1000\,\pi} - 1 \approx 55.0499122$

2) Polynomial Estimation of the lower bound of the expected value of skyline cardinality
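The arithmetic on this slide is easy to check numerically. In the sketch below, `estimate_2d` is a hypothetical helper: the slide only gives the instance for 1000 tuples, and treating the tuple count as a free parameter `n` is an assumption made for illustration.

```python
import math


def estimate_2d(n: int) -> float:
    """Evaluate sqrt(n * pi) - 1, the expression shown on the slide for n = 1000.

    Generalising from 1000 to an arbitrary n is an assumption for illustration.
    """
    return math.sqrt(n * math.pi) - 1.0


if __name__ == "__main__":
    # Compare with the 57 skyline points observed on the slide for Anti c=1.
    print(estimate_2d(1000))
```

The value is a little below the 57 skyline points observed on the slide, which fits its role as an estimate of a lower bound.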

Page 18: [Vldb 2013] skyline operator on anti correlated distributions

Generalization
• Theorem 3

The expected value of the skyline cardinality

when
• Where

2) Polynomial Estimation of the lower bound of the expected value of skyline cardinality

The skyline cardinality, which the bound in [20] depends on, tends to increase on anti-correlated distributions

These existing algorithms: ) ~

Page 19: [Vldb 2013] skyline operator on anti correlated distributions

Pearson Correlation Coefficient or covariance-based model

Page 20: [Vldb 2013] skyline operator on anti correlated distributions

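Covariance
• In probability theory and statistics, covariance is a value that indicates the degree of correlation between two random variables
• If, when one of the two variables tends to increase, the other also tends to increase, the covariance is positive
• Conversely, if, when one of the two variables tends to increase, the other tends to decrease, the covariance is negative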
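The sign behaviour described on this slide can be checked with a short sketch. It uses the population covariance (dividing by n) and the Pearson correlation coefficient derived from it. The helper names and the sample data are illustrative only.

```python
from math import sqrt
from typing import Sequence


def covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Population covariance: mean of the products of deviations from the means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    rising = [2.0, 4.0, 6.0, 8.0]   # rises with xs: positive covariance
    falling = [8.0, 6.0, 4.0, 2.0]  # falls as xs rises: negative covariance
    print(covariance(xs, rising), pearson(xs, rising))
    print(covariance(xs, falling), pearson(xs, falling))
```

Anti-correlated data corresponds to the second case: the covariance is negative and the Pearson coefficient approaches -1 as the relationship gets closer to a straight descending line.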