relaxing join selection & queries nick koudas et al

43
Relaxing Join Selection & Queries Nick Koudas et al Presented by Pradnya Chavarkar

Upload: december

Post on 17-Jan-2016

51 views

Category:

Documents


0 download

DESCRIPTION

Relaxing Join Selection & Queries Nick Koudas et al. Presented by Pradnya Chavarkar. Motivation. Complex queries in terms of predicates against large databases. Right parameter values not known. Parameter adjustment becomes a trial-and-error method. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join Selection & Queries Nick Koudas et al

Presented by Pradnya Chavarkar

Page 2: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

2

Motivation

Complex queries in terms of predicates against large databases.

Right parameter values not known. Parameter adjustment becomes a trial-

and-error method. More the predicates, difficult the

adjustment

Page 3: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

3

Example

Page 4: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

4

Example

How many conditions to adjust?? How much to adjust?? No. of choices exponential in number of

conditions

Page 5: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

5

Outline

Relaxation framework Relaxing select conditions Relaxing All conditions Lattice traversal Experimental results Conclusion

Page 6: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

6

Query Relaxation

Top–K queriesWeights given to each condition

Skyline Points which are not dominated by any

other points in the given set

Page 7: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

7

Relaxation Framework

Selection Condition

Join Condition

Page 8: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

8

Relaxation Skyline

Page 9: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

9

Simple Solution !

Compute Skyline on Jobs Compute Skyline on Candidates Join the skylines

Page 10: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

10

Simple Solution ! (Contd)

Incorrect results ! Computes skyline locally

Page 11: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

11

Relaxing Selection Conditions Many joins are on Identifier attributes Relaxation skyline should satisfy :

Page 12: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

12

Join First (JF)

Join without select conditions Compute skyline for select conditions

on the resulting tuples May not be efficient for joins which

return a large number of pairs

Page 13: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

13

Pruning Join (PJ)

Compute skyline during join step Assuming index on relation S For each tuple of R, find joining tuples

from S Call ‘update’ for each such pair and

current skyline Discard the pair if it is already

dominated else discard the pairs dominated by the current pair

Page 14: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

14

Pruning Join (PJ) (Contd.)

Advantage: Produces less pairs after join step

as compared to Join First Disadvantage

Need for modification of join methods

Page 15: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

15

Pruning Join+ (PJ+)

Modifies pruning Join Algorithm Computes “local skylines” for records

joining with a record of othe relation Does Dominance checking within local

skyline If a pair is locally dominated then it will

not be in global skyline

Page 16: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

16

Pruning Join+ (PJ+) (Contd)

R (A, B, C)S (C, D, E)

Page 17: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

17

Pruning Join+ (PJ+) (Contd)

Does local pruning hoping to eliminate some S tuples beforehand

May not be always beneficial, when many tuples are not eliminated

Experimental results show that the tradeoff depends upon factors like number of conditions

Page 18: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

18

Sorted Access Join (SAJ)

Constructs sorted list of tuple IDs for each select condition based on relaxation

Go through the lists in round robin fashion

For each record in Li find records which can join -- (p,q)

Stop if all the current records in list Lj (i≠j), have relaxation greater that (p,q)

Page 19: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

19

Sorted Access Join (SAJ) (Contd)

Page 20: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

20

Relaxing All Conditions

Relaxing join conditions along with selection conditions

Assumes multidimensional indexed structure on selection and join conditions

R-Tree on both relations

Page 21: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

21

MIDIR

Traverses both R-Trees top down to identify potential pairs which can be in relaxation skyline

Push such pairs in queue In each iteration, pop one pair and

perform dominance checking If object-object pair, call Update()

Page 22: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

22

Dominance checking in MIDIR

Compute lower and upper bound on relaxation Pair p’ dominates p if for each condition Ci

To compute the lower and upper boundfor Ci (selection condition) convert itinto interval

Page 23: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

23

Dominance checking in MIDIR (Contd) B: MRB or tuple A: attribute used in B I(B,A): interval of A in B

Selection Condition:

Join Condition

Page 24: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

24

Dominance checking in MIDIR (Contd) Minimum distance between two intervals

Then MINDIST = 0

else MINDIST =

Page 25: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

25

Variants of Query Relaxation

Top-k answers User can specify weights on the conditions K points that have smallest weighted

summation Store the k best skyline answers Modify Update() in PJ Priority queues in MIDIR

Page 26: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

26

Variants of Query Relaxation (Contd)

Queries with multiple joins JF – compute all joins beforehand SAJ – access multiple index structures to

find the joining tuples MIDIR – queue maintains possible vectors

Relaxation of nonnumeric attributes RELAX(r,c) an be modified All algorithms can refer to this computed

value

Page 27: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

27

Variants of Query Relaxation (Contd)

Lattice Structure

Page 28: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

28

Lattice Traversal At least one result should be returned Examine the reasons for the empty

result – Join condition Cj is strict

Relax Cj with the tuples provided by applying selection conditions

Either CR or Cs is strict Relax the strict selection condition If join still strict then relax join with either of the

selections

Page 29: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

29

Lattice Traversal (Contd)

Both CR or Cs are strict Relax both selection conditions If join condition still strict, relax join condition If still no result, relax all togather

Page 30: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

30

Experiments

Experimental Settings Two real and three synthetic dabases Three types of correlations

Independent Correlated Anti-correlated

Page 31: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

31

Experiments (different data size)

Page 32: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

32

Experiments (different data sizes)

Page 33: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

33

Experiments (Different join cardinality)

Page 34: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

34

Experiments (Different selection conditions)

Page 35: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

35

Summary

JF and PJ Almost same performance for relaxation of

selection conditions PJ and PJ+

PJ works faster, but performance gets affected by join cardinality

SAJ Efficient for correlated data and query

conditions

Page 36: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

36

Conclusion

Efficient relaxation algorithm for relaxing join and selection queries

Lattice traversal method for minimal relaxation

Page 37: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

37

Related Work (ML for query relaxation ) LOQR algorithm proposed which is for

queries in disjunctive normal form Three steps

Exacting domain knowledge Finding the most useful rule Relaxing the failing query

Page 38: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

38

Introduction Let the query by the user be

The query fails because Laptops with large screen weights more

than 3 pounds Fast laptops with large HDD cost more than

$2000

Page 39: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

39

Extracting domain knowledge Take subset D of the target database Find “typical values” of the other attributes for

each constraints Forms a new dataset Di, with values of the

constraint attribute indicated a Boolean

Page 40: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

40

Extracting domain knowledge (contd)

Page 41: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

41

Finding the most useful rule

For rules formed using all the constraints individually, find the most similar rule by nearest neighbour search

R1 is the most similar as they differ only on the price attribute

It is guaranteed to give a result as it gives a result on the subset of the target database

Page 42: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

42

Relaxing the failing query

Most similar query is combined with the original query to give the relaxed query

Page 43: Relaxing Join Selection &  Queries  Nick Koudas et al

Relaxing Join and Selection Queries

43

References

Relaxing Join and selection QueriesNick Koudas, Chen Li, and Anthony K. H. Tung, Rares Vernica

Machine learning for online query relaxationIon Muslea