deepquery - university of washington...deepquery deep reinforcement learning for query optimization...

DeepQueryDeep Reinforcement Learning for Query Optimization

1

Jennifer Ortiz (UW), Magdalena Balazinska (UW), Johannes Gehrke (Microsoft) and S. Sathiya Keerthi (Criteo Reserach)

Query Optimization is an Old Problem

2

1979

Query Optimization Is Still Hard Today

3

2015

Two Key Query Optimization Observations

Leis et al observe (among other) that:

• Accuracy of cardinality estimation is critical to good plans• Especially hard in the presence of correlations between tables

• Risk associated with certain plan choices should be considered• E.g., nested-loops are seldom beneficial without an index

4

Active Research Areas

5

Cardinality Estim ation Histogram Sam pling

627

39

433

6

Wavelets Synopses

Recent Success of Deep Learning

7Examples & Images (except indexing) from: https://machinelearningmastery.com/inspirational-applications-deep-learning/

Black-and-white im age colorization Text and im age translation

Object detection and classification

Autom atic gam e playing Indexing! [SIGM OD’18]

Our Vision

8

Can we rethink query optimization in the context of deep learning?

Data

SQL Query

Model

TS

⋈⋈

U

Query Plan

Our Vision

9

Data

SQL QueryTS

⋈⋈

U

Query Plan

…

…

Deep Learning Model

How to encode data and a

query?

Can we rethink query optimization in the context of deep learning?

W hat type of m odel do we

need?

Can Deep Learning Help Query Opt?

• Can we automatically learn a query (subquery) representation?• That suffices to predict the query cardinality?• That can serve to derive the representation for more complex queries?

• Can we use the learned representation to find good plans?• Combine it with reinforcement learning to incrementally build plans

10

A Deep Learning Model for Query Optimization

11

Deep Model

Data

SQLQuery

Input Vector

Basic Statistics:1D-Histogram s

Encode Join PredicatesEncode Selections

Need Ability to Encode all

Possible Queries!

Learning Subquery Representations• Use deep learning to capture properties of intermediate relations

12

ℎ"

#"ℎ"$%

State t+1State t

RepresentationofDatabaseProperties

Queryoperationattimet

SubqueryRepresentation

⋈State t+2

⋈⋈

Queryoperationattimet+1

StateTransitionFunction ""#$


Action t Action t+1

%&'%&(&)*+*,-)+**,/&*

%&'%&(&)*+*,-)+**,/&*+1

+0*,-)+**,/&*

! ⋈

A Recursive Function

ℎ"

#"ℎ"$%

ObservedVariables

…

2234

2256789:8;

13

Use cardinality as an observed variable

Query Operation

Prior subquery representation

Learning the Recursive Function• One approach is to train a recurrent neural network (RNN)

14

!"

ℎ" ℎ$

!$

%&'()*+,.,/+0*,1

ℎ2

!2

ℎ3

!3

ℎ4

!4

Preliminary Experiments• Synthetic database generated using the PDGF Framework

• 8 relations (1 fact table and 7 dimension tables)• Data size ~1.6 GB

• 8k random queries• 6k for training/2k for testing

• Select-join queries• All queries have 5 selection and/or joins

• Implementation in Tensorflow

15

• Predictions compared to Postgres estimates• First action is either a selection or a join

16

Results

First Action Second Action

• Need to improve predictions for later actions

17

Results

Fifth Action

Can Deep Learning Help Query Opt?

• Can we automatically learn a query (subquery) representation?• That suffices to predict the query cardinality?• That can serve to derive the representation for more complex queries?

• Can we use the learned representation to find good plans?• Combine it with reinforcement learning to incrementally build plans

18

How to Find Good Query Plans?• Use Subquery Representations with Reinforcement Learning

• Given: A subquery representation and a set of possible actions, which is best?

19

State t+1


⋈ AAction 1

? ⋈ #Action 2

⋈ $Action 3

Reinforcement Learning

20

AgentAction

Observe State

Reward

Environment

21

AgentAction

Observe State

Reward

Environm ent

State 0

State 2

State 3

action 2

action 2

reward

reward

reward

Q(s3,a2)

Q(s2,a2)

action 1

Q(s0,a1)State 1

Reinforcement Learning: Q-learning

Reinforcement Learning with Subquery Representations

22

state 0

Database

state 2state 1

action 2: join operation (S,T) action 1: join operation (U,T)

reward: -cardinality

TS

⋈TU

⋈

Q(s0,a2)Q(s0,a1)

"

Q-Learning

23

state 0

Database

state 2state 1

TS

⋈TU

⋈

Q(s0,a2)Q(s0,a1)

Value-based iteration

state 6

Q(s2,a2)

state 4

Q(s1,a1)TS

⋈⋈

Ustate 5

Q(s2,a1)

TS

⋈⋈

R

TU

⋈⋈

S

"

Q-Learning

24

state 0

Database

state 1

TS

⋈TU

⋈

Q(s0,a2)Q(s0,a1)

state 6

Q(s2,a2)

state 4

Q(s1,a1)TS

⋈⋈

Ustate 5

Q(s2,a1)

TS

⋈⋈

R

TU

⋈⋈

S

"

state 2

Value-based iteration

Project Status

• Short paper outlines the vision: Learning State Representations for Query Optimization with Deep Reinforcement Learning. Jennifer Ortiz, Magdalena Balazinska, Johannes Gehrke, and S. Sathiya KeerthiDEEM Workshop @ SIGMOD 2018

• Experimenting with different approaches for both components of the problem

• Expect more results soon

25

Related Work• The Case for Learned Index Structures. Kraska, et. al.

• Use models to replace/enhance indexes on single attributes

• Deep Reinforcement Learning for Join Order Enumeration Marcus, et. al.

• Using RL with Postgres estimates as the reward.

26

Conclusion

• Query optimization remains a hard problem• Can we use deep learning and reinforcement learning to improve

• Cardinality estimation?• Query plan selection?

• DeepQuery: Deep reinforcement learning for query optimization• Learn subquery representations through a recursive function• Use reinforcement learning for join enumeration• Preliminary results are promising

27

deepquery - university of washington...deepquery deep reinforcement learning for query optimization...

Documents