semantics-aware prediction for analytic queries in ...€¦ · and final output n developed a...

Semantics-Aware Prediction for Analytic Queries in MapReduce Environment

Weikuan Yu, Zhuo Liu, Xiaoning Ding

Florida State UniversityAuburn University

New Jersey Institute of Technology

P2S2: S2

Backgroundn MapReduce is a popular data-centric programmingmodel.

n Hive and Pig are popular data warehouse systems.Ø More than 40% of Hadoop jobs at Yahoo! were Pigprograms back in 2009, more with Hive now.

Ø In Facebook, 95% of MR jobs are generated by Hive.n In Hive, each SQL query is compiled and translatedinto a DAG (Directed Acyclic Graph) of MapReducejobs with inner-dependencies.

Source: ICDE’10 by Facebook

AGG(J1)

Join(J3)

Join(J2)

AGG(J4)

P2S2: S3

Motivationn Semantic gap: between Hive and Hadoop

Ø Hadoop is un-aware of such dependency and inter-jobrelationship, just treating all jobs as the same.

Ø Without such awareness, it will be difficult for Hadoop toschedule jobs that belong to a query efficiently.

n Problems:Ø Suboptimal query response timeØ Unfairness among queries

P2S2: S4

Efficiency issue for queries with varied sizes

n In this test, QA, QB and QC are issued in sequence, where QB is a large query.

n Under HCS, interleaved execution happens among queries’ jobs.

AGG(J1)

Join(J3)

Join(J2)

AGG(J4)

AGG(J1)

Sort(J2)

QB

QC

AGG(J1)

Sort(J2)QA

QC J2QBJ2QC

J1QBJ1 QAJ2

QAJ1

QBJ3

QBJ4

Resource Allocation

P2S2: S5

Query Delays shown by GANTT Chart

n QA arrives first with its J1 job, QB and QC afterwards with their jobs listed accordingly.

n QA-J2 gets delayed by QB-J1. QC-J2 gets delayed by QB-J2.n Query response time can be improved if the scheduler is aware of

the query semantics, therefore the relationship among the jobs.

0 100 200 300 400 500 600 700

QA

QB

QC

Time Elapse (seconds)

J1 J2 J3 J4

J1

J1

J2

J2ExecutionMap StallReduce Stall

P2S2: S6

Semantics-Aware Query Predictionn Three main techniquesØ Semantics extraction (DAG, operator type, predicates, etc.)Ø Selectivity EstimationØ Query prediction

JobTracker

TaskTracker

Hadoop JobListener

Semantics Extraction

TaskTracker TaskTracker

Queryprediction

Selectivity Estimation

Execution Engine

Parser

Semantics Analyzer

HiveQL Queries

Job &Semantics Results

Hive

P2S2: S7

n Selectivity estimationØ Predict each job node’s intermediate (Med) and output (Out) data sizesrecursively along the DAG (from bottom to top).

Ø For different job types, e.g., Groupby, Join, Select, we reply on certainformulas and offline-built histograms to estimate their selectivities.

n Logic: Selectivity estimation=>Job/query resource estimation andtime modeling =>Used for efficient query scheduling

Selectivity estimation for a query’s jobs

AGG

Join

Join

SORT

T1 T2 T3Med

Out

Map

Reduce

In(T1)

P2S2: S8

Selectivity Estimationn IS is used for estimating MOF size

Ø IS = DMed/DIn

n Final Selectivity (FS) is defined as:Ø FS = |Out|*WOut/DIn

n Predicate selectivity (ratio of selected rows to input rows)Ø Spred = |Med| /|In|

n Projection selectivity (ratio of selected cols to tuple width)Ø Sproj = ∑"#$%ℎ'() * +,),'-,. /01234"#$%ℎ

P2S2: S9

Intermediate Selectivity - ISØ For extract job such as select and order by,

IS = Spred * Sproj

Ø For join job: IS = Spred 1 * Sproj 1 *r1+Spred 2 * Sproj 2 *(1-r1)

Ø Groupby can involve local combine: IS = Scomb * SprojØ For clustered keys,

Scomb = min(1, !.#$%! ∗'()*#)* Spred = min(Spred,

!.#$%! )

Ø For randomly distributed keys, Scomb= min(Spred,

!.#$%! /N-.(/

)

P2S2: S10

Final Selectivity – Outputn For extract job,

Ø For “top k” job, |Out|=min(|In|, k)Ø For “order by” job, |Out|=|In|

n For groupby job, Ø |Out|= min(|T|*Spred, T.dxy)

n For join job, Ø Equ-join with uniform keys:

|Out| = |T1⋈ T2| = |T1|∗|T2|∗ &'()(+&.-., +0.-.)

Ø Chained joins: |Out| = |T1. pred1 ⋈ T2. pred2 ⋈ T3. pred3|

=Spred 1∗Spred 2∗Spred 3 max(|T1|, |T2|, |T3|)

P2S2: S11

An example for selectivity estimationn Predict jobs’ selectivity recursively in a query.

Pred

Group by

n

25 24

800000

10000 10000

9600 9600

800000

768000 768000 200000

Job 1 Job 2 Job 3

|Out|= 0.96*25*10000 *1/max(25,25) |Out|= 0.96 * max(25, 10000, 800000) |Out| = min(768000,200000)

sps

n⋈sMED2

MED1

Join

MED2

MED1

Join reslMED1n⋈s⋈ps

P2S2: S12

Multivariate Time Predictionn List of Considered Input Features

Ø OperatorsØ Input DataØ Output DataØ Data Growth

P2S2: S13

Job Time Prediction ModelØ Model job execution time based on selectivity estimation

Ø Training on over 5647 MR jobs, about 1000 queries from TPC-DS and TPC-H of different scales.

Ø ! is trained for extract, groupby and join jobs respectively.

P2S2: S14

Task Time Prediction ModelØ Data size:

Ø TDIn_i and TDOut_i

Ø The predicted time for the i-th task: ETi

Ø ETi = k0 + k1 TDIn_i + k2 TDOut_i + k3 * P (1-P) TDIn_i

P2S2: S15

Scheduling with Semantics Awarenessn Semantics-Aware Resource Demand

Ø Weight Resource Demand (WRD): aggregate the demand from all map tasks (MTi) and Reduce tasks (RTi)

Ø WRD = ∑(#$% ∗ '#% ) + ∑(($% ∗ '(%)

n Experimented with a simple greedy scheduling policyØ Prioritizing smallest Queries for fast turnaroundØ Smallest WRD First (SWRD) query scheduling

P2S2: S16

Evaluation setupn Benchmarks.

Ø Built with TPC-H, TPC-DS queries and Terasort/Grep/Wordcount MapReduce jobs.

Ø Submitted in Poisson interval

n MetricsØ Accuracy of the prediction via semantics awarenessØ Efficiency: query execution time

P2S2: S17

Estimation of Job Execution timen Accuracy

Ø On average, 13.98% error rate for the test set of jobs.

Job Time Estimation

P2S2: S18

Estimation of Task Executionn Map Task Execution Time

Ø Join operators lead to lower accuracy

n Reduce Task Execution Time

P2S2: S19

Validation of job and query time estimationn Predicted time accuracy for queries

Ø Error rate is 8.3% on average for 22 100G TPC-H queries.

Query Time Estimation

0

200

400

600

800

1000

1200

1400

Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 Q21 Q22

Que

ry R

espo

nse

Tim

e (s

ec) Actual Estm

P2S2: S20

Benefits of Semantics-Aware Schedulingn Execution time of queries

Ø Compared to HFS, SWRD improves the execution of Bing and Facebook workloads by 44% and 40%, respectively.

Ø Compared to HCS, SWRD improves by 27.4% and 72.8%, respectively.

0"

400"

800"

1200"

1600"

2000"

Bing"" Facebook"

Execu&

on)Tim

e)(secon

ds)) HFS" HCS" SWRD"

P2S2: S21

Conclusion and Future Workn Introduced cross-layer semantics extraction and percolation to

increase the semantics awareness of the Hadoop job schedulern Formalized the estimation of selectivity for intermediate data

and final outputn Developed a multivariate prediction model for job and task

execution time and validated the accuracyØ Leveraged semantics awareness for efficient query scheduling in HIVE

n Plan to pursue further integration of semantics awareness in complex query scheduling and other data analytics systems.

P2S2: S22

Acknowledgement

semantics-aware prediction for analytic queries in ...€¦ · and final output n developed a...

Documents