![Page 1: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/1.jpg)
Dynamically Optimizing Queries over Large Scale Data Platforms
Konstantinos Karanasos♯, Andrey Balmin§, Marcel Kutsch♣, Fatma Özcan*, Vuk Ercegovac◊, Chunyang Xia♦, Jesse Jackson♦
Inria SaclayNovember 26, 2014
♯Microsoft *IBM Research §Platfora ♣Apple ◊Google ♦IBM
[Work done at IBM Almaden Research Center]
![Page 2: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/2.jpg)
2
Big Data Platforms
HiveQL
Pig
DryadLINQ
Spark
Stratosphere
Jaql
SQL
unstructured
semi-structurednested
structured
data streamsrelational
Languages
HadoopHive/Stinger
Jaql
StratosphereSpark
Impala
HAWQDryad
HadaptPolybase
DrillNeed for efficient Big Data management
Challenging due to size and heterogeneity of data, variety of applications
The Big Data Landscape
Query optimization is crucial
![Page 3: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/3.jpg)
3
Query Optimization in Large Scale Data Platforms
• Existing challenges• Exponential error propagation in joins• Correlations between predicates
• “New” challenges• Prominent use of UDFs• Complex data types (arrays, maps, structs)• Poor statistics (do we own the data?)• Bad plans over Big data may be disastrous• Exploit cluster’s resources (parallel execution)
Traditional static techniques are not sufficient
We introduce dynamic techniques that are:• at least as good as and • up to 2x (4x) better than the best hand-written left-deep Jaql (Hive) plans
![Page 4: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/4.jpg)
4
SELECT <projection list> FROM ( SELECT <projection list> FROM "PART", "SUPPLIER", "LINEITEM", "PARTSUPP", "ORDERS", "NATION" WHERE <join conditions> AND "PART"."p_name" LIKE '%green%' AND "ORDERS"."o_orderdate" BETWEEN '1995-01-01' AND '1995-07-01' AND "ORDERS"."o_orderstatus"='P' AND udf("PARTSUPP"."ps_partkey") < 0.001 AND <udf list>) "PROFIT"GROUP BY "PROFIT"."NATION", "PROFIT"."order_YEAR"ORDER BY "PROFIT"."NATION" ASC, "PROFIT"."order_YEAR" DESC;
5-way join
external UDFs
correlated predicates
Example: TPCH Q9’
![Page 5: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/5.jpg)
5
“SQL” Processing in Large Scale Platforms• Relational operators -> MapReduce jobs• Two join algorithms:
• Repartition join (RJ) – 1MR job (default)• Memory join (MJ) – map-only job
• Optimizations based on rewrite rules and hints• RJ -> MJ• Chain MJs (multiple joins in one map job)
• Left-deep plans• This is the picture for Jaql (and Hive)
![Page 6: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/6.jpg)
6
Limitations• No selectivity estimation for predicates/UDFs
• Conservative application of memory joins• No cost-based join enumeration
• Rely on order of relations in FROM clause• Left-deep plans
• Often close to optimal for centralized settings• Not sufficient for distributed query processing
![Page 7: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/7.jpg)
7
p
l
Best left-deep hand-written Jaql plan
RJ
o
RJ ps
RJ
Best relational optimizer plan
MJ
s nudf(ps)
udf(o)
udf(p)
udf(o,l)
RJ
s
RJ l
RJ
RJ
RJ
udf(o,l)
p
udf(p)
n
MJ
o
udf(o)
ps
udf(ps)
TPCH Q9’: Execution Plans
![Page 8: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/8.jpg)
8
Dynamic Optimization
• Key idea: alter execution plan at runtime• Studied in the relational setting
• Both centralized and distributed• Basic concern: when to break the pipeline?• No emphasis on UDFs and correlated predicates
• Increasingly being used in large scale platforms(e.g., Scope, Shark, Hive)
Goal: dynamic optimization techniques for large scale data platforms (implemented in Jaql)
![Page 9: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/9.jpg)
9
IBM BigInsights Jaql
Key differentiators• Functions:
reusability + abstraction• Physical Transparency:
precise control when needed• Data model:
semi-structured based on JSON
Dataflows for conceptual JSON data
Flexible scripting language
Scalable map-reduce runtime
Fault Tolerant DFS
Jaql
Map
Jaql
Reduce
Jaql
MapJaql
Reduce
Jaql
Map
![Page 10: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/10.jpg)
10
Jaql Script: Example
transform group by writeread
read(hdfs("reviews")) -> transform { pid: $.placeid, rev: sentAn($.review) }-> group by p = ($.pid) as r into { pid: p, revs: r.rev } -> write(hdfs("group-reviews"))
[ { pid: 12, revs: [ 3*, 4*, … ] }, { pid: 19, revs: [ 2*, 1*, … ] }]
Query Data
Group user reviews by place
![Page 11: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/11.jpg)
11
Jaql to MapReduce
mapReduce( input: { type: hdfs, location: "reviews" }, output: { type: hdfs, location: "group-reviews" }, map: fn($mapIn) ( $mapIn -> transform { pid: $.placeid, rev: sentAn($.review) } -> transform [ $.placeid, $.rev ] ), reduce: fn($p, $r) ( [ pid: $p, revs: $r ] ) )
• Functions as parameters• Rewritten script is valid
Jaql!
read(hdfs("reviews")) -> transform { pid: $.placeid, rev: sentAn($.review) }-> group by p = ($.pid) as r into { pid: p, revs: r.rev } -> write(hdfs("group-reviews"))
Rewrite Engine
![Page 12: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/12.jpg)
12
Outline
• Introduction• System Architecture
• Pilot Runs• Adaptation of Execution Plans
• Experiments• Conclusion
![Page 13: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/13.jpg)
13
DynO Architecture
Query
Query result
Jaql plan
Optimizer (join enumeration)
Jaql compiler
Jaql runtime
MapReduce
Statistics DFS
join query
blocks
execute part of the plan
best plan
pilot runs
remaining plan
1
2
3
4
8
5
6
7
![Page 14: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/14.jpg)
14
• PilR algorithm:• Push-down selections/UDFs• Get leaf expressions (scans + local predicates)• Transform them to map-only jobs• Execute them over random splits of each relation• Until k tuples are output• Collect statistics during execution
• Parallel execution of pilot runs (~4.5x speedup)• Approx. 3% overhead to the execution• Performance speedup of up to 2x (4x) for Jaql (Hive)
Pilot Runs
![Page 15: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/15.jpg)
15
p
l
Best left-deep hand-written Jaql plan
RJ
o
RJ ps
RJ
Best relational optimizer plan
MJ
s nudf(ps)
udf(o)
udf(p)
udf(o,l)
RJ
s
RJ l
RJ
RJ
RJ
udf(o,l)
p
udf(p)
n
MJ
o
udf(o)
ps
udf(ps)
p
l
MJo
ps
MJ
MJ
s n
udf(ps)
udf(o)
udf(p)
udf(o,l)
MJ
MJ
DynO plan
Up to 2x speedup(4x when applied to Hive)
TPCH Q9’: Impact of Pilot Runs
![Page 16: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/16.jpg)
16
• Collected statistics:• #tuples, min/max, #distinct values• add more if the optimizer can support them
• Statistics reusability• Optimization for selective (and expensive) predicates• Shortcomings:
• Non-local predicates• Non primary/foreign key joins• Join correlations
Pilot Runs: Details
Runtime adaptation of execution plans
![Page 17: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/17.jpg)
17
• Cost-based optimizer• Based on Columbia (top-down) optimizer• Focuses on join enumeration• Accurate statistics from pilot runs and/or previous executions• Bushy plans (intra-query parallelization)
• Online statistics collection• Re-optimization points (natural in MR)• Execution strategies: choosing leaf jobs
• Degree of parallelization, cost/uncertainty of jobs
Adaptation of Execution Plans
![Page 18: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/18.jpg)
18
sp Best left-deep hand-written Jaql plan
RJ l
RJ o
RJ c
RJ
n1
MJ n2
MJ r
MJ
o
RJ
s
RJ n2
MJ
RJ
l
RJ
p
udf(o,c)
r
MJ
c
MJ
n1
udf(o,c)
Best relational optimizer plan
TPCH Q8’: Impact of Execution Plan Adaptation
![Page 19: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/19.jpg)
19
o
s
RJ n2
MJ
RJ
l
RJ
p
r
MJ
c
MJ
n1
udf(o,c)
RJt1s
MJ n2
MJ
RJ
l
RJ
t1
p
t2
RJ s
RJ n2
MJ
p t2
t3
MJ
n2
MJ
s
t3Speedup up to 2x without any initial statistics(despite the added overhead)
TPCH Q8’: Impact of Execution Plan Adaptation
![Page 20: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/20.jpg)
20
Outline
• Introduction• System Architecture
• Pilot Runs• Adaptation of Execution Plans
• Experiments• Conclusion
![Page 21: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/21.jpg)
21
Experimental Setup
• 15-node cluster, 10 GbE• Each machine:
• 12-cores, 96 GB RAM (2GB to each MR slot), 12*2TB disks• 10 map/8 reduce slots
• Hadoop 1.1.1• ZooKeeper for coordination (in statistics collection)• TPCH data, SF = {100, 300, 1000}• TPCH queries (with additional UDFs)
![Page 22: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/22.jpg)
22
Execution times comparison
• At least as good as the best left-deep hand-written plans• Benefits from bushy plans (Q2)• Benefits from pilot runs due to many UDFs (Q9’)• Benefits from re-optimization due to UDF on join result (Q8’)• Biggest benefit is brought by the pilot runs
![Page 23: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/23.jpg)
23
Benefits of our Approach on Hive
• Similar performance trends with Jaql
• Bigger speedup (up to 4x) due to implementation of broadcast joins (Hive 0.12 exploits DistributedCache)
![Page 24: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/24.jpg)
24
Overhead of Dynamic Optimization
• Pilot runs overhead 2.5-6.5%
• Stats collection overhead 0.1-2.8%
• Overall overhead 7-10%
![Page 25: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/25.jpg)
25
Conclusion
• Pilot runs to account for UDFs• Dynamic adaptation of execution plans• Traditional optimizer for join ordering (bushy plans)• Online statistics collection (no need for initial statistics)• Execution strategies• At least as good plans as the left-deep hand-written ones
• Up to 2x faster (4x for Hive)• Applicability to other systems (e.g., Hive)
![Page 26: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/26.jpg)
26
Perspectives
• Broader range of applications (e.g., ML)• Other runtimes (e.g., Tez)• Adaptive operators• Extend optimizer to support grouping, ordering
![Page 27: Dynamically Optimizing Queries over Large Scale Data Platforms](https://reader038.vdocuments.us/reader038/viewer/2022103117/55b4a5c3bb61ebe6678b4571/html5/thumbnails/27.jpg)
Thank you!