stinger hadoop summit june 2013

Putting the Sting in Hive

Alan F. Gates

@alanfgates

© 2013 Hortonworks

Stinger Overview

• An initiative, not a project or product
• Includes changes to Hive and a new project, Tez
• Two main goals
  – Improve Hive performance 100x over Hive 0.10
  – Extend Hive SQL to include features needed for analytics
• Hive will support:
  – BI tools connecting to Hadoop
  – Analysts performing ad-hoc, interactive queries
  – Still excellent at the large batch jobs it is used for today


Stinger Mileposts

Stinger Phase 1 (Released in Hive 0.11)
• Base Optimizations
• SQL Analytics
• ORCFile Format

Stinger Phase 2 (Current Work)
• YARN Resource Mgmt
• Hive on Apache Tez
• Query Service
• Vectorized Operators

Stinger Phase 3 (Roadmap)
• Buffer Cache
• Cost Based Optimizer

1. Improve existing tools & preserve investments
2. Enable Hive to support interactive workloads


Hive Performance Gains in 0.11

• Enable star joins by improving Hive’s map join (aka broadcast join)
  – Where possible, do in a single map-only task
  – When not possible, push larger tables to separate tasks
• Collapse adjacent jobs where possible
  – Hive has lots of M->MR type plans; collapse these to MR
  – Collapse adjacent jobs on sufficiently similar keys when feasible
    – join followed by group
    – join followed by order
    – group followed by order
• Improvements in sort merge bucket (SMB) joins
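
As a rough illustration of the map join bullet above, a star join can be written so that the small dimension tables are broadcast to the mappers. This is a sketch, not the benchmark query: the table and column names follow TPC-DS conventions but are assumptions here, and the size threshold is an arbitrary example value. The three settings are existing Hive properties; check their defaults for your Hive version.

```sql
-- Let Hive convert joins against small tables into map joins automatically.
SET hive.auto.convert.join=true;
-- Allow several small-table joins to be merged into one map-only task.
SET hive.auto.convert.join.noconditionaltask=true;
-- Combined size, in bytes, of the small tables held in memory
-- (example value only; this number is an assumption).
SET hive.auto.convert.join.noconditionaltask.size=10000000;

-- Simplified star join: one large fact table (store_sales)
-- joined to three small dimension tables.
SELECT i.i_item_id,
       s.s_state,
       AVG(ss.ss_quantity)    AS avg_quantity,
       AVG(ss.ss_sales_price) AS avg_sales_price
FROM store_sales ss
JOIN date_dim d ON (ss.ss_sold_date_sk = d.d_date_sk)
JOIN store s    ON (ss.ss_store_sk = s.s_store_sk)
JOIN item i     ON (ss.ss_item_sk = i.i_item_sk)
WHERE d.d_year = 2002
GROUP BY i.i_item_id, s.s_state;
```

If the dimension tables together fit under the threshold, the joins can be done in the map phase and only the GROUP BY needs a reduce phase.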


Improvements in Map Joins

• TPC-DS Query 27, Scale=200, 10 EC2 nodes (40 disks)

[Chart: Query 27, y-axis 0 to 1600. Data labels in legend order: Text 1501.895; RCFile 1176.479; Partitioned RCFile 631.027; Partitioned RCFile + Optimizations 39.985]

[Figure: Before / After]


Improvements in SMB Joins

• TPC-DS Query 82, Scale=200, 10 EC2 nodes (40 disks)

[Chart: Query 82, y-axis 0 to 3500. Data labels in legend order: Text 3257.692; RCFile 2862.669; Partitioned RCFile 255.641; Partitioned RCFile + Optimizations 71.114]


New Technologies in Hive

• All covered in depth in other talks
  – See Owen’s, Eric’s, and Jitendra’s talk ORC File & Vectorization at 4:25 today
• Tez
  – A new execution engine for relational tools such as Hive
  – No need to use MapReduce; instead provides general DAG execution
  – Data moved between tasks via socket, disk, or HDFS based on performance / restartability trade-off
  – Provides standing service to greatly reduce query start time
• ORCFile
  – A rewrite of RCFile
  – Columnar
  – Tightly integrated with Hive’s type model, including support for nested types
  – Much better compression
  – Supports projection and filter push down
• Vectorization
  – Rewriting operators to take advantage of modern processors
  – Based on work done in MonetDB
  – Rewrite operators to radically reduce number of function calls, branch prediction misses, and cache misses
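
To make the ORCFile bullets concrete, here is a minimal sketch of storing a table as ORC and querying it. The tables sales_text and sales_orc and their columns are hypothetical, chosen only for illustration; STORED AS ORC and the orc.compress table property are the parts that come from Hive itself.

```sql
-- Hypothetical ORC-backed table; names and types are illustrative only.
CREATE TABLE sales_orc (
  salesperson STRING,
  region      STRING,
  salesprice  DECIMAL,
  sale_date   STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB");

-- Copy rows from an existing (hypothetical) text-format table into the ORC table.
INSERT OVERWRITE TABLE sales_orc
SELECT salesperson, region, salesprice, sale_date
FROM sales_text;

-- This query references only three of the four columns, so a columnar
-- format lets the reader skip the salesperson column entirely.
SELECT region, SUM(salesprice) AS total_sales
FROM sales_orc
WHERE sale_date >= '2013-01-01'
GROUP BY region;
```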


Standard Queries

[Chart: Query 27 (star join) and Query 82 (fact table join), each at Scale 200 and Scale 1000, y-axis 0 to 300. Series: Hive 0.10, RC File; Hive 0.11 CP, RC File; Hive 0.11 CP, ORC File]


Performance Trajectory

[Chart: Query 27 Speedup, y-axis 0X to 25X. Categories: Hive 10 Text; Hive 10 RC; Hive 11 RC; Hive 11 ORC; Hive 11 CP ORC, Tez, Vectorization. Data labels in extraction order: 1X, 2X, 12X, 11X, 21X]

[Chart: Query 82 Speedup, y-axis 0X to 90X. Categories: Hive 10 Text; Hive 10 RC; Hive 11 RC; Hive 11 ORC; Hive 11 CP ORC, Tez. Data labels in extraction order: 1X, 14X, 44X, 57X, 78X]


Query 12 – Demonstrating MRR

Query 12 - MRR Optimization: Elapsed Time (seconds)

                          RC File     ORC File    RC File      ORC File
                          Scale 200   Scale 200   Scale 1000   Scale 1000
Traditional Map-Reduce    55          54          75           65
Tez Map-Reduce-Reduce     35          34          55           46


Hive Performance Up Next

• Push down startup time: even for queries that spend less than a second running on the cluster, there is ~15 seconds of startup time
  – Tez service will remove Hadoop startup issues
  – Need to reduce time for the metadata access
  – Need intelligent file caching so that hot tables can be kept in memory
• Keep working on the optimizer
  – YSmart work from Ohio State University
  – Start using statistics to make intelligent decisions about how many mappers and reducers to spawn (maybe in Hive, maybe in Tez)
  – Start using statistics to choose between competing plan options
• Buffer Cache
  – Coordinate with HDFS team to determine caching strategy


Extending Hive SQL in 0.11

• DECIMAL data type – for fixed precision calculation (e.g. currency)
• OVER clause
  – PARTITION BY, ORDER BY, ROWS BETWEEN/FOLLOWING/PRECEDING
  – Works with existing aggregate functions
  – New analytic and window functions added
    – ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG, FIRST_VALUE, LAST_VALUE, NTILE, CUME_DIST, PERCENT_RANK

SELECT salesperson, AVG(salesprice) OVER (PARTITION BY region ORDER BY date
  ROWS BETWEEN 10 PRECEDING AND 10 FOLLOWING)
FROM sales;
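
The new analytic functions listed above use the same OVER clause as the windowed AVG. A minimal sketch with RANK and LAG follows, assuming a sales table like the one in the query above. The date column is called sale_date here, which is an assumption made to avoid a possible clash with the DATE keyword in later Hive releases.

```sql
-- Rank each sale within its region by price, highest first, and show
-- the same salesperson's previous sale price alongside the current one.
SELECT salesperson,
       region,
       salesprice,
       RANK() OVER (PARTITION BY region ORDER BY salesprice DESC) AS price_rank,
       LAG(salesprice, 1) OVER (PARTITION BY salesperson ORDER BY sale_date) AS previous_price
FROM sales;
```

Each OVER clause defines its own partitioning and ordering, so one SELECT can mix a per-region ranking with a per-salesperson lookback. LAG returns NULL for a salesperson's first row, since there is no earlier sale to look at.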


Extending Hive SQL Post 0.11

• Subqueries in WHERE
  – Non-correlated first
  – [NOT] IN first, then extend to (in)equalities and EXISTS
• Datatype conformance
  – Hive has a Java type model; add support for SQL types:
    – DATE
    – CHAR() and VARCHAR()
    – Add precision and scale to decimal and float
    – Aliases for standard SQL types (BLOB = binary, CLOB = string, integer = int, real/number = decimal)
• Security
  – Add security checks to views, indices, functions, etc.
  – Secure GRANT and REVOKE
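
To show what the subquery bullet above is aiming for, here is a sketch of a non-correlated IN filter written two ways. The top_regions table is hypothetical, and the sales columns are assumed to match the earlier windowing example. The first form uses LEFT SEMI JOIN, which Hive already supports; the second is the standard SQL form that the planned work would allow.

```sql
-- Existing Hive idiom: express the IN test as a LEFT SEMI JOIN.
-- Only columns from the left-hand table may appear in the SELECT list.
SELECT s.salesperson, s.salesprice
FROM sales s
LEFT SEMI JOIN top_regions t ON (s.region = t.region);

-- Standard SQL form that a non-correlated IN subquery in WHERE would permit.
SELECT s.salesperson, s.salesprice
FROM sales s
WHERE s.region IN (SELECT t.region FROM top_regions t);
```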
