Apache Hadoop YARN - The Future of Data Processing with Hadoop

Upload: hortonworks

Posted: 27-Jan-2015

Category: Technology

DESCRIPTION

A dive into new data processing capabilities in Hadoop 2.0 with Apache Hadoop YARN, Apache Tez and Apache Hive

TRANSCRIPT

Page 1: Apache Hadoop YARN - The Future of Data Processing with Hadoop

Apache Hadoop YARN and Tez: The Future of Data Processing with Hadoop

Chris Harris [email protected]

Page 2

Agenda


• Overview

• Status Quo

• Architecture

• Improvements and Updates

• Tez

Page 3

Hadoop MapReduce Classic

•  JobTracker
   –  Manages cluster resources and job scheduling

•  TaskTracker
   –  Per-node agent
   –  Manages tasks

Page 4

Current Limitations

•  Scalability
   –  Maximum cluster size – 4,000 nodes
   –  Maximum concurrent tasks – 40,000
   –  Coarse synchronization in JobTracker

•  Single point of failure
   –  Failure kills all queued and running jobs
   –  Jobs need to be re-submitted by users

•  Restart is very tricky due to complex state

Page 5

Current Limitations

•  Hard partition of resources into map and reduce slots
   –  Low resource utilization

•  Lacks support for alternate paradigms
   –  Iterative applications implemented using MapReduce are 10x slower
   –  Hacks for the likes of MPI/graph processing

•  Lack of wire-compatible protocols
   –  Client and cluster must be of the same version
   –  Applications and workflows cannot migrate to different clusters

Page 6

Requirements

•  Reliability

•  Availability

•  Utilization

•  Wire Compatibility

•  Agility & Evolution – ability for customers to control upgrades to the grid software stack

•  Scalability – clusters of 6,000-10,000 machines
   –  Each machine with 16 cores, 48G/96G RAM, 24TB/36TB disks
   –  100,000+ concurrent tasks
   –  10,000 concurrent jobs

Page 7

Design Centre

•  Split up the two major functions of the JobTracker
   –  Cluster resource management
   –  Application life-cycle management

•  MapReduce becomes a user-land library

Page 8

Concepts

•  Application
   –  An application is a job submitted to the framework
   –  Example: MapReduce job

•  Container
   –  Basic unit of allocation
   –  Example: container A = 2GB, 1 CPU
   –  Fine-grained resource allocation
   –  Replaces the fixed map/reduce slots
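The container model above can be sketched as a toy allocator. This is illustrative only — the class and method names are invented for the example, not YARN's Java API:

```python
# Toy sketch of fine-grained container allocation (illustrative only —
# these class and method names are invented, not YARN's actual API).

class Container:
    """Basic unit of allocation, e.g. container A = 2GB, 1 CPU."""
    def __init__(self, memory_gb, vcores):
        self.memory_gb = memory_gb
        self.vcores = vcores

class NodeResources:
    """Free resources on one node; any mix of requests can be granted,
    unlike the fixed map/reduce slots of classic MapReduce."""
    def __init__(self, memory_gb, vcores):
        self.free_memory_gb = memory_gb
        self.free_vcores = vcores

    def try_allocate(self, memory_gb, vcores):
        if memory_gb <= self.free_memory_gb and vcores <= self.free_vcores:
            self.free_memory_gb -= memory_gb
            self.free_vcores -= vcores
            return Container(memory_gb, vcores)
        return None  # not enough headroom on this node

node = NodeResources(memory_gb=48, vcores=16)   # one slide-6-class machine
c = node.try_allocate(memory_gb=2, vcores=1)    # "container A = 2GB, 1 CPU"
print(node.free_memory_gb, node.free_vcores)    # 46 15
```

The point of the sketch: a request is just a (memory, cores) pair checked against free capacity, so a node can hold any mix of large and small containers rather than a fixed number of map slots and reduce slots.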

Page 9

Architecture

•  ResourceManager
   –  Global resource scheduler
   –  Hierarchical queues

•  NodeManager
   –  Per-machine agent
   –  Manages the life-cycle of containers
   –  Container resource monitoring

•  ApplicationMaster
   –  Per-application
   –  Manages application scheduling and task execution
   –  E.g. the MapReduce ApplicationMaster

Page 10

Architecture

[Diagram: clients submit jobs to the ResourceManager; a per-application App Master is launched in a container on a NodeManager and sends resource requests to the ResourceManager; NodeManagers send node status to the ResourceManager and run the granted containers; running containers report MapReduce status back to their App Master.]

Page 11

Improvements vis-à-vis classic MapReduce


•  Utilization
   –  Generic resource model
      •  Data locality (node, rack, etc.)
      •  Memory
      •  CPU
      •  Disk b/w
      •  Network b/w
   –  Removes the fixed partition of map and reduce slots

•  Scalability
   –  Application life-cycle management is very expensive
   –  Partition resource management and application life-cycle management
   –  Application management is distributed
   –  Hardware trends – currently run clusters of 4,000 machines
      •  6,000 2012 machines > 12,000 2009 machines
      •  <16+ cores, 48/96G, 24TB> vs <8 cores, 16G, 4TB>

Page 12

Improvements vis-à-vis classic MapReduce

•  Fault tolerance and availability
   –  ResourceManager
      •  No single point of failure – state saved in ZooKeeper (coming soon)
      •  Application Masters are restarted automatically on RM restart
   –  Application Master
      •  Optional failover via application-specific checkpoint
      •  MapReduce applications pick up where they left off via state saved in HDFS

•  Wire compatibility
   –  Protocols are wire-compatible
   –  Old clients can talk to new servers
   –  Rolling upgrades

Page 13

Improvements vis-à-vis classic MapReduce

•  Innovation and agility
   –  MapReduce now becomes a user-land library
   –  Multiple versions of MapReduce can run in the same cluster (a la Apache Pig)
      •  Faster deployment cycles for improvements
   –  Customers upgrade MapReduce versions on their schedule
   –  Users can customize MapReduce

Page 14

© Hortonworks Inc. 2013

To Talk to Tez, You Have to First Talk YARN

•  New processing layer in Hadoop 2.0 that decouples resource management from application management

•  Created to manage resource needs across all uses

•  Ensures predictable performance & QoS for all apps

•  Enables apps to run “IN” Hadoop rather than “ON” it
   –  Key to leveraging all other common services of the Hadoop platform: security, data lifecycle management, etc.

Applications Run Natively IN Hadoop

HDFS2 (redundant, reliable storage) → YARN (cluster resource management) → applications: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), OTHER (Search, Weave, …)

Page 15


Tez: High Throughput and Low Latency

Tez runs in YARN, accelerating both high-throughput AND low-latency processing.

Page 16


Tez - Core Idea

Task with pluggable Input, Processor & Output

A YARN ApplicationMaster runs a DAG of Tez tasks.

Tez Task = <Input, Processor, Output>
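The <Input, Processor, Output> shape can be modeled in a few lines. This is a hypothetical sketch, not the actual Tez Java interfaces — all class names here are invented:

```python
# Toy model of the Tez task shape: a Task is just <Input, Processor,
# Output>, each part pluggable. Class names are invented for the sketch.

class ListInput:
    """Pluggable Input: here, just an in-memory list."""
    def __init__(self, data):
        self.data = data
    def read(self):
        return iter(self.data)

class UpperProcessor:
    """Pluggable Processor: transforms the stream of records."""
    def process(self, records):
        return (r.upper() for r in records)

class ListOutput:
    """Pluggable Output: collects results into a sink."""
    def __init__(self):
        self.sink = []
    def write(self, records):
        self.sink.extend(records)

class Task:
    """Wires one Input, one Processor and one Output together."""
    def __init__(self, inp, processor, out):
        self.inp, self.processor, self.out = inp, processor, out
    def run(self):
        self.out.write(self.processor.process(self.inp.read()))

task = Task(ListInput(["a", "b"]), UpperProcessor(), ListOutput())
task.run()
print(task.out.sink)  # ['A', 'B']
```

Because the three parts are independent, swapping the Input for a shuffle reader or the Output for a sorted-partition writer changes the task's role in the DAG without touching the processing logic — that is the reuse the slide is pointing at.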

Page 17


Tez Hive Performance

•  Low-level data-processing execution engine on YARN

•  Base for MapReduce, Hive, Pig, Cascading, etc.

•  Re-usable data-processing primitives (e.g. sort, merge, intermediate data management)

•  All Hive SQL can be expressed as a single job
   –  Jobs are no longer interrupted (efficient pipeline)
   –  Avoid writing intermediate output to HDFS when performance outweighs job re-start (speed and network/disk usage savings)
   –  Break the MR contract to turn MRMRMR into MRRR (flexible DAG)

•  Removes task and job overhead (10s of savings is huge for a 2s query!)
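The MRMRMR-to-MRRR point can be illustrated with a toy pipeline model. The function names and the write counter are invented for illustration — real Tez expresses this as a DAG of tasks — but the accounting is the same: chained MR jobs persist every intermediate result, a fused DAG persists only the final one:

```python
# Toy model of MRMRMR -> MRRR: classic MapReduce writes each job's
# output to HDFS, while a fused DAG streams data between stages.

def run_as_separate_jobs(data, stages):
    """Each stage is a separate job: its output is persisted."""
    hdfs_writes = 0
    for stage in stages:
        data = stage(data)
        hdfs_writes += 1          # every job writes its output to HDFS
    return data, hdfs_writes

def run_as_dag(data, stages):
    """All stages fused into one job: intermediates stay in the pipeline."""
    for stage in stages:
        data = stage(data)        # handed directly to the next stage
    return data, 1                # only the final result is written

# Three "reduce-like" stages standing in for the R-R-R of a fused job.
stages = [sorted, lambda xs: [x for x in xs if x > 1], sum]
print(run_as_separate_jobs([3, 1, 2], stages))  # (5, 3)
print(run_as_dag([3, 1, 2], stages))            # (5, 1)
```

Same answer, one HDFS write instead of three — which is where the "speed and network/disk usage savings" above come from.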

Page 18


Tez Service

•  MR query startup is expensive
   –  Job-launch & task-launch latencies are fatal for short queries (on the order of 5s to 30s)

•  Tez Service
   –  An always-on pool of Application Masters
   –  Hive (and soon others, like Pig) jobs are executed on an Application Master in the pool instead of starting a new Application Master (saving precious seconds)
   –  Container pre-allocation / warm containers

•  In summary…
   –  Removes job-launch overhead (Application Master pool)
   –  Removes task-launch overhead (pre-warmed containers)
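The always-on pool idea can be sketched as follows. This is an illustrative model, not the Tez Service API; the startup-cost constant is an assumption standing in for the 5s-30s launch latencies quoted above:

```python
# Illustrative sketch of an always-on Application Master pool: the
# startup cost is paid once at pool creation, not once per query.

import queue

STARTUP_COST_S = 10   # assumed cold-start overhead per Application Master

class WarmPool:
    def __init__(self, size):
        self.pool = queue.Queue()
        for i in range(size):
            # Pre-start the masters; their startup cost is paid up front.
            self.pool.put(f"am-{i}")

    def submit(self, job):
        am = self.pool.get()              # grab a warm master: zero startup
        try:
            return f"{job} ran on {am}", 0
        finally:
            self.pool.put(am)             # hand it back for the next query

pool = WarmPool(size=2)
result, overhead = pool.submit("hive-query-27")
print(result, overhead)
```

A cold start would add `STARTUP_COST_S` to every query; with the pool, each submit pays zero launch overhead, which is exactly the "saving precious seconds" claim for short interactive queries.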

Page 19


Stinger Initiative: Making Apache Hive 100X Faster


[Roadmap diagram spanning "Current", "3 – 9 Months" and "9 – 18 Months", across the Hadoop and Hive layers:]

•  Base Optimizations – generate simplified DAGs, in-memory hash joins
•  Deep Analytics – SQL-compatible types, SQL-compatible windowing, more SQL subqueries
•  YARN – next-gen Hadoop data processing framework
•  Tez – express data processing tasks more simply, eliminate disk writes
•  Hive Tez Service – pre-warmed containers, low-latency dispatch
•  ORCFile – column store, high compression, predicate/filter pushdowns
•  Buffer Caching – cache accessed data, optimized for vector engine
•  Query Planner – intelligent cost-based optimizer
•  Vector Query Engine – optimized for modern processor architectures

Page 20


Demo Benchmark Spec

•  The TPC-DS benchmark data + query set

•  Query 27 – a big table (store_sales) joins lots of small tables, a.k.a. a star-schema join

•  What does Query 27 do? For all items sold in stores located in specified states during a given year, find the average quantity, average list price, average list sales price and average coupon amount for a given gender, marital status, education and customer demographic.

Page 21


SELECT col5, avg(col6)
FROM store_sales_fact ssf
  JOIN item_dim ON (ssf.col1 = item_dim.col1)
  JOIN date_dim ON (ssf.col2 = date_dim.col2)
  JOIN custdmgrphcs_dim ON (ssf.col3 = custdmgrphcs_dim.col3)
  JOIN store_dim ON (ssf.col4 = store_dim.col4)
GROUP BY col5
ORDER BY col5
LIMIT 100;

Query 27 – Star Schema Join

•  Derived from TPC-DS Query 27

•  Table sizes (from the slide diagram): fact table 41 GB; dimension tables 58 MB, 11 MB, 80 MB and 106 KB

Page 22


Benchmark Cluster Specs

•  6-node HDP 1.3 cluster
   –  2 master nodes
   –  4 data/slave nodes

•  Slave node specs
   –  Dual processors
   –  14 GB

Page 23


Query 27 Execution Before Hive 11 – Text Format

The query spawned 5 MR jobs; the intermediate output of each job is written to HDFS.

Job 1 of 5 – mappers stream the fact table and the first dimension table, sorted by join key; reducers do the join operation between the two tables.

Job 2 of 5 – mappers and reducers join the output of job 1 and the second dimension table.

The last job performs the order-by and group operation.

Query response time: 21 minutes; 149 total mappers were executed.

Page 24


Query 27 Execution With Hive 11 – Text Format

The query spawned 1 job with Hive 11, compared to 5 MR jobs with Hive 10.

Job 1 of 1 – each mapper loads the 4 small dimension tables into memory and streams parts of the large fact table. Joins then occur in the mapper, hence the name MapJoin.

Performance increased with Hive 11: query time went down from 21 minutes to about 4 minutes.
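The MapJoin strategy described above — hold the small tables in memory as hash maps and stream the big one — is the classic broadcast hash join. A minimal sketch (illustrative, not Hive's implementation; the table and column names are invented):

```python
# Sketch of a MapJoin / broadcast hash join: build a hash index over
# the small dimension table once, then probe it while streaming the
# large fact table. No shuffle, no reduce phase.

def map_join(fact_rows, dim_rows, fact_key, dim_key):
    # Build phase: hash the small table (it fits in mapper memory).
    dim_index = {row[dim_key]: row for row in dim_rows}
    # Probe phase: stream the big table, one dict lookup per row.
    for row in fact_rows:
        match = dim_index.get(row[fact_key])
        if match is not None:
            yield {**row, **match}

store_sales = [{"item_id": 1, "qty": 3}, {"item_id": 2, "qty": 5}]
items = [{"item_id": 1, "brand": "acme"}]
print(list(map_join(store_sales, items, "item_id", "item_id")))
```

With 4 small dimension tables, each mapper simply builds 4 such indexes and probes all of them per fact row — which is why the 5 shuffle-heavy MR jobs collapse into one map-only pass.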

Page 25


Query 27 Execution With Hive 11 – RC Format

Conversion from the Text to the RC file format decreased the size of the data set from 38 GB to 8.21 GB.

A smaller file means less IO, so the query time decreased from 246 seconds to 136 seconds.

Page 26


Query 27 Execution With Hive 11 – ORC Format

The ORC file type consolidates data more tightly than RCFile: the size of the data set decreased from 8.21 GB to 2.83 GB.

A smaller file means less IO, so the query time decreased from 136 seconds to 104 seconds.

Page 27


Query 27 Execution With Hive 11 – RC Format, Partitioned, Bucketed and Sorted

Partitioning the data decreases the file input size to 1.73 GB.

A smaller file means less IO, so the query time decreased from 136 seconds to 104 seconds.

Page 28


Query 27 Execution With Hive 11 – ORC Format, Partitioned, Bucketed and Sorted

Partitioned data with the ORC file format produces the smallest input size, 687 MB, down from 43 GB.

The smallest input size gives the fastest response time for the query: 79 seconds.

Page 29


Summary of Results

File Type                               MR Jobs   Input Size   Mappers   Time
Text / Hive 10                          5         43.1 GB      179       21 minutes
Text / Hive 11                          1         38 GB        151       246 seconds
RC / Hive 11                            1         8.21 GB      76        136 seconds
ORC / Hive 11                           1         2.83 GB      38        104 seconds
RC / Hive 11 / Partitioned, Bucketed    1         1.73 GB      19        104 seconds
ORC / Hive 11 / Partitioned, Bucketed   1         687 MB       27        79.62 seconds
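Working the table's endpoints: the demo goes from 21 minutes with Text / Hive 10 down to 79.62 seconds with ORC / Hive 11, partitioned and bucketed:

```python
# Overall speedup implied by the summary table above.

baseline_s = 21 * 60            # Text / Hive 10: 21 minutes in seconds
best_s = 79.62                  # ORC / Hive 11 / partitioned / bucketed
speedup = baseline_s / best_s
print(round(speedup, 1))        # ~15.8x faster on this 6-node demo cluster
```

So the demo alone shows roughly a 16x improvement from file format and layout changes; the remaining distance to the Stinger "100x" goal is what the Tez, buffer-caching and vectorized-engine work on the roadmap is meant to cover.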

Page 30

Resources


hadoop-2.0.3-alpha: http://hadoop.apache.org/common/releases.html

Release Documentation: http://hadoop.apache.org/common/docs/r2.0.3-alpha/

Blogs:
http://hortonworks.com/blog/category/apache-hadoop/yarn/
http://hortonworks.com/blog/introducing-apache-hadoop-yarn/
http://hortonworks.com/blog/apache-hadoop-yarn-background-and-an-overview/

Page 31

© Hortonworks Inc. 2012

Thank You! Questions & Answers
