What's new in Hadoop YARN - Dec 2014

Upload: inmobi-technology

Post on 14-Jul-2015

Category: Technology

TRANSCRIPT

Page 1: What's new in Hadoop Yarn- Dec 2014

Page 1 © Hortonworks Inc. 2014

What’s new in YARN

Hortonworks. We do Hadoop.

Page 2: What's new in Hadoop Yarn- Dec 2014

Speaker

Varun Vasudev

Hortonworks

Works on Hadoop YARN

Page 3: What's new in Hadoop Yarn- Dec 2014

Agenda

• Overview of YARN

• New YARN Innovation in Hadoop 2.6

– Rolling upgrades

– Added fault tolerance

– CPU scheduling in Capacity Scheduler

– CGroup isolation

– Node labels

– Support for long running services

• Q & A

Page 4: What's new in Hadoop Yarn- Dec 2014

Overview of YARN

Page 5: What's new in Hadoop Yarn- Dec 2014

What is YARN?

[Diagram] YARN : Data Operating System, spanning cluster nodes 1 … N

Workloads shown on YARN:

• Batch (MapReduce)

• Batch & Interactive (Tez)

• Real-Time (Slider)

• Direct (Java, .NET)

• Scripting (Pig)

• SQL (Hive)

• Cascading (Java, Scala)

• NoSQL (HBase, Accumulo)

• Stream (Storm)

• Others (Spark)

• Other ISV

Page 6: What's new in Hadoop Yarn- Dec 2014

What’s new in Hadoop YARN 2.6?

Page 7: What's new in Hadoop Yarn- Dec 2014

YARN in Hadoop 2.6: What’s New

Theme: Long Running Services Support

• Security: Kerberos Token Renewal

• Log Aggregation: View logs for running applications

• Fault Tolerance: AM Retry and Container keep alive

• Service Registry: Directory for services running in YARN

Theme: Workload Scheduling

• CPU Scheduling in Capacity Scheduler

• CPU Isolation through CGroups

• Node Labels for scheduling constraints

Theme: Reliable and Secure Operations

• YARN Rolling Upgrades support

• Work Preserving Restart

• Timeline Server support in Secure clusters

Page 8: What's new in Hadoop Yarn- Dec 2014

New in Hadoop YARN 2.6:

Long Running Service Support

Page 9: What's new in Hadoop Yarn- Dec 2014

YARN Supports Multiple Workloads

[Diagram] YARN : Data Operating System, on top of HDFS (Hadoop Distributed File System), spanning cluster nodes 1 … N

Applications shown on YARN:

• Batch (MapReduce)

• Batch & Interactive (Tez)

• Real-Time (Slider)

• Direct (Java, .NET)

• Scripting (Pig)

• SQL (Hive)

• Cascading (Java, Scala)

• NoSQL (HBase, Accumulo)

• Stream (Storm)

• Others (Spark)

• Other ISV

Page 10: What's new in Hadoop Yarn- Dec 2014

Enhancements to Support Long Running Services

Security Capability

• YARN-941: YARN updates Kerberos token for a Long Running Service after the token expires

Log Aggregation Capability

• YARN-2468: Aggregate and capture logs during the lifetime of a Long Running Service

Fault Tolerance Capability

• YARN-1489: When ApplicationMaster (AM) restarts, do not kill all associated containers – reconnect to the AM

• YARN-611/YARN-614: Tolerate AM failures for Long Running Services

Service Registry Capability

• YARN-913: Service Registry that publishes host and ports that each service comes up on
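
One way to tolerate AM failures in a service that runs for weeks is to count only recent failures against the retry limit, so that an occasional crash does not eventually exhaust the attempts. The sketch below illustrates that sliding-window idea only; it is not taken from the YARN-611/YARN-614 patches, and the function name, parameters, and timestamps are hypothetical.

```python
from typing import List


def should_restart_am(failure_times: List[float], now: float,
                      max_attempts: int, validity_interval: float) -> bool:
    """Return True if the AM may be restarted.

    Only failures that happened within `validity_interval` seconds of `now`
    count against `max_attempts` (hypothetical policy for illustration).
    """
    recent = [t for t in failure_times if now - t <= validity_interval]
    return len(recent) < max_attempts


# Hypothetical history: two early failures, then one roughly a day later.
failures = [0.0, 100.0, 90_000.0]

# With a one-hour window only the latest failure counts, so a restart is allowed.
with_window = should_restart_am(failures, now=90_000.0,
                                max_attempts=2, validity_interval=3_600.0)

# With an unbounded window all three failures count and the limit is exceeded.
without_window = should_restart_am(failures, now=90_000.0,
                                   max_attempts=2,
                                   validity_interval=float("inf"))
print(with_window, without_window)
```

For a batch job a lifetime retry limit is usually enough; for a long running service a windowed count is the more forgiving choice.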

Page 11: What's new in Hadoop Yarn- Dec 2014

New in Hadoop YARN 2.6:

Workload Scheduling

Page 12: What's new in Hadoop Yarn- Dec 2014

Node Labels: Apply Node Constraints

[Diagram] Apps A and B, nodes, and node labels (L1): labels are used to deploy/allocate an app onto labeled nodes and to isolate an app on its own set of nodes
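
The idea of a node constraint can be sketched as a simple filter: a request that names a label is only eligible for nodes carrying that label. This is a toy model under stated assumptions, not YARN's scheduler code. The node names, the one-label-per-node mapping, and the rule that unlabeled requests go to unlabeled nodes are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def eligible_nodes(node_labels: Dict[str, Optional[str]],
                   requested_label: Optional[str]) -> List[str]:
    """Return the nodes whose label matches the request exactly.

    A request with no label (None) matches only unlabeled nodes
    (assumption for this toy model).
    """
    return sorted(node for node, label in node_labels.items()
                  if label == requested_label)


# Hypothetical cluster: three nodes labeled L1, two unlabeled.
cluster = {"n1": "L1", "n2": "L1", "n3": "L1", "n4": None, "n5": None}

app_a_nodes = eligible_nodes(cluster, "L1")   # app A asks for label L1
app_b_nodes = eligible_nodes(cluster, None)   # app B asks for no label
print(app_a_nodes, app_b_nodes)
```

In this model the two apps never share a node, which is the isolation effect the slide points at.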

Page 13: What's new in Hadoop Yarn- Dec 2014

CPU Scheduling in Capacity Scheduler

What

• Admin tells YARN how much CPU capacity is available in cluster

• Applications specify CPU capacity needed for each container

• YARN Capacity Scheduler schedules applications, taking available CPU capacity into account

Why

• Applications (for example Storm, HBase, Machine Learning) need predictable access to CPU as a resource

• CPU has become the bottleneck instead of memory in certain clusters (128 GB RAM, 6 CPUs)
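
A small arithmetic sketch shows why CPU can be the limiting resource on a node like the one above. The node size (128 GB RAM, 6 CPUs) comes from the slide; the container request of 4 GB and 1 vcore is an assumed value, and the two functions are a toy model rather than the Capacity Scheduler's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int


def containers_by_memory_only(node: Resource, request: Resource) -> int:
    """Containers that fit when the scheduler looks only at memory."""
    return node.memory_mb // request.memory_mb


def containers_by_memory_and_cpu(node: Resource, request: Resource) -> int:
    """Containers that fit when both memory and CPU must be available."""
    return min(node.memory_mb // request.memory_mb,
               node.vcores // request.vcores)


# Node sized like the slide's example; the request size is an assumption.
node = Resource(memory_mb=128 * 1024, vcores=6)
request = Resource(memory_mb=4 * 1024, vcores=1)

memory_only = containers_by_memory_only(node, request)
memory_and_cpu = containers_by_memory_and_cpu(node, request)
print(memory_only, memory_and_cpu)  # 32 6
```

Under these assumptions a memory-only view would place 32 containers on 6 CPUs, while a CPU-aware view stops at 6.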

Page 14: What's new in Hadoop Yarn- Dec 2014

CGroup Isolation

What

• Admin enables CGroups for CPU Isolation for all YARN application workloads

Why

• Applications need guaranteed access to CPU resources

• To ensure SLAs, need to enforce CPU allocations given to an Application container
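
Enforcement through CGroups can be pictured with CPU weights: in the Linux cgroup v1 cpu controller, cpu.shares is a relative weight, and contended CPU time is divided in proportion to it. The sketch below assumes, purely for illustration, 1024 shares per requested vcore and a fully contended node; the container names are hypothetical, and hard limits and idle capacity are ignored.

```python
from typing import Dict

SHARES_PER_VCORE = 1024  # assumed weight per vcore, for illustration only


def cpu_shares(vcores: int) -> int:
    """Relative CPU weight for a container that requested `vcores`."""
    return vcores * SHARES_PER_VCORE


def contended_cpu_fraction(allocations: Dict[str, int]) -> Dict[str, float]:
    """Fraction of CPU time each container gets when all are busy."""
    shares = {name: cpu_shares(v) for name, v in allocations.items()}
    total = sum(shares.values())
    return {name: s / total for name, s in shares.items()}


# Hypothetical containers on one node and the vcores they were allocated.
fractions = contended_cpu_fraction({
    "storm_worker": 2,
    "mapreduce_task": 1,
    "hbase_regionserver": 1,
})
print(fractions)
```

Without such weights a busy neighbour could take more CPU than it was allocated; with them, each container's share under contention follows its allocation.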

Page 15: What's new in Hadoop Yarn- Dec 2014

New in Hadoop YARN 2.6:

Reliable and Secure Operations

Page 16: What's new in Hadoop Yarn- Dec 2014

YARN Rolling Upgrades Support

• No Service Disruption

– Highly Available Resource Manager

– Job submission automatically retries during failover

• No Service Degradation

– Resource Manager Restart with Work Preservation

– Node Manager Restart with Container Preservation

• Preserve State

– Preserve Application Queue

– Preserve Application History

– Preserve Node/Container state
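
The "job submission automatically retries during failover" point can be sketched as a client loop that tries each Resource Manager in turn until the active one accepts the job. This is a generic illustration, not the Hadoop client's implementation: `NotActiveError`, `submit_with_failover`, the host names, and the fake submit function are all hypothetical, and a real client would also wait between rounds.

```python
from typing import Callable, Optional, Sequence


class NotActiveError(Exception):
    """Raised by a (hypothetical) Resource Manager that is not active."""


def submit_with_failover(submit: Callable[[str], str],
                         rm_addresses: Sequence[str],
                         max_rounds: int = 3) -> str:
    """Try every Resource Manager address, for up to `max_rounds` passes."""
    last_error: Optional[Exception] = None
    for _ in range(max_rounds):
        for address in rm_addresses:
            try:
                return submit(address)
            except NotActiveError as error:
                last_error = error  # remember the failure and try the next RM
    raise RuntimeError("no active Resource Manager found") from last_error


def fake_submit(address: str) -> str:
    """Stand-in for a real submission: only rm2 is active in this example."""
    if address != "rm2.example.com":
        raise NotActiveError(address)
    return "job accepted by " + address


result = submit_with_failover(fake_submit,
                              ["rm1.example.com", "rm2.example.com"])
print(result)  # job accepted by rm2.example.com
```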

Page 17: What's new in Hadoop Yarn- Dec 2014

Secure Timeline Server Support

[Diagram] Ambari, custom app monitoring, and other clients connect to the App Timeline Server over HTTPS

Page 18: What's new in Hadoop Yarn- Dec 2014

Q & A