microsoft - usenix...hadoop mr, centralized sched. mr 2003 2008 2013 2019 scope, centralized sched....

23
Carlo Curino, Subru Krishnan, Konstantinos Karanasos, Sriram Rao, Giovanni M. Fumarola, Botong Huang, Kishore Chaliparambil, Arun Suresh, Young Chen, Solom Heddaya, Roni Burd, Sarvesh Sakalanaga, Chris Douglas, Bill Ramsey, and Raghu Ramakrishnan Microsoft

Upload: others

Post on 16-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

Carlo Curino, Subru Krishnan, Konstantinos Karanasos, Sriram Rao, Giovanni

M. Fumarola, Botong Huang, Kishore Chaliparambil, Arun Suresh, Young

Chen, Solom Heddaya, Roni Burd, Sarvesh Sakalanaga, Chris Douglas, Bill

Ramsey, and Raghu Ramakrishnan

Microsoft

Page 2: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

Hadoop MR,

Centralized sched.

MR

201920132003 2008

Scope,

Centralized sched.

Scope

[eurosys07, vldb08]

Hydra

MR

YARN

Tez Spark

+multi-framework

+security

+scheduler expressivity[socc13]

Scope1 Scope2 Scope3

Distributed sched.

+tooling/optimizer,

+scale, +high utilization[osdi14]

Page 3: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

Hadoop MR,

Centralized sched.

MR

201920132003 2008

Scope,

Centralized sched.

Scope

[eurosys07, vldb08]

Hydra

MR

YARN

Tez Spark

+multi-framework

+security

+scheduler expressivity[socc13]

Scope1 Scope2 Scope3

Distributed sched.

+tooling/optimizer,

+scale, +high utilization[osdi14]>99% tenants migrated

>250K servers

>500k daily jobs

>1 Zetabyte data processed

>1 Trillion tasks scheduled

Page 4: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

1. Support multiple application frameworks [socc13]

2. Simplify writing new app frameworks [vldb14,sigmod15,tocs17]

3. Achieve good ROI, i.e., high CPU utilization [atc15, eurosys16]

4. Scale to large clusters, many jobs, large jobs [nsdi19]

OSS +

pro

du

ctio

n!

Page 5: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

THE SCALE/UTILIZATION CHALLENGE…

Cluster(s):

> 50K nodes

Job(s):

>2M tasks, >5PB input

Tasks:

10 sec 50th %ile

Utilization:

~60% avg CPU util

Scheduler:

>70K QPS

Page 6: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

SCALE, UTILIZATION

SCHEDULING CONTROL AND MULTI-FRAMEWORK

WANT IT ALL?

Page 7: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

RM

sub-cluster 1

RM

sub-cluster 2

Ro

ute

rR

ou

ter

Ro

ute

rR

ou

ter

StateStore

ProxyStateStore

ProxyStateStore

Proxy

State

Store

Page 8: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

NM

RM

sub-cluster 1

RM

sub-cluster 2

Ro

ute

rR

ou

ter

Ro

ute

rR

ou

ter

StateStore

ProxyStateStore

ProxyStateStore

Proxy

State

Store

Page 9: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

AMRM

Proxy

AM

RM

sub-cluster 1

RM

sub-cluster 2

Ro

ute

rR

ou

ter

Ro

ute

rR

ou

ter

StateStore

ProxyStateStore

ProxyStateStore

Proxy

State

Store

Task

NMNM

Page 10: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

AMRM

Proxy

AM

RM

sub-cluster 1

RM

sub-cluster 2

Ro

ute

rR

ou

ter

Ro

ute

rR

ou

ter

StateStore

ProxyStateStore

ProxyStateStore

Proxy

State

Store

Global

Policy

Generator

NM

Page 11: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler
Page 12: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

SCHEDULING DESIDERATA

Page 13: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

POLICIES

QA QB

root

50% 50%

demand: 12

demand: 4

demand: 12

demand: 12

✓ Locality, Fairness

✗ Utilization

✓ Utilization, Fairness

✗ Locality

✓ Utilization, locality

✗ Fairness

✓ Utilization,

✓ Fairness,

✓ Locality.

Page 14: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

KEY IDEA

Page 15: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

PROPOSED SOLUTION*

RM

sub-cluster 1

RM

sub-cluster 2

StateStore

ProxyStateStore

ProxyStateStore

Proxy

StateStore

Global

Policy

Generator•

1

2

3

1 1

2

3

* More advanced than what in prod. (details in paper)

Page 16: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

Page 17: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler
Page 18: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

Page 19: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

*not in the paper, updated as of jan/feb 2019

Page 20: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler
Page 21: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler
Page 22: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

Page 23: Microsoft - USENIX...Hadoop MR, Centralized sched. MR 2003 2008 2013 2019 Scope, Centralized sched. Scope [eurosys07, vldb08] Hydra MR YARN Tez Spark +multi-framework +security +scheduler

If you want to be part of our next project, we are hiring!