cluster management systems - github pages · computation model: frameworks l a framework (e.g.,...

15
Cluster Management Systems Ion Stoica CS 294 September 26, 2016

Upload: others

Post on 20-May-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

ClusterManagementSystemsIonStoicaCS294

September26,2016

Page 2: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Motivationl Rapidinnovationincloudcomputing

l Todayl Nosingleframeworkoptimalforallapplicationsl Eachframeworkrunsonitsdedicatedclusterorcluster

partition

Dryad

Pregel

CassandraHypertable

2011slide

Page 3: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

ComputationModel:Frameworksl Aframework (e.g.,Hadoop,MPI)managesoneormorejobs inacomputercluster

l Ajob consistsofoneormoretasksl Atask (e.g.,map,reduce)isimplementedbyoneormoreprocessesrunningonasinglemachine

3

cluster

FrameworkScheduler(e.g.,JobTracker)

Executor(e.g.,TaskTracker)

Executor(e.g.,TaskTraker)

Executor(e.g.,TaskTracker)

Executor(e.g.,TaskTracker)

task1task5

task3task7 task4

task2task6

Job1: tasks1,2,3,4Job2:tasks5,6,7

2011slide

Page 4: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

OneFrameworkPerClusterChallengesl Inefficientresourceusage

l E.g.,HadoopcannotuseavailableresourcesfromPregel’scluster

l Noopportunityforstat.multiplexing

l Hardtosharedatal Copyoraccessremotely,expensive

l Hardtocooperatel E.g.,NoteasyforPregeltouse

graphsgeneratedbyHadoop

4

Hadoop

Pregel

0%#

25%#

50%#

0%#

25%#

50%#

Hadoop

Pregel

2011slideNeedtorunmultipleframeworksonsamecluster

Page 5: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Whatdowewant?

l Commonresourcesharinglayerl Abstracts(“virtualizes”)resourcestoframeworksl Enablediverseframeworkstoshareclusterl Makeiteasiertodevelopanddeploynewframeworks(e.g.,Spark)

5

MPIHadoopMPIHadoop

ResourceManagementSystem

Uniprograming Multiprograming

2011slide

Page 6: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

FineGrainedResourceSharing

l Taskgranularitybothintime&spacel Multiplexnode/timebetweentasksbelongingtodifferent

jobs/frameworks

l Taskstypicallyshort;median~=10sec,minutes

l Whyfinegrained?l Improvedatalocalityl Easiertohandlenodefailures

62011slide

Page 7: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Goals

l Efficientutilizationofresources

l Supportdiverseframeworks (existing&future)

l Scalability to10,000’sofnodes

l Reliability infaceofnodefailures

2011slide

Page 8: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Approach:GlobalScheduler

8

GlobalScheduler

OrganizationpoliciesResourceavailability

• Responsetime• Throughput• Availability• …

Jobrequirements

2011slide

Page 9: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Approach:GlobalScheduler

9

GlobalScheduler

OrganizationpoliciesResourceavailability

• TaskDAG• Inputs/outputs

JobrequirementsJobexecutionplan

2011slide

Page 10: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Approach:GlobalScheduler

10

GlobalScheduler

OrganizationpoliciesResourceavailability

• Taskdurations• Inputsizes• Transfersizes

JobrequirementsJobexecutionplan

Estimates

2011slide

Page 11: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Approach:GlobalScheduler

l Advantages:canachieveoptimalschedulel Disadvantages:

l Complexityà hardtoscaleandensureresiliencel Hardtoanticipatefuture frameworks’ requirementsl Needtorefactorexistingframeworks

11

GlobalScheduler

OrganizationpoliciesResourceavailability

TaskscheduleJobrequirementsJobexecutionplan

Estimates

2011slide

Page 12: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Motivations,NowandThenl Recallmainmotivationsin2011:

l Efficientresourceusagel Enablerapidinnovationl Mainusecase:bigdataframeworks

l Whathaschangedsincethen?

12

Page 13: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

WhathasChangedSinceThen?l Workloads:runlongrunningapplications(e.g.,fron-endservices)emergedasacommonworkloadl Noneedforfinegrainallocation

l Containershaveevolvedfromanisolationmechanismtoapackagingsolution(e.g.,Docker)l Containerorchestrationamajorusecaseàmanageentire

applifecycle:develop,test,deploy

l Cloudhasbecameprevalentl EasytoscaleupusingPAYG(statisticmultiplexingless

important) 13

Page 14: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

Howtoclassifysystems?

l Whatisthegranularityofresourcesharing?l Tasksvsinstanceallocation

l Howdotheymakeresourcemanagementdecisions?l Centralizedvs.distributedallocationl Whomapsresourcestoapps,andwhomapsresourcesto

tasks?l Whatresourcemanagementpolicestheyenforce?

14

Page 15: Cluster Management Systems - GitHub Pages · Computation Model: Frameworks l A framework (e.g., Hadoop, MPI) manages one or more jobs in a computer cluster l A job consists of one

ThisLecture

l Mesos

l Yarn

l Omega(andBorg)

l Kubernetes

15