scheduling and sharing resources in data clusters
Post on 10-May-2015
Introduction | YARN | Mesos | Omega | Related work | Conclusions
Scheduling and sharing resources in Data Clusters
Jose Luis Lopez Pino
December 12, 2013
Table of contents
1 Introduction: The problem, Solutions
2 YARN: Architecture, Advantages, Drawbacks, Performance
3 Mesos: Architecture, Advantages, Drawbacks, Performance
4 Omega: Architecture, Advantages, Drawbacks, Performance
5 Related work: Resource managers, Scheduling techniques
6 Conclusions
1 Introduction
The problem
Data lake
Multiple frameworks[6]
Duplicated data
Cluster utilization
Solutions[8]
1 Static partitioning
2 Monolithic schedulers
3 Two-level scheduler
4 Shared state approach
2 YARN
Architecture[10]
Figure: YARN architecture
Advantages
Scale
Data locality
Easy to port a new framework
Drawbacks
Failure recovery
High latency?
Network overload?
Performance
Job throughput
Application Master flooding
Preemption
3 Mesos
Architecture[9]
Figure: Mesos architecture
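The two-level model behind Mesos can be illustrated with a small sketch of resource offers: the master decides which resources to offer, and each framework decides for itself whether to accept. This is a simplified illustration, not the Mesos API. The names `Offer`, `Framework`, and `offer_round` are hypothetical, and the sketch assumes that a whole node is offered and consumed at once.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one node (hypothetical, whole-node granularity)."""
    node: str
    cpus: int
    mem_gb: int


@dataclass
class Framework:
    """Hypothetical framework scheduler with identical pending tasks."""
    name: str
    task_cpus: int
    task_mem_gb: int
    pending_tasks: int
    launched: list = field(default_factory=list)

    def consider(self, offer: Offer) -> bool:
        """Second level: the framework itself accepts or declines an offer."""
        fits = offer.cpus >= self.task_cpus and offer.mem_gb >= self.task_mem_gb
        if self.pending_tasks > 0 and fits:
            self.pending_tasks -= 1
            self.launched.append(offer.node)
            return True
        return False


def offer_round(free_nodes, frameworks):
    """First level: the master offers each free node to the frameworks in order.

    Returns the offers that every framework declined.
    """
    unclaimed = []
    for offer in free_nodes:
        if not any(fw.consider(offer) for fw in frameworks):
            unclaimed.append(offer)
    return unclaimed
```

The order of `frameworks` stands in for the master's allocation policy. In this sketch a framework that declines simply sees the next offer in a later round.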
Advantages
Flexible
Extensible
Fault tolerance
  Backup master node
  Recreate master using communication
  Use checkpoints for the slaves
Drawbacks
Complex to port a framework
Intensive communication
Revocation might be dangerous
Penalizes long jobs
Performance
Missing: comparison of different policies and modules
Scalable
  +18% memory
  +10% CPU utilization
  Less than 1 s launching tasks with 50k nodes
Small tasks
Data locality with delay scheduling[12]
MPI
Torque and gang scheduling
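Delay scheduling [12] can be sketched as follows: when a slot frees up on a node, the job at the head of the fairness order is skipped if it has no task with local data there, but only up to a fixed number of times. This is a minimal sketch. The function name `assign_slot`, the dictionary layout of a job, and the `max_skips` parameter are assumptions made for illustration.

```python
def assign_slot(node, jobs, max_skips):
    """Pick a task for a free slot on `node`.

    `jobs` is assumed to be sorted in fairness order. Each job is a dict with
    'name', 'pending' (one set of data-holding nodes per unlaunched task),
    and 'skips' (how often the job has been passed over so far).
    Returns (job name, 'local' or 'non-local'), or None if nothing is launched.
    """
    for job in jobs:
        # Prefer a task whose input data lives on the free node.
        for i, replicas in enumerate(job["pending"]):
            if node in replicas:
                job["pending"].pop(i)
                job["skips"] = 0
                return job["name"], "local"
        if job["pending"]:
            # No local task: give up locality only after enough skips.
            if job["skips"] >= max_skips:
                job["pending"].pop(0)
                return job["name"], "non-local"
            job["skips"] += 1
    return None
```

A larger `max_skips` trades waiting time for locality, and the job at the head of the queue temporarily yields its turn, which is the relaxation of fairness.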
4 Omega
Architecture[8]
Figure: Omega architecture
Advantages
Schedulers work in parallel
Very scalable
Drawbacks
Unfair distribution
Conflicts
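Conflicts arise because parallel schedulers work on private snapshots of the shared state and commit optimistically. The sketch below illustrates the idea under stated assumptions: `CellState`, `commit`, and `schedule` are hypothetical names, and conflict detection uses one version counter per machine, which is only one of several possible detection schemes.

```python
class CellState:
    """Hypothetical shared cluster state: free CPUs and a version per machine."""

    def __init__(self, free_cpus):
        self.free = dict(free_cpus)
        self.version = {machine: 0 for machine in free_cpus}

    def snapshot(self):
        """Return a private copy that a scheduler can plan against."""
        return dict(self.free), dict(self.version)

    def commit(self, machine, cpus, seen_version):
        """Apply a placement unless the machine changed since the snapshot."""
        if self.version[machine] != seen_version or self.free[machine] < cpus:
            return False  # conflict: the caller must re-plan
        self.free[machine] -= cpus
        self.version[machine] += 1
        return True


def schedule(cell, cpus, max_attempts=3):
    """Plan on a snapshot, then try to commit, retrying on conflict.

    Returns (machine, attempt number), or None if no placement succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        free, versions = cell.snapshot()
        candidates = [m for m, c in free.items() if c >= cpus]
        if not candidates:
            return None
        machine = max(candidates, key=lambda m: free[m])
        if cell.commit(machine, cpus, versions[machine]):
            return machine, attempt
    return None
```

If two schedulers snapshot the same state and both pick the same machine, only the first `commit` succeeds. The second one has to take a fresh snapshot and retry, and that retry is the cost of a conflict.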
Performance
Decision time and busyness of the scheduler
Real workloads
5 Related work
Resource managers
Heterogeneous environments: Corona and Cosmos [1]
Homogeneous environments: Quincy[4]
Scheduling techniques
Lottery scheduling[11]
Dynamic Proportional Share Scheduling[7]
Calibration: how does a particular task perform in a particular node?[5]
Stragglers and speculative relaunch[13]
Delay scheduling: achieve locality, relax fairness[12]
Rich resource-requests[2]
Optimize short jobs[3]
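As an illustration of the first technique in the list, lottery scheduling [11] gives each client a number of tickets and picks the next client to serve by drawing one ticket at random, so the expected share is proportional to the tickets held. The sketch below is a minimal version. The function name `hold_lottery` and the ticket counts in the usage lines are assumptions.

```python
import random
from collections import Counter


def hold_lottery(tickets, rng):
    """Draw one winning ticket; `tickets` maps client name to ticket count."""
    total = sum(tickets.values())
    winning = rng.randrange(total)
    upto = 0
    for client, count in tickets.items():
        upto += count
        if winning < upto:
            return client


# Usage: over many draws, the share of wins approaches the share of tickets.
rng = random.Random(0)
wins = Counter(hold_lottery({"a": 3, "b": 1}, rng) for _ in range(10000))
```

With three of the four tickets, client `a` should win roughly three quarters of the draws. The exact counts depend on the random seed.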
Conclusions
Different models
YARN:
  Easier to port a new framework
  Data locality
Mesos:
  Flexible and modular
  Fault tolerance
  More scalable
Omega:
  Flexible
  Highly scalable
References
[1] Ronnie Chaiken, Bob Jenkins, Per-Ake Larson, Bill Ramsey, Darren Shakib, Simon Weaver, and Jingren Zhou. Scope: easy and efficient parallel processing of massive data sets. Proceedings of the VLDB Endowment, 1(2):1265–1276, 2008.
[2] Carlo Curino, Djellel Difallah, Chris Douglas, Raghu Ramakrishnan, and Sriram Rao. Reservation-based scheduling: If you're late don't blame us!
[3] Khaled Elmeleegy. Piranha: Optimizing short jobs in hadoop. Proceedings of the VLDB Endowment, 6(11):985–996, 2013.
[4] Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, and Andrew Goldberg. Quincy: fair scheduling for distributed computing clusters. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pages 261–276. ACM, 2009.
[5] Gunho Lee, Byung-Gon Chun, and Randy H Katz. Heterogeneity-aware resource allocation and scheduling in the cloud. In Proceedings of the 3rd USENIX Workshop on Hot Topics in Cloud Computing, HotCloud, volume 11, 2011.
[6] Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, and Bongki Moon. Parallel data processing with mapreduce: a survey. ACM SIGMOD Record, 40(4):11–20, 2012.
[7] Thomas Sandholm and Kevin Lai. Dynamic proportional share scheduling in hadoop. In Job scheduling strategies for parallel processing, pages 110–131. Springer, 2010.
[8] Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, and John Wilkes. Omega: Flexible, scalable schedulers for large compute clusters. In Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys ’13, pages 351–364, New York, NY, USA, 2013. ACM.
[9] Facebook Engineering Team. Under the hood: Scheduling mapreduce jobs more efficiently with corona.
[10] Vinod K. Vavilapalli. Apache Hadoop YARN: Yet Another Resource Negotiator. In Proc. SOCC, 2013.
[11] Carl A Waldspurger and William E Weihl. Lottery scheduling: Flexible proportional-share resource management. In Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation, page 1. USENIX Association, 1994.
[12] Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy, Scott Shenker, and Ion Stoica. Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. In Proceedings of the 5th European conference on Computer systems, pages 265–278. ACM, 2010.
[13] Matei Zaharia, Andy Konwinski, Anthony D Joseph, Randy H Katz, and Ion Stoica. Improving mapreduce performance in heterogeneous environments. In OSDI, volume 8, page 7, 2008.