Meson: Building a Machine Learning Orchestration Framework on Mesos
Antony Arokiasamy | Kedar Sadekar | Personalization Infrastructure
Help members find content to watch and enjoy to maximize member satisfaction and retention
Everything is a Recommendation
Recommendations are driven by Machine Learning
Ranking
Rows
Machine Learning Pipeline
User Selection
Feature Generation
Model Training
Model Validation
Publish Model
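The pipeline stages above can be read as a small dependency graph. The following is a minimal sketch, not Meson's actual API: the stage names come from the slide, while the dependency edges (a simple linear chain) and the helper name execution_order are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# Stage names come from the slide; the edges are an assumed linear chain.
PIPELINE = {
    "User Selection": set(),
    "Feature Generation": {"User Selection"},
    "Model Training": {"Feature Generation"},
    "Model Validation": {"Model Training"},
    "Publish Model": {"Model Validation"},
}


def execution_order(pipeline):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for stage in execution_order(PIPELINE):
        print(stage)
```

Declaring dependencies rather than a fixed sequence lets an orchestrator run independent stages in parallel once the graph stops being a straight line.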
Machine Learning Pipeline Challenges
• Innovation
  • Heterogeneous Environments
• Spark
  • Native Support
• Separate Orchestration and Execution
• Multi Tenancy
• ML Constructs
  • Parameter Sweep – 30k Docker containers
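A parameter sweep expands a grid of candidate values into one task per combination, which is how task counts reach the tens of thousands. The sketch below shows the expansion only; the hyperparameter names and values are hypothetical, and launching each combination as a container is left out.

```python
from itertools import product


def parameter_sweep(grid):
    """Yield one parameter dict per point in the Cartesian product of the grid."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical hyperparameters, chosen only for illustration.
GRID = {
    "learning_rate": [0.01, 0.1],
    "num_trees": [50, 100, 200],
    "regularization": [0.0, 0.5],
}

if __name__ == "__main__":
    tasks = list(parameter_sweep(GRID))
    print(len(tasks))  # 2 * 3 * 2 combinations
    print(tasks[0])
```

Each yielded dict would become the arguments of one independent training task.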
Meson Workflow System in 30 seconds
• General-purpose workflow orchestration and scheduling framework
  • Delegates execution to resource managers like Mesos
• Optimized for Machine Learning Pipelines and Visualization
• Check out the blog
  • bit.ly/mesonws or techblog.netflix.com
Meson Architecture
Mesos Usage
• Executors
  • Custom Executor
  • Executor Caching
  • Executor Cleanup
• Framework Messages
• Resource Attributes
  • Multi Tenancy
  • Cluster Management
Custom Executors
• Reuse Executor Process
  • e.g. Spark
  • Executor Id = <unique id>
• Two Way Communication
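Two-way communication between a scheduler and a long-lived executor needs an agreed message format, because Mesos framework messages carry opaque bytes. The sketch below shows one possible JSON encoding; the message kinds and payload fields are hypothetical, and the actual send and receive calls on the Mesos driver are deliberately left out.

```python
import json


def encode_message(kind, payload):
    """Serialize a message to bytes suitable for an opaque byte payload."""
    body = {"kind": kind, "payload": payload}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def decode_message(data):
    """Parse bytes produced by encode_message back into (kind, payload)."""
    body = json.loads(data.decode("utf-8"))
    return body["kind"], body["payload"]


if __name__ == "__main__":
    # Hypothetical message: the scheduler asks a reused executor to run a step.
    wire = encode_message("run_step", {"step": "Model Training", "attempt": 1})
    print(decode_message(wire))
```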
Executor Caching
• Executor Id = hash(<something unique for the class of executors>)
  • e.g. Executor Id = hash(classpath)
• Match with Executor Id in Offer
[Diagram labels: offers, accept]
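The caching idea above can be sketched as follows: derive a stable executor ID from the classpath, then prefer an offer that already lists that ID. Offers are modeled as plain dicts, and the "executor_ids" key, the SHA-1 choice, and the function names are assumptions for illustration rather than Meson's implementation.

```python
import hashlib


def executor_id_for(classpath):
    """Derive a stable executor ID shared by every executor with this classpath."""
    return hashlib.sha1(classpath.encode("utf-8")).hexdigest()[:16]


def pick_offer(offers, classpath):
    """Return (offer, reused), preferring an offer with a matching cached executor.

    Falls back to the first offer when no cached executor matches, and to
    (None, False) when there are no offers at all.
    """
    wanted = executor_id_for(classpath)
    for offer in offers:
        if wanted in offer.get("executor_ids", []):
            return offer, True
    if offers:
        return offers[0], False
    return None, False


if __name__ == "__main__":
    cp = "/opt/app/lib/*"  # hypothetical classpath
    offers = [
        {"id": "offer-1", "executor_ids": []},
        {"id": "offer-2", "executor_ids": [executor_id_for(cp)]},
    ]
    print(pick_offer(offers, cp))
```

Accepting the matching offer lets the new task land on an executor process that is already warm, which avoids paying the startup cost again.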
Executor Cleanup
• Expiration
• Explicitly keep track of Executors
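Both cleanup points can be combined in a small registry that records when each cached executor was last used and reports the ones that have sat idle past a time-to-live. This is a sketch under assumed names (ExecutorRegistry, touch, expire); actually shutting down an expired executor is left out.

```python
import time


class ExecutorRegistry:
    """Track cached executors and expire those idle for ttl_seconds or longer."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._last_used = {}

    def touch(self, executor_id, now=None):
        """Record that an executor was just used."""
        self._last_used[executor_id] = time.monotonic() if now is None else now

    def expire(self, now=None):
        """Remove and return the IDs of executors idle for at least the TTL."""
        now = time.monotonic() if now is None else now
        expired = [
            executor_id
            for executor_id, seen in self._last_used.items()
            if now - seen >= self.ttl_seconds
        ]
        for executor_id in expired:
            del self._last_used[executor_id]
        return expired

    def active(self):
        """Return the IDs of executors still being tracked."""
        return sorted(self._last_used)


if __name__ == "__main__":
    registry = ExecutorRegistry(ttl_seconds=300)
    registry.touch("exec-a", now=0)
    registry.touch("exec-b", now=200)
    print(registry.expire(now=350))  # exec-a has been idle for 350 seconds
    print(registry.active())
```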
Framework Messages
Multi Tenancy
• Resource Attributes
  • spark.mesos.constraints
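Attribute-based tenancy comes down to filtering offers by the attributes on each agent. The sketch below parses a constraint string in the semicolon-separated "attribute:value" style used by spark.mesos.constraints and checks it against an agent's attributes. It is a simplification: only exact text matches and presence checks are handled, and the attribute names are hypothetical.

```python
def parse_constraints(spec):
    """Parse 'key:value;key2:value2' into a dict; an empty value means 'must exist'."""
    constraints = {}
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(":")
        constraints[key.strip()] = value.strip()
    return constraints


def offer_matches(attributes, constraints):
    """Return True if the agent attributes satisfy every constraint."""
    for key, wanted in constraints.items():
        if key not in attributes:
            return False
        if wanted and attributes[key] != wanted:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical attributes used to dedicate agents to a tenant.
    constraints = parse_constraints("tenant:ml;zone:us-east-1a")
    print(offer_matches({"tenant": "ml", "zone": "us-east-1a"}, constraints))
    print(offer_matches({"tenant": "batch", "zone": "us-east-1a"}, constraints))
```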
Cluster Management
• Red-Black software updates
• Scale up/Scale down
Mesos Cluster
• 100s of Concurrent Jobs
• 700 Nodes
• 5000 Cores
• 25 TB Memory
• Apps: Meson Workflow System, Spark, and Docker containers
• A few smaller clusters
What's Next
• Fenzo Scheduler - https://github.com/Netflix/Fenzo
  • Bin Packing, Auto Scaling, Host Attributes/Constraints, Groups, etc.
• Cook Scheduler - https://github.com/twosigma/Cook
  • Multi-tenant Spark Scheduler
• Open Source Meson Workflow System
Antony Arokiasamy: @aasamy, /aasamy
Kedar Sadekar: @kedar_sadekar, /kedar-sadekar