meson: heterogeneous workflows with spark at netflix
TRANSCRIPT
Heterogeneous Workflows with Spark at Netflix
0
Antony Arokiasamy | Kedar Sadekar | Personalization Infrastructure
1
Help members find content to watch and enjoy to maximize member satisfaction and retention
Everything is a Recommendation2
Recommendations are driven by Machine Learning
Ranking
Row
s
Machine Learning Pipeline3
User SelectionFeature
GenerationModel
ValidationPublishModel
Model Training
Machine Learning Pipeline Challenges
4
• Innovation• Heterogeneous Environments
• Spark• Native Support
• Separate Orchestration and Execution
• Multi Tenancy
• Machine Learning Constructs• Parameter Sweep – 30k Dockers
Meson Workflow System5
• General Purpose Workflow Orchestration and Scheduling framework• Delegates execution to resource managers like Mesos
• Optimized for Machine Learning Pipelines and Visualization
• Checkout the Blog• http://bit.ly/mesonws or techblog.netflix.com
• Plan to Open Sourced soon
Meson Architecture6
Standard and Custom Step Types7
Parameter Passing8
Hive Query User DataSet Regional DataSet
Global DataSet
Get Users
Regional Model
Global Model
User DataSet
Wrangle Data
Structured Constructs9
Top Down or Bottom Up10
Two Way Communication11
Spark Step12
Artifacts13
• Step outputs tracked as Artifacts
• Visualization
• Memoization
Multi Tenancy14
• Resource Attributes • spark.cores.max• spark.executor.memory• spark.mesos.constraints• Dynamic Resource Allocation
Cluster Management15
• Red-Black software updates
• Scale up/Scale down
Meson/Spark Cluster16
• 100s of Concurrent Jobs
• 700 Nodes
• 5000 Cores
• 25 TB Memory
• Apps: Meson Workflow System, Spark and Dockers
• Few smaller clusters
17
Antony Arokiasamy
Kedar Sadekar
@aasamy
/aasamy
@kedar_sadekar
/kedar-sadekar