QConSF 2014 talk on Netflix Mantis, a stream processing system

Mantis: Netflix's Event Stream Processing System
Justin Becker, Danny Yuan
10/26/2014

Uploaded by danny-yuan on 30-Jun-2015

Category: Data & Analytics

DESCRIPTION

Justin and I gave this talk at QCon SF 2014 about Mantis, a stream processing system that features a reactive programming API, auto scaling, and stream locality.

TRANSCRIPT

Page 1:

Mantis: Netflix's Event Stream Processing System

Justin Becker, Danny Yuan, 10/26/2014

Page 2:

Motivation

Page 3:

Traditional TV just works

Page 4:

Netflix wants Internet TV to work just as well

Page 5:
Page 6:

Especially when things aren’t working

Page 7:
Page 8:

Tracking “big signals” is not enough

Page 9:

Need to track all kinds of permutations

Page 10:

Detect quickly, to resolve ASAP

Page 11:

Cheaper than product services

Page 12:

Here comes the problem

Page 13:

12 ~ 20 million metrics updates per second

Page 14:

750 billion metrics updates per day

Page 15:
Page 16:
Page 17:

4 ~ 18 Million App Events Per Second
> 300 Billion Events Per Day

Page 18:
Page 19:
Page 20:
Page 21:
Page 22:
Page 23:
Page 24:
Page 25:
Page 26:

They Solve Specific Problems

Page 27:

They Require Lots of Domain Knowledge

Page 28:

A System to Rule Them All

Page 29:
Page 30:
Page 31:

Mantis: A Stream Processing System

Page 32:
Page 33:

3

Page 34:

1. Versatile User Demands

Page 35:
Page 36:
Page 37:
Page 38:
Page 39:
Page 40:

2. Asynchronous Computation Everywhere

Page 41:
Page 42:
Page 43:
Page 44:
Page 45:
Page 46:

3. Same Source, Multiple Consumers

Page 47:
Page 48:
Page 49:
Page 50:

We want Internet TV to just work

Page 51:

One problem we need to solve: detect movies that are failing

Page 52:

Do it fast → limit impact, fix early
Do it at scale → for all permutations
Do it cheap → cost to detect <<< cost to serve

Page 53:

Work through the details of how to solve this problem in Mantis

Page 54:

Goal is to highlight unique and interesting design features

Page 55:

… begin with the batch approach, the non-Mantis approach

Batch algorithm, runs every N minutes:

for (play in playAttempts()) {
  Stats movieStats = getStats(play.movieId);
  updateStats(movieStats, play);
  if (movieStats.failRatio > THRESHOLD) {
    alert(play.movieId, movieStats.failRatio, timestamp);
  }
}

Page 56:

First problem: each run requires reads + writes to the data store

Page 57:

For Mantis we don't want to pay that cost, in latency or storage

Page 58:

Next problem: the “pull” model is great for batch processing, but a bit awkward for stream processing

Page 59:

By definition, batch processing requires batches. How do I chunk my data? Or, how often do I run?

Page 60:

For Mantis, we prefer a “push” model, a natural approach to data-in-motion processing

Page 61:

For our “push” API we decided to use Reactive Extensions (Rx)

Page 62:

Two reasons for choosing Rx: theoretical and practical

1. Observable is a natural abstraction for stream processing: Observable = stream

2. Rx is already leveraged throughout the company

Page 63:

So, what is an Observable? A sequence of events, aka a stream

Observable<String> o = Observable.just("hello", "qcon", "SF");
o.subscribe(x -> System.out.println(x));

Page 64:

What can I do with an Observable?

Apply operators → new Observable
Subscribe → Observer of data

Operators are familiar lambda functions: map(), flatMap(), scan(), ...
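
To make the operator idea concrete, here is a minimal RxJava 1.x sketch (the era's rx.Observable API, assuming Java 8; the numbers and lambdas are illustrative, not from the slides):

import rx.Observable;

public class OperatorsExample {
    public static void main(String[] args) {
        Observable.just(1, 2, 3, 4)                      // a small, finite "stream" of events
            .map(x -> x * 10)                            // transform each event: 10, 20, 30, 40
            .scan(0, (acc, x) -> acc + x)                // running sum across the stream: 0, 10, 30, 60, 100
            .subscribe(total -> System.out.println(total));
    }
}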

Page 65:

What is the connection with Mantis?

In Mantis, a job (the code to run) is the collection of operators applied to a source Observable, where the output is sinked to observers

Page 66:

Think of a “source” observable as the input to a job.

Think of a “sink” observer as the output of the job.

Page 67:

Let’s refactor the previous problem using Mantis API terminology

Source: Play attempts
Operators: Detection logic
Sink: Alerting service

Page 68:

Sounds OK, but how will this scale?

With the pull model you have the luxury of requesting data at a specified rate/time
Analogous to drinking from a straw

Page 69:

In contrast, push is the firehose

No explicit control to limit the flow of data

Page 70:

In Mantis, we solve this problem by scaling horizontally

Page 71:

Horizontal scale is accomplished by arranging operators into logical “stages”, explicitly by the job writer or implicitly with fancy tooling (future work)

Page 72:

A stage is a boundary between computations. The boundary may be a network boundary, process boundary, etc.

Page 73:

So, to scale, a Mantis job is really:

Source → input Observable
Stage(s) → operators
Sink → output observer

Page 74:

Let’s refactor the previous problem to follow the Mantis job API structure

MantisJob
  .source(Netflix.PlayAttempts())
  .stage({ /* detection logic */ })
  .sink(Alerting.email())

Page 75:

We need to provide a computation boundary to scale horizontally

Page 76:

For our problem, scale is a function of the number of movies we're tracking

Page 77:

Let’s create two stages: one producing groups of movies, the other running detection

MantisJob
  .source(Netflix.PlayAttempts())
  .stage({ /* groupBy movieId */ })
  .stage({ /* detection logic */ })
  .sink(Alerting.email())

Page 78:

OK, the computation logic is split; how is the code scheduled onto resources for execution?

Page 79:

One, you tell the Mantis Scheduler explicitly at submit time: number of instances, CPU cores, memory, disk, network, per instance
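
As a rough illustration of that kind of per-instance resource request, here is a hypothetical sketch; the class and field names are invented for this example and are not the actual Mantis scheduler API:

// Hypothetical sketch only, not the real Mantis scheduler API.
public class ResourceSpecExample {
    static final class ResourceSpec {
        final int instances, cpuCores, memoryGB, diskGB, networkMbps;
        ResourceSpec(int instances, int cpuCores, int memoryGB, int diskGB, int networkMbps) {
            this.instances = instances; this.cpuCores = cpuCores;
            this.memoryGB = memoryGB; this.diskGB = diskGB; this.networkMbps = networkMbps;
        }
    }

    public static void main(String[] args) {
        // e.g. eight workers for the detection stage, each with 2 cores, 4 GB memory, 10 GB disk, 128 Mbps network
        ResourceSpec detectionStage = new ResourceSpec(8, 2, 4, 10, 128);
        System.out.println("requesting " + detectionStage.instances + " instances");
    }
}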

Page 80:

Two, the Mantis Scheduler learns how to schedule the job (work in progress):

Looks at previous run history
Looks at history for the source input
Over/under provision, auto adjust

Page 81:

A scheduled job creates a topology

(Diagram: a three-stage topology, Stage 1 → Stage 2 → Stage 3, with multiple workers per stage running in the Mantis cluster, separated from the application clusters (Application 1, 2, 3) by the cluster boundary)

Page 82:

Computation is split, code is scheduled; how is data transmitted over a stage boundary?

(Diagram: Worker → ? → Worker)

Page 83:

Depends on the service level agreement (SLA) for the job; the transport is pluggable

Page 84:

Decision is usually a trade-off between latency and fault tolerance

Page 85:

A “weak” SLA job might trade off fault tolerance for speed, using TCP as the transport

Page 86:

A “strong” SLA job might trade off speed for fault tolerance, using a queue/broker as the transport
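
A pluggable transport could look roughly like the interface below; the names are invented for this sketch (they are not the Mantis interfaces), and the in-memory implementation stands in for what would really be a TCP connection or a durable queue:

import rx.Observable;
import rx.subjects.PublishSubject;
import rx.subjects.Subject;

// Hypothetical illustration of a pluggable stage-boundary transport, not the real Mantis API.
// A "weak"-SLA job might bind this to plain TCP (fast, but lossy on worker failure);
// a "strong"-SLA job might bind it to a durable queue/broker (slower, but replayable).
interface StageTransport<T> {
    void send(T event);          // called by the upstream worker
    Observable<T> receive();     // subscribed by the downstream worker
}

// In-memory stand-in, useful only for testing stage logic locally.
final class InMemoryTransport<T> implements StageTransport<T> {
    private final Subject<T, T> subject = PublishSubject.create();
    public void send(T event) { subject.onNext(event); }
    public Observable<T> receive() { return subject; }
}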

Page 87:

Forks and joins require partitioning and collecting data over boundaries

(Diagram: workers fork and join across a stage boundary; need to partition data, need to collect data)

Page 88:

Mantis has native support for partitioning and collecting over scalars (T) and groups (K,V)
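
A minimal sketch of the partitioning side, assuming (the slides don't say) that groups are routed by hashing the key; the point is simply that every event for a given movieId lands on the same downstream worker:

// Assumed routing logic for illustration, not the Mantis implementation:
// hash the group key and take it modulo the downstream worker count.
public class GroupPartitioner {
    static int workerFor(String movieId, int numWorkers) {
        return Math.floorMod(movieId.hashCode(), numWorkers);
    }

    public static void main(String[] args) {
        // All events for "movie-123" map to the same worker index in [0, numWorkers)
        System.out.println(workerFor("movie-123", 8));
    }
}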

Page 89:

Let’s refactor the job to include an SLA; for the detection use case we prefer low latency

MantisJob
  .source(Netflix.PlayAttempts())
  .stage({ /* groupBy movieId */ })
  .stage({ /* detection logic */ })
  .sink(Alerting.email())
  .config(SLA.weak())

Page 90:

The job is scheduled and running; what happens when the input data's volume changes?

Previous scheduling decision may not hold
Prefer not to over provision; the goal is cost of insights <<< cost of product

Page 91:

Good news: the Mantis Scheduler has the ability to grow and shrink (autoscale) both the cluster and jobs

Page 92:

The cluster can scale up/down for two reasons: more/fewer jobs (demand), or the jobs themselves are growing/shrinking

For the cluster, we can use submit-pending queue depth as a proxy for demand

For jobs, we use backpressure as a proxy to grow or shrink the job

Page 93:

Backpressure is “build up” in a system

Imagine we have a two-stage Mantis job: the second stage is performing a complex calculation, causing data to queue up in the previous stage

We can use this signal to increase the number of nodes at the second stage
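
A minimal sketch of that scaling decision, using queue depth at the stage boundary as the backpressure signal; the thresholds are made up for illustration and are not Mantis's actual policy:

// Illustrative autoscaling heuristic: grow a stage when its input queue is filling up,
// shrink it when the queue is mostly idle.
public class AutoScalePolicy {
    static int decideWorkerCount(int currentWorkers, long queuedEvents, long queueCapacity) {
        double utilization = (double) queuedEvents / queueCapacity;
        if (utilization > 0.8) return currentWorkers + 1;                        // backing up: scale out
        if (utilization < 0.2 && currentWorkers > 1) return currentWorkers - 1;  // mostly idle: scale in
        return currentWorkers;
    }

    public static void main(String[] args) {
        System.out.println(decideWorkerCount(4, 900, 1000)); // prints 5: the second stage grows
    }
}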

Page 94:

Having touched on the key points of the Mantis architecture, we want to show a complete job definition

Page 95:

Source → Stage 1 → Stage 2 → Sink

Page 96:

Play attempts → Grouping by movie Id → Detection algorithm → Email alert

Page 97:

Sourcing play attempts

Static set of sources

Page 98:

Grouping by movie Id

The groupBy operator takes a key selector function

Page 99:

Simple detection algorithms

Windows for 10 minutes or 1000 play events

Reduce the window to an experiment, update counts

Filter out if less than threshold
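
As a rough reconstruction of the grouping and detection stages described on pages 97-99, an RxJava 1.x sketch might look like the following; the PlayAttempt and MovieFailure types and the 50% threshold are assumptions for illustration, not the actual Netflix code:

import java.util.concurrent.TimeUnit;
import rx.Observable;

// Rough RxJava 1.x sketch of the detection stage: group by movieId, window each group
// by 10 minutes or 1000 events, compute the failure ratio per window, keep movies over threshold.
public class DetectionSketch {

    static final double THRESHOLD = 0.5;   // assumed failure-ratio threshold

    static final class PlayAttempt {       // hypothetical input event
        final String movieId;
        final boolean failed;
        PlayAttempt(String movieId, boolean failed) { this.movieId = movieId; this.failed = failed; }
    }

    static final class MovieFailure {      // what the sink (e.g. an email alert) would receive
        final String movieId;
        final double failRatio;
        MovieFailure(String movieId, double failRatio) { this.movieId = movieId; this.failRatio = failRatio; }
    }

    static Observable<MovieFailure> detect(Observable<PlayAttempt> playAttempts) {
        return playAttempts
            .groupBy(play -> play.movieId)                       // group play attempts by movieId
            .flatMap(group -> group
                .window(10, TimeUnit.MINUTES, 1000)              // window: 10 minutes or 1000 events
                .flatMap(window -> window
                    .reduce(new int[]{0, 0}, (counts, play) ->   // counts[0] = failures, counts[1] = total
                        new int[]{counts[0] + (play.failed ? 1 : 0), counts[1] + 1})
                    .filter(counts -> counts[1] > 0)             // skip empty windows
                    .map(counts -> new MovieFailure(group.getKey(), (double) counts[0] / counts[1]))
                    .filter(failure -> failure.failRatio > THRESHOLD)));   // keep only failing movies
    }
}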

Page 100:

Sink alerts