the future of mlops - bristol data scientists...the future of mlops … and how did we get here?...

Post on 27-Jun-2020

0 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

The Future of MLOps… and how did we get here?

Luke Marsden, CEO & Founder

@lmarsden #mlops

Software Engineering

DevOps

Machine Learning

The state of...

@lmarsden #mlops

Software EngineeringHow we write software

From punch cards… to multi-user UNIX systems...

To personal computers…

To networked computers…

Emailing patch files…

Centralised version control…

Distributed version control… Git! GitHub!

@lmarsden #mlops

DevOpsHow we deploy software

From editing code live on the server…

To building binaries and emailing them around…

To continuous integration…

To public cloud...

To immutable infrastructure… Docker! Kubernetes! GitOps!

@lmarsden #mlops

Machine Learning (AI)How we use data + maths to train models

From curve fitting, linear regressions…

To Markov chains… to neural networks…

To the first 'AI winter'...

Rediscovery of backprop, move to data driven approach…

Deep learning becomes computationally feasible...

@lmarsden #mlops

As ML emerges from research, these disciplines need to converge…

The convergence of software engineering,

DevOps and ML will be called MLOps

@lmarsden #mlops

AI/ML has the potential to reinvent the global economy

But as a discipline it's immature —it's the Wild West out there

We've seen damaging levels of chaos and pain operationalizing AI/ML

@lmarsden #mlops

Key problems affecting AI efforts today

Many teams have problems they don't realize exist, because it's just "how it's done"

1. Can’t deploy – models blocked2. Wasting time3. Inefficient collaboration4. Manual tracking5. No reproducibility or provenance6. Unmonitored models

and more...@lmarsden #mlops

We’ve been here before!

In the 90s, software engineering was siloed.

Not everything was version controlled, there was no continuous delivery.

Software took months to ship; now it ships in minutes.

The same is possible for AI...

@lmarsden #mlops

Requirements to achieveMLOps

2. AccountableMust be able to trace back from model in production to its provenance

4. ContinuousMust be able to deploy automatically & monitor statistically

MANIFESTO

1. ReproducibleMust be able to retrain a 9 month old model to within few %

3. CollaborativeMust be able to do asynchronous collaboration

@lmarsden #mlops

Code Test Deploy

Software Development

Machine Learning

Data runs Code Parameters

Model runs

Models / metrics Deploy

Monitor

Monitor

Data engineering & ML are different from software development

There are unique ways that working with data, code & models is different to just working with code.

This makes existing software tools insufficient...

@lmarsden #mlops

Code Test Deploy

Software Development

Machine Learning

Data runs Code Parameters

Model runs

Models / metrics Deploy

Monitor

Monitor

By tracking runs, you can achieve MLOps

Normal software development tools focus on commits of code. But ML has many more moving parts: data, models & metrics.

You need to track runs, bundling data, code & param versions that went into creating an intermediate dataset or a model. This provides full context for reproducibility, and provenance to connect data engineering with model training to track back for accountability.

@lmarsden #mlops

The MLOps model lifecycle.

@lmarsden #mlops

DEPLOYMODELS

STATISTICALMONITORING

Model runs

Metrics Realtime production traffic

Real-time decisionsTraining

V2V3

DEVELOPMENT PRODUCTIONV1FEB

MAR

Data

Labels

Features

Data runsRaw

DATA ENGINEERING

Data EngineeringIn data engineering pipelines, you need to track data runs.

As raw data is processed, features engineered & samples annotated with labels, every data version needs to be recorded and made available for model development with full provenance. Need to avoid questions about which data was used to train a model.

@lmarsden #mlops

DEPLOYMODELS

STATISTICALMONITORING

Model runs

Metrics Realtime production traffic

Real-time decisionsTraining

V2V3

DEVELOPMENT PRODUCTIONV1FEB

MAR

Data

Labels

Features

Data runsRaw

DATA ENGINEERING

Model DevelopmentOnce your data is annotated and ready to start building models, need to track model runs.

You can increase team productivity by building a 'runnable ML knowledge base' to eliminate silos. Need to reduce key person risk by making it easy for anyone to pick up where another left off.

@lmarsden #mlops

DEPLOYMODELS

STATISTICALMONITORING

Model runs

Metrics Realtime production traffic

Real-time decisionsTraining

V2V3

DEVELOPMENT PRODUCTIONV1FEB

MAR

Data

Labels

Features

Data runsRaw

DATA ENGINEERING

Models in ProductionNeed to be able to push your models to production through a CI/CD system.

Once models are in production, need to keep them performing reliably. Need alerting on issues with statistical monitoring & to be confident changes to models to fix issues are working need ability to do forensic provenance tracking.

@lmarsden #mlops

Live demo!

@lmarsden #mlops@lmarsden #mlops

Thank you! Questions?

Please help🙏 luke@dotscience.com

Come talk to me & see a demo at dotscience.com😉

@lmarsden #mlops

top related