![Page 1: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/1.jpg)
Productionizing your ML code seamlessly
lauris #EuroPython
![Page 2: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/2.jpg)
Connecting people with great local businesses
![Page 3: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/3.jpg)
![Page 4: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/4.jpg)
Some fun facts about Yelp
Yelpers have written 155 million reviews since 2004.
We have over 500 developers.
We have over 300 services and our monolith yelp-main has over 3 million lines of code!
We have 74 million desktop and 30 million mobile app monthly unique visitors.
![Page 5: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/5.jpg)
Agenda for today
How to improve your development workflow to make the path to production easier
What does running an ML model in production involve?
![Page 6: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/6.jpg)
Starting Point: your notebook
![Page 7: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/7.jpg)
STARTING POINT: YOUR NOTEBOOK
Your notebook ...
predicts a desirable behavior
detects when an event is about to happen
recommends the best items to your users
ranks items in search
forecasts trends in stocks
It solves your problem
![Page 8: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/8.jpg)
Your notebook can be simple
![Page 9: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/9.jpg)
… or not so much
![Page 10: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/10.jpg)
Making a model work in a notebook is only the first step toward making it useful
Get data to train or predict on, regularly
Evaluate the model
Use the predictions in your product
Verify, regularly, that your initial objective is accomplished
![Page 11: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/11.jpg)
“Only a small fraction of real-world ML systems is composed of the ML code [...] The required surrounding infrastructure is vast and complex.”
Hidden Technical Debt in Machine Learning Systems - Google, NIPS 2015
[Image annotation: You are here]
![Page 12: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/12.jpg)
What does running an ML model in production involve?
![Page 13: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/13.jpg)
WHAT DOES ML IN PROD INVOLVE?
This presentation is not about tooling
![Page 14: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/14.jpg)
This is a framework on how to tackle the problem
![Page 15: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/15.jpg)
ML pipeline: a simplified view
[Figure: ML pipeline diagram. Training: Data sources, Sampling, Feature Extraction, Model Training, Evaluation. Prediction: Data sources, Sampling, Feature Extraction, Predictions. Also shown: Model, Your product.]
![Page 16: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/16.jpg)
Your data source is updating
[Figure: ML pipeline diagram]
![Page 17: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/17.jpg)
Updating the model
[Figure: ML pipeline diagram]
Regular training
Re-run strategies
Scale
![Page 18: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/18.jpg)
Evaluating Model
[Figure: ML pipeline diagram]
![Page 19: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/19.jpg)
Evaluating Model
Does my evaluation metric reflect how this model will be used in production?
Consider both a classic metric and something that makes sense business-wise.
Think about feedback loops
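One way to read the advice above is to report a classic metric next to a business-oriented one for every evaluation. The sketch below is only an illustration: `value_captured`, the sample labels, and the per-item `values` are hypothetical and not taken from the talk.

```python
def precision(y_true, y_pred):
    """Classic metric: share of positive predictions that are correct."""
    hits = [t for t, p in zip(y_true, y_pred) if p == 1]
    if not hits:
        return 0.0
    return sum(hits) / len(hits)


def value_captured(y_true, y_pred, values):
    """Hypothetical business metric: share of the value carried by true
    positives that the model actually flags."""
    total = sum(v for t, v in zip(y_true, values) if t == 1)
    if total == 0:
        return 0.0
    captured = sum(
        v for t, p, v in zip(y_true, y_pred, values) if t == 1 and p == 1
    )
    return captured / total


# Made-up evaluation data: labels, predictions, and a value per item.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0]
values = [100.0, 5.0, 20.0, 30.0, 1.0]

report = {
    "precision": precision(y_true, y_pred),
    "value_captured": value_captured(y_true, y_pred, values),
}
```

Reporting both numbers side by side makes it easier to notice a model that improves the classic metric while missing the items that matter most to the product.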
![Page 20: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/20.jpg)
Generating predictions
[Figure: ML pipeline diagram]
Regular training
Re-run strategies
Scale
![Page 21: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/21.jpg)
Using predictions
[Figure: ML pipeline diagram]
![Page 22: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/22.jpg)
Measuring Success
Beware of predicting what would happen anyway. Think about having a hold-out set.
A/B test different models
Track the business metrics you are trying to move
Confront your hypothesis with reality
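A hold-out set and an A/B test both need a stable way to assign users to groups. The following is a minimal sketch, assuming hash-based bucketing. The function name `assign_bucket`, the experiment name, the bucket labels, and the 10% hold-out share are all hypothetical choices, not something the slides prescribe.

```python
import hashlib


def assign_bucket(user_id, experiment, holdout_pct=10):
    """Deterministically map a user to 'holdout', 'model_a' or 'model_b'.

    Hashing the experiment name together with the user id keeps the
    assignment stable across runs and independent between experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    slot = int(hashlib.sha256(key).hexdigest(), 16) % 100
    if slot < holdout_pct:
        # Hold-out users never see model output, so they show what
        # would have happened anyway.
        return "holdout"
    return "model_a" if slot % 2 == 0 else "model_b"


# Hypothetical usage: the same user always lands in the same bucket.
bucket = assign_bucket("user-42", "ranking-v2")
```

Comparing the business metric of the hold-out group with that of the model groups gives a direct check of whether the model changes anything.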
![Page 23: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/23.jpg)
How to improve your development workflow to make the path to production easier
![Page 24: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/24.jpg)
General Advice
MAKE PRODUCTIONIZING EASIER
Use containers, virtualenvs
Use prod technologies from the get-go
Persist your models, logs, code ...
Rely on already existing tools
![Page 25: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/25.jpg)
Feature Extraction
[Figure: ML pipeline diagram, with "=" between the two Feature Extraction steps]
![Page 26: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/26.jpg)
Feature Extraction
Code should be the same between training and prediction
Think about edge cases
Write features as code
Giant SQL queries are hard to review, test and maintain
Unit test the feature extraction code
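The points above can be sketched as a single feature function that both the training job and the prediction job import, with unit tests covering an edge case. The field names (`review_count`, `first_review_date`) and the feature set are hypothetical, chosen only for illustration.

```python
from datetime import date


def extract_features(business, today):
    """Single source of truth for features, meant to be imported by both
    the training and the prediction code paths."""
    review_count = business.get("review_count") or 0
    first_review = business.get("first_review_date")
    days_active = (today - first_review).days if first_review else 0
    # Edge case: avoid dividing by zero for brand-new businesses.
    reviews_per_day = review_count / days_active if days_active > 0 else 0.0
    return {
        "review_count": review_count,
        "days_active": days_active,
        "reviews_per_day": reviews_per_day,
    }


def test_extract_features_handles_missing_data():
    features = extract_features({}, today=date(2020, 1, 11))
    assert features == {
        "review_count": 0,
        "days_active": 0,
        "reviews_per_day": 0.0,
    }


def test_extract_features_nominal_case():
    business = {"review_count": 20, "first_review_date": date(2020, 1, 1)}
    features = extract_features(business, today=date(2020, 1, 11))
    assert features["reviews_per_day"] == 2.0
```

Because the function is plain code rather than a SQL query, the tests can run in any test runner without a database.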
![Page 27: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/27.jpg)
Evaluating Model
[Figure: ML pipeline diagram]
![Page 28: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/28.jpg)
Evaluating Model
Perform feature importance analysis, to be able to detect issues or big changes between one training run and the next.
Set aside a sanity check test set for evaluation
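A minimal sketch of comparing feature importances between two training runs, assuming importances are available as name-to-score dictionaries. The helper `importance_shifts`, the 0.1 threshold, and the sample scores are hypothetical.

```python
def importance_shifts(previous, current, threshold=0.1):
    """Return features whose importance moved by more than `threshold`
    between two training runs, including features that appeared or vanished."""
    shifts = {}
    for name in sorted(set(previous) | set(current)):
        delta = current.get(name, 0.0) - previous.get(name, 0.0)
        if abs(delta) > threshold:
            shifts[name] = round(delta, 4)
    return shifts


# Made-up importances from two consecutive training runs.
previous_run = {"review_count": 0.50, "days_active": 0.30, "category": 0.20}
current_run = {
    "review_count": 0.15,
    "days_active": 0.35,
    "category": 0.20,
    "user_id": 0.30,
}

flagged = importance_shifts(previous_run, current_run)
```

Here `review_count` dropping sharply and a new `user_id` feature taking a large share are the kind of changes worth investigating before shipping the retrained model.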
![Page 29: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/29.jpg)
Version all the things
Keep track of change
[Figure: ML pipeline diagram with five elements labeled "Versioned"]
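One possible way to keep track of change is to store a small manifest next to each trained model. This is a sketch under stated assumptions: content hashes stand in for versions, and the names `fingerprint`, `build_manifest`, and the sample inputs are hypothetical.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Short content hash used as a version identifier."""
    return hashlib.sha256(payload).hexdigest()[:12]


def build_manifest(training_data, feature_code, model_bytes, params):
    """Record which data, code, and parameters produced a given model."""
    return {
        "data_version": fingerprint(training_data),
        "feature_code_version": fingerprint(feature_code),
        "model_version": fingerprint(model_bytes),
        "params": params,
    }


# Made-up inputs standing in for real artifacts.
manifest = build_manifest(
    training_data=b"id,label\n1,0\n2,1\n",
    feature_code=b"def extract_features(row): ...",
    model_bytes=b"serialized-model",
    params={"max_depth": 3},
)
manifest_json = json.dumps(manifest, sort_keys=True)
```

In a real pipeline the code version would more likely come from the version control system, and the manifest would be persisted alongside the model.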
![Page 30: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/30.jpg)
Log all the things
And persist it
[Figure: ML pipeline diagram with eight elements labeled "log"]
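A minimal sketch of per-stage logging with the standard `logging` module, emitting one JSON payload per event. The logger name, the stage names, and the fields are hypothetical. An in-memory stream is used here so the example is self-contained, whereas a persisted log would use something like `logging.FileHandler`.

```python
import io
import json
import logging


def make_pipeline_logger(stream):
    """Build a logger that writes one line per pipeline event to `stream`."""
    logger = logging.getLogger("ml_pipeline")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_event(logger, stage, **fields):
    """Log a structured event so it can be parsed and queried later."""
    logger.info(json.dumps({"stage": stage, **fields}, sort_keys=True))


# In-memory stream for the example; swap in a file to persist the logs.
stream = io.StringIO()
logger = make_pipeline_logger(stream)
log_event(logger, "sampling", rows=1000)
log_event(logger, "training", model_version="abc123", duration_s=12.5)
log_lines = stream.getvalue().splitlines()
```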
![Page 31: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/31.jpg)
Monitoring the pipeline
[Figure: ML pipeline diagram]
![Page 32: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/32.jpg)
Monitoring the pipeline
Keep track of the number of predictions generated (and alert when it’s 0)
Alert on errors in your pipeline’s code
Alert on the business metrics you are trying to move.
Keep track of timings, to be able to see problems earlier
Write runbooks
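Two of the checks above, prediction count and timing, can be sketched as follows. The helpers `check_run` and `timed` and the 60-second budget are hypothetical, and a real setup would send the alerts to whatever alerting system is already in place.

```python
import time


def check_run(prediction_count, duration_s, max_duration_s):
    """Return a list of alert messages for one pipeline run."""
    alerts = []
    if prediction_count == 0:
        alerts.append("no predictions generated")
    if duration_s > max_duration_s:
        alerts.append(
            f"run took {duration_s:.1f}s, expected at most {max_duration_s:.1f}s"
        )
    return alerts


def timed(func, *args):
    """Run `func` and return its result together with the elapsed seconds."""
    start = time.monotonic()
    result = func(*args)
    return result, time.monotonic() - start


# Hypothetical run over an empty input: zero predictions should alert.
predictions, elapsed = timed(lambda rows: [0.5 for _ in rows], [])
alerts = check_run(len(predictions), elapsed, max_duration_s=60.0)
```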
![Page 33: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/33.jpg)
Closing word
ML code is code, and all good practices still apply.
Verify your assumptions against reality. Really.
Design for change and evolution.
![Page 34: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/34.jpg)
We're Hiring!
Offices @ Hamburg, London, San Francisco
www.yelp.com/careers/
![Page 35: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/35.jpg)
Thank you!
![Page 36: your ML code Productionizing seamlessly · Some fun facts Yelpers have written 155 million reviews since 2004. about Yelp We have over 500 developers. We have over 300 services and](https://reader033.vdocuments.us/reader033/viewer/2022052005/60190ba41e3b86114140ed09/html5/thumbnails/36.jpg)
Questions?