Machine learning Go’s and No-Go’s
Adrian Foltyn, External Data Science Expert
IoT / Blockchain / AI Expo, Amsterdam
27 June 2018
Perfectly Portioned Ingredients For 3-5 Meals Per Week
Personalised Fresh Food, Locally Sourced
Easily Managed Via Subscription Platform
1 Box Delivered Weekly To The Door
No Planning
No Shopping
No Waste
HelloFresh breaks the dinner routine by continuously innovating both service and product
Disrupting the supply chain by cutting middlemen, ensuring higher margins and fresher products
HelloFresh global footprint
+ Luxembourg and Northern France 2018
How we use data science / machine learning
DATA SCIENCE @ HF
Fraud detection
Marketing attribution
Lifetime / churn prediction
Recommendation engines
Demand forecasting
minimize cost maximize revenue
Generalized Additive Models
Support Vector Regressions
Random Forests
Extreme Gradient Boosting
Bayesian networks
Collaborative filtering
Deep learning CNNs
ARIMA & other time series models
Hidden Markov Models
Graph databases
Myself: a drift between consulting and data science
▪ Quant methods and computational psychoacoustics
▪ Demand forecasting
▪ Market research & business intelligence
▪ Data Science in strategic consulting
▪ Data Science in-house
Data Science Lead’s decision path
Who is going to do it? In-house <----> vendor
What approach shall we take? Bottom up <----> top down
Where shall we focus? Algorithms <----> features
What result is expected? Quality <----> business-valid output
Why do I need a vendor as CDO / VP / Director / Head of Data Science?
▪ Actual business reasons, bla bla bla…
and...
▪ I have too few people
▪ My people don’t know that sh*t
▪ I don’t believe my team can do better
▪ I’m easily impressed by tech gimmicks
▪ I want a butt-cushion = evidence I’m right
▪ I’d probably need to pay ridiculous money to hire those PhDs…
▪ …..
Big data landscape 2017
Common fallacies -> collaboration with pink glasses on
Common fallacies -> methodology & output
First decision: forecast top-down or bottom-up?
Bottom-up (customer level):
CustomerID | Weeks from activation | Weeks from last pause | Weeks from last meal swap | No. of meal swaps total | No. of boxes in total | Box type | ……. | Probability of getting a box
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.4
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.7
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.5
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.6
Total: 0.55
Top-down (aggregate level):
Sales (boxes)** ~ Outlook of actives + Outlook of pauses + ………..
** Dummy data in all charts
Theorem I ☺: given data availability, nearly all problems in ML can be represented by both a bottom-up and top-down approach
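As a rough illustration of the two representations, here is a minimal Python sketch. The probabilities are the dummy values from the customer-level table; the function names and the weights in the top-down model are hypothetical and only stand in for whatever model would actually be fitted.

```python
# Dummy per-customer probabilities of getting a box (values from the slide's table).
probabilities = [0.4, 0.7, 0.5, 0.6]


def bottom_up_forecast(probs):
    """Aggregate customer-level probabilities into a box forecast.

    Returns (expected number of boxes, mean probability per customer).
    """
    expected_boxes = sum(probs)
    mean_probability = expected_boxes / len(probs)
    return expected_boxes, mean_probability


def top_down_forecast(actives, pauses, w_active, w_pause):
    """Additive aggregate model: sales ~ w_active * actives + w_pause * pauses.

    The weights are hypothetical; in practice they would be estimated
    from historical aggregate data.
    """
    return w_active * actives + w_pause * pauses


expected_boxes, mean_probability = bottom_up_forecast(probabilities)
aggregate_boxes = top_down_forecast(actives=1000, pauses=200, w_active=0.6, w_pause=-0.5)
```

The bottom-up path needs one row per customer, while the top-down path only needs a few aggregate drivers, which is why both can describe the same quantity when the data is available.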
Why do we need a top-down forecasting model?
Customer decision chain (each step is a Y / N branch):
CANCEL? Should I stay (or should I go)?
PAUSE? Shall I take a break?
TRUST DEFAULT MEAL CHOICE? Do I care to see my options?
SWAP MEALS? Do I swap my meals?
• Each decision increases variance of final output
• In a bottom-up model those variances could mitigate each other or could explode…
• Top-down model (aggregate number of boxes) is much more stable
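A small worked example of why bottom-up errors can either cancel out or explode: the sketch below assumes, purely for illustration, that every customer-level forecast error has the same standard deviation `sigma` and the same pairwise correlation `rho`. The function name is hypothetical.

```python
import math


def aggregate_error_std(n_customers, sigma, rho):
    """Std of the summed forecast error for equicorrelated customer-level errors.

    Var(sum) = n * sigma^2 * (1 + (n - 1) * rho)
    """
    if n_customers < 1:
        raise ValueError("n_customers must be at least 1")
    min_rho = -1.0 / (n_customers - 1) if n_customers > 1 else -1.0
    if not (min_rho <= rho <= 1.0):
        raise ValueError("rho is outside the valid range for this many customers")
    variance = n_customers * sigma ** 2 * (1 + (n_customers - 1) * rho)
    return math.sqrt(max(variance, 0.0))


independent = aggregate_error_std(100, 0.1, 0.0)  # errors partly cancel out
correlated = aggregate_error_std(100, 0.1, 1.0)   # errors add up in full
```

With 100 customers the summed error is ten times larger when the errors move together than when they are independent, which is the "explode" case from the bullets above.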
Next step is bottom up: predicting user-level demand with deep learning
CNNs
Factorization / Word2Vec
Marketing attribution -> again: top-down or bottom-up approach?
Bottom-up (customer level):
CustomerID | Touchpoint Paid-Social | Touchpt. Affiliates | Touchpt. Bloggers | Touchpt…. | Likelihood of outdoor exposure | Likelihood of TV exposure | ……. | Number of boxes over first year (CLV)
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 10.5
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 2.5
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 5.3
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 7.6
Total: 9.2
Top-down (aggregate level):
Number of boxes from newly acquired customers ~ Activity in TV** + Activity in Paid Social** + ………..
** Dummy data in all charts
We need to manage the trade-off between devoting resources to feature extraction vs algorithm selection & tuning
• In most situations, a healthy balance is required, tending in the direction indicated by the 4 criteria
• Focus on algorithms does not necessarily entail that we dive straight into deep learning!
4 criteria: Data size; How unstructured is your data; Looking for causal explanations; Budget
Scale for each criterion: Focus on algorithms <----> Focus on features
Some problems just get no good solutions without the right features…
Demand forecasting backtest in country 1
Demand forecasting backtest in country 2
• Neither standard time series nor an average ensemble forecast works
• Best forecast method selected by progressive cross-validation is better (final.forecast)
• Frequent review based on backtesting and root-cause analysis is even better
Chart y-axis (both panels): Model error based on 16-week progressive cross-validation
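Selecting the best forecast method by progressive cross-validation can be sketched as follows. This is a minimal illustration, not the production setup: it assumes one-step-ahead forecasts, an expanding training window, mean absolute error as the metric, and two toy forecasters (`naive_last`, `moving_average`) that are placeholders for real candidate models.

```python
def naive_last(history):
    """Forecast the next value as the last observed value."""
    return history[-1]


def moving_average(history, window=4):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def progressive_cv_error(series, forecaster, n_folds=16):
    """Mean absolute one-step-ahead error over the last n_folds points.

    At each step the forecaster only sees data before the point it predicts.
    """
    if len(series) <= n_folds:
        raise ValueError("series must be longer than n_folds")
    errors = []
    for t in range(len(series) - n_folds, len(series)):
        prediction = forecaster(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


def select_best(series, forecasters, n_folds=16):
    """Return the name of the forecaster with the lowest progressive CV error."""
    scores = {name: progressive_cv_error(series, f, n_folds)
              for name, f in forecasters.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Dummy weekly demand with a steady upward trend.
weekly_boxes = list(range(1, 41))
best_name, cv_scores = select_best(
    weekly_boxes,
    {"naive_last": naive_last, "moving_average": moving_average},
)
```

On this trending dummy series the naive forecaster lags the trend less than the moving average, so it is selected; on real data the ranking can change from week to week, which is why the review is repeated frequently.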
Standard data science process is not linear => requires iterations
Source: Microsoft
• The key is to factor in iterations with business stakeholders as indispensable steps in ALL phases of the project timeline, not only at large milestones
Balancing business insight and simulation / prediction power
• Typically, the statistics used don’t align exactly with the desired business outcomes
• There is usually an inverse relationship between how well the model predicts and how interpretable its components are
• In marketing attribution, forcing intuitive constraints (non-negative contribution of channels, concave shape of response = saturation etc.) often affects fit and predictive strength
• Hitting the sweet spot requires an iterative process of refining the model against business assumptions and usability / actionability
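To make the "intuitive constraints" concrete, here is a minimal sketch of a channel response curve that is non-negative and saturating. The functional form (a simple hyperbolic saturation curve) and the parameter names `max_effect` and `half_saturation` are assumptions chosen for illustration; the actual model may use a different shape.

```python
def saturating_response(spend, max_effect, half_saturation):
    """Non-negative, saturating response of boxes to channel spend.

    The response rises with spend, reaches half of max_effect at
    spend == half_saturation, and flattens out as spend keeps growing.
    """
    if spend < 0 or max_effect < 0 or half_saturation <= 0:
        raise ValueError("spend and max_effect must be >= 0, half_saturation > 0")
    return max_effect * spend / (spend + half_saturation)


# Each additional 100 units of spend buys fewer extra boxes than the previous 100.
first_step = saturating_response(100, 50, 100) - saturating_response(0, 50, 100)
second_step = saturating_response(200, 50, 100) - saturating_response(100, 50, 100)
```

Forcing this shape rules out fits with negative or ever-accelerating channel effects, which is easier to explain to the business but can cost some goodness of fit.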
Example: Simulator for marketing attribution & ROI purposes based on a PCA + Bayesian network + GAM model
• MVP alone required 8-9 iterations…
• …and it’s an ongoing process
Conclusions: Go’s for ML
▪ Vendors are your friends but don’t marry them
▪ Combine bottom-up and top-down approaches
▪ Make informed decisions about balancing resources between algorithm dev / selection and feature engineering
▪ Factor in iterations with business and make it part of model building
▪ Keep calm and always be prepared to explain discrepancies, since…
Predicting / forecasting / simulating is the art of saying what will happen and then explaining why it didn’t…
We’re hiring at HelloFresh!
▪ Data Scientists
‒ Python, R, Spark, Scala, ML + computer vision / NLP / other deep learning experience
▪ Machine Learning Engineers
‒ Python, Hadoop, Spark, Kafka, ML productionizing expertise
▪ Data Engineers
‒ Python, Hadoop, Spark, Kafka, Airflow, ETL experience
https://www.hellofresh.com/careers/
Thanks! Any questions?