Download - Machine learning Go’s and No - IoT Tech Expo World Series · to feature extraction vs algorithm selection & tuning • In most situations, a healthy balance is required, tending

Machine learning Go’s and No-Go’s

Adrian Foltyn, External Data Science Expert

IoT / Blockchain / AI Expo, Amsterdam

27 June 2018

Perfectly Portioned Ingredients For 3-5 Meals Per Week

Personalised Fresh Food, Locally Sourced

Easily Managed Via Subscription Platform

1 Box Delivered Weekly To The Door

No Planning

No Shopping

No Waste

HelloFresh breaks the dinner routine by continuously innovating both service and product

Disrupting the supply chain by cutting middlemen, ensuring higher margins and fresher products


HelloFresh global footprint

+ Luxembourg and Northern France 2018

How we use data science / machine learning

DATA SCIENCE @ HF

Fraud detection

Marketing attribution

Lifetime / churn prediction

Recommendation engines

Demand forecasting

Minimize cost / maximize revenue

Generalized Additive Models

Support Vector Regressions

Random Forests

Extreme Gradient Boosting

Bayesian networks

Collaborative filtering

Deep learning CNNs

ARIMA & other time series models

Hidden Markov Models

Graph databases

Myself: a drift between consulting and data science

▪ Quant methods and computational psychoacoustics

▪ Demand forecasting

▪ Market research & business intelligence

▪ Data Science in strategic consulting

▪ Data Science in-house

Data Science Lead’s decision path

In-house <----> vendor

Bottom up <----> top down

Algorithms <----> features

Quality <----> business-valid output

Who is going to do it?

What approach shall we take?

Where shall we focus?

What result is expected?

Why do I need a vendor as CDO / VP / Director / Head of Data Science?

▪ Actual business reasons, bla bla bla…

and...

▪ I have too few people

▪ My people don’t know that sh*t

▪ I don’t believe my team can do better

▪ I’m easily impressed by tech gimmicks

▪ I want a butt-cushion = evidence I’m right

▪ I’d probably need to pay ridiculous money to hire those PhDs…

▪ …..

Big data landscape 2017

Common fallacies -> collaboration with pink glasses on

Common fallacies -> methodology & output

First decision: forecast top-down or bottom-up ?

Bottom-up (one row per customer):

CustomerID | Weeks from activation | Weeks from last pause | Weeks from last meal swap | No. of meal swaps total | No. of boxes in total | Box type | ……. | Probability of getting a box
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.4
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.7
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.5
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 0.6
Total | | | | | | | | 0.55

Top-down (aggregate level):

Sales (boxes)** ~ Outlook of actives + Outlook of pauses + ………..

** Dummy data in all charts

Theorem I ☺: given data availability, nearly all problems in ML can be represented by both a bottom-up and top-down approach
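The two representations can be put side by side in a minimal sketch. The four probabilities are the dummy values from the customer-level table; the weekly series, the coefficients 0.6 and -0.3 and all variable names are made up for illustration and are not HelloFresh figures or the production model.

```python
import numpy as np

# Bottom-up: one row per customer, the model outputs a probability of getting a box.
# These are the dummy probabilities from the customer-level table.
p_box = np.array([0.4, 0.7, 0.5, 0.6])
bottom_up_rate = p_box.mean()    # average probability per customer (the "Total" row)
bottom_up_boxes = p_box.sum()    # expected number of boxes from these four customers

# Top-down: one row per week, total sales explained by aggregate drivers.
# All numbers below are invented for illustration.
actives = np.array([100.0, 110.0, 120.0, 115.0, 130.0, 125.0])
pauses = np.array([20.0, 25.0, 22.0, 30.0, 28.0, 35.0])
sales = 0.6 * actives - 0.3 * pauses   # noise-free toy relationship

# Sales ~ Outlook of actives + Outlook of pauses, fitted by least squares
X = np.column_stack([actives, pauses])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)

# Forecast for a hypothetical next week with 135 actives and 30 pauses
next_week = np.array([135.0, 30.0]) @ coef
```

Both routes end in a number of boxes: the bottom-up one by summing customer-level probabilities, the top-down one by regressing the aggregate directly.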

Why do we need a top-down forecasting model?

Customer decision chain (each decision branches Y / N):

CANCEL? Should I stay (or should I go)?

PAUSE? Shall I take a break?

TRUST DEFAULT MEAL CHOICE? Do I care to see my options?

SWAP MEALS? Do I swap my meals?

• Each decision increases variance of final output

• In a bottom-up model those variances could mitigate each other or could explode…

• Top-down model (aggregate number of boxes) is much more stable
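A small Monte Carlo sketch of the point about variances, under assumptions that are not in the slides: every decision has a true probability of 0.9, each estimated probability carries Gaussian error, and `relative_forecast_error` is a hypothetical helper. Errors that are independent across customers largely cancel out in the total; an error shared by all customers at a decision does not, and it compounds along the chain.

```python
import numpy as np

def relative_forecast_error(n_customers, n_decisions, noise_sd, shared_error,
                            n_runs=1000, seed=0):
    """Relative std of a bottom-up box forecast built from chained decisions."""
    rng = np.random.default_rng(seed)
    true_p = 0.9  # assumed chance of the "favourable" answer at each decision
    truth = n_customers * true_p ** n_decisions
    # shared_error=True: the same mis-estimate hits all customers at a decision
    n_err = 1 if shared_error else n_customers
    eps = rng.normal(0.0, noise_sd, size=(n_runs, n_err, n_decisions))
    est_p = np.clip(true_p + eps, 0.0, 1.0)
    # per-customer probability of a box = product over the decision chain
    p_box = est_p.prod(axis=2)
    forecast = p_box.mean(axis=1) * n_customers
    return forecast.std() / truth

independent = relative_forecast_error(200, 4, 0.05, shared_error=False)
shared = relative_forecast_error(200, 4, 0.05, shared_error=True)
shared_one_step = relative_forecast_error(200, 1, 0.05, shared_error=True)
```

In this toy setup `shared` comes out well above `independent`, and above `shared_one_step`, which is the "mitigate or explode" behaviour described in the bullets.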

Next step is bottom-up: predicting user-level demand with deep learning

CNNs

Factorization / Word2Vec

Marketing attribution -> again: top-down or bottom-up approach?

Bottom-up (one row per customer):

CustomerID | Touchpoint Paid-Social | Touchpt. Affiliates | Touchpt. Bloggers | Touchpt…. | Likelihood of outdoor exposure | Likelihood of TV exposure | ……. | Number of boxes over first year (CLV)
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 10.5
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 2.5
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 5.3
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | 7.6
Total | | | | | | | | 9.2

Top-down (aggregate level):

Number of boxes from newly acquired customers ~ Activity in TV** + Activity in Paid Social** + ………..

** Dummy data in all charts

We need to manage the trade-off between devoting resources to feature extraction vs algorithm selection & tuning

• In most situations, a healthy balance is required, tending in the direction indicated by the 4 criteria

• Focus on algorithms does not necessarily entail that we dive straight into deep learning!

The 4 criteria: Data size / How unstructured is your data / Looking for causal explanations / Budget

Each criterion is placed on a scale: Focus on algorithms <----> Focus on features

Some problems simply have no good solution without the right features…
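A toy illustration of this point, not taken from the talk: weekly demand with a trend and a sharp drop in the last two weeks of each year. The series, the size of the drop and the `holiday` flag are all invented. A trend-only regression cannot represent the drop however carefully it is tuned, while one extra engineered feature captures it.

```python
import numpy as np

rng = np.random.default_rng(1)
weeks = np.arange(104)
# Hypothetical engineered feature: flag for the last two weeks of each year
holiday = (weeks % 52 >= 50).astype(float)
# Invented demand: trend, holiday drop, noise
demand = 1000 + 2.0 * weeks - 300 * holiday + rng.normal(0, 20, size=weeks.size)

def in_sample_mae(X, y):
    """Fit ordinary least squares and return the mean absolute error of the fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.abs(X @ coef - y).mean())

ones = np.ones(weeks.size)
mae_trend_only = in_sample_mae(np.column_stack([ones, weeks]), demand)
mae_with_flag = in_sample_mae(np.column_stack([ones, weeks, holiday]), demand)
```

Here `mae_with_flag` is clearly lower than `mae_trend_only`, even though both models use the same algorithm.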

Charts: Demand forecasting backtest in country 1 / Demand forecasting backtest in country 2

• Neither standard time series nor an average ensemble forecast works

• The best forecast method selected by progressive cross-validation is better (final.forecast)

• Frequent review based on backtesting and root-cause analysis is even better

(Y-axis in both charts: Model error based on 16-week progressive cross-validation)
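Selecting a forecast method by progressive cross-validation can be sketched as follows. This is a minimal version under stated assumptions: one-step-ahead forecasts, mean absolute error as the score, three simple candidate methods, and an invented seasonal series. The function name `final_forecast` only echoes the chart label and is not the original implementation.

```python
import numpy as np

def naive(history):
    """Forecast = last observed value."""
    return history[-1]

def moving_average(history, window=4):
    """Forecast = mean of the last `window` observations."""
    return float(np.mean(history[-window:]))

def seasonal_naive(history, season=52):
    """Forecast = value observed one season (52 weeks) earlier."""
    return history[-season] if len(history) >= season else history[-1]

def progressive_cv(series, methods, n_folds=16):
    """Mean absolute one-step-ahead error of each method over the last n_folds weeks."""
    errors = {name: [] for name in methods}
    for t in range(len(series) - n_folds, len(series)):
        history, actual = series[:t], series[t]   # only data before week t is used
        for name, method in methods.items():
            errors[name].append(abs(method(history) - actual))
    return {name: float(np.mean(errs)) for name, errs in errors.items()}

def final_forecast(series, methods, n_folds=16):
    """Pick the method with the lowest backtest error and forecast the next week."""
    scores = progressive_cv(series, methods, n_folds)
    best = min(scores, key=scores.get)
    return best, methods[best](series), scores

# Invented weekly series: yearly seasonality plus a little noise
rng = np.random.default_rng(0)
t = np.arange(130)
series = 1000 + 200 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 2, size=t.size)

methods = {"naive": naive, "moving_average": moving_average,
           "seasonal_naive": seasonal_naive}
best, forecast, scores = final_forecast(series, methods)
```

Re-running the selection every time new actuals arrive is the simplest form of the frequent backtest review mentioned in the bullets.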

Standard data science process is not linear => requires iterations

Source: Microsoft

• The key is to factor in iterations with business stakeholders as indispensable steps in ALL phases of the project timeline, not only at large milestones

Balancing business insight and simulation / prediction power

• Typically, the statistics used do not align exactly with the desired business outcomes

• There is usually an inverse relationship between how well the model predicts and how interpretable its components are

• In marketing attribution, forcing intuitive constraints (non-negative contribution of channels, convex shape of response = saturation etc.) often affects fit and predictive strength

• Hitting the sweet spot requires an iterative process of refining the model against business assumptions and usability / actionability
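The cost of intuitive constraints can be seen in a minimal sketch. Everything here is an assumption for illustration: the saturation transform `spend / (spend + half_point)`, the three channels, the toy coefficients (the third deliberately negative) and the projected-gradient solver. It is not the PCA + Bayesian network + GAM model from the talk.

```python
import numpy as np

def saturate(spend, half_point):
    """Diminishing-returns transform: the response flattens as spend grows."""
    return spend / (spend + half_point)

def nnls_projected_gradient(A, b, n_iter=5000):
    """Least squares with all coefficients constrained to be >= 0."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # gradient step on the squared error, then projection onto x >= 0
        x = np.maximum(0.0, x - step * (A.T @ (A @ x - b)))
    return x

rng = np.random.default_rng(0)
spend = rng.uniform(0, 100, size=(104, 3))   # 104 weeks, 3 hypothetical channels
A = saturate(spend, half_point=50.0)
true_coef = np.array([80.0, 40.0, -10.0])    # toy data: 3rd channel has a negative effect
boxes = A @ true_coef + rng.normal(0, 5, size=104)

free_coef, *_ = np.linalg.lstsq(A, boxes, rcond=None)   # unconstrained fit
constrained_coef = nnls_projected_gradient(A, boxes)    # non-negative contributions

sse_free = float(np.sum((A @ free_coef - boxes) ** 2))
sse_constrained = float(np.sum((A @ constrained_coef - boxes) ** 2))
```

The constrained fit can never have a lower squared error than the unconstrained one; the gap between `sse_constrained` and `sse_free` is the price paid for an output the business will accept.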

Example: Simulator for marketing attribution & ROI purposes based on a PCA + Bayesian network + GAM model

• MVP alone required 8-9 iterations…

• …and it’s an ongoing process

Conclusions: Go’s for ML

▪ Vendors are your friends but don’t marry them

▪ Combine bottom-up and top-down approach

▪ Make informed decisions about balancing resources between algorithm dev / selection and feature engineering

▪ Factor in iterations with business and make it part of model building

▪ Keep calm and always be prepared to explain discrepancies, since…

Predicting / forecasting / simulating is the art of saying what will happen and then explaining why it didn’t…

We’re hiring at HelloFresh!

▪ Data Scientists

‒ Python, R, Spark, Scala, ML + computer vision / NLP / other deep learning experience

▪ Machine Learning Engineers

‒ Python, Hadoop, Spark, Kafka, ML productionizing expertise

▪ Data Engineers

‒ Python, Hadoop, Spark, Kafka, Airflow, ETL experience

https://www.hellofresh.com/careers/

Thanks! Any Questions?
