webinar: survival analysis for marketing attribution - july 17, 2013
DESCRIPTION
A central question in advertising is how to measure the effectiveness of different ad campaigns. In online advertising, including social media, it is possible to create thousands of variations on an ad and serve millions of impressions to targeted audiences each day. Too often, digital advertisers use the last-click attribution model to evaluate the success of campaigns. In other words, of all the ads a user sees and clicks, only the very last event is deemed significant. This is convenient but doesn't help in making good marketing decisions. Survival analysis is widely used to model the lifetimes of living organisms and the time to failure of components, and Chandler-Pepelnjak (2010) proposed using it for marketing attribution analysis. Listen to our webinar to learn more about this theory and a big data case study showing how DataSong used Revolution Analytics.
TRANSCRIPT
Revolution Confidential
Survival Analysis for
Marketing Attribution
Webinar
July 2013
Andrie de Vries
Business Services Director – Europe
@RevoAndrie
Revolution Analytics
@RevolutionR
Who am I?
CRAN package ggdendro
StackOverflow
Revolution Analytics Webinar, July 2013 2
From Toledo to Albacete
Rent a car?
… or take a bus?
… but what’s happening here?
Marketing attribution: the Question
How to attribute conversion success to marketing spend?
Where to spend the next marketing dollar?
Agenda
Digital marketing attribution
Using Survival models
At scale, on big data
Two-click conversion journey
Click 1:
Open landing page
Click 2:
Sign up to offer
Typical conversion journey…
Impressions
• Banner ad
• Page post ad
• Sponsored tweet
• Search ad
Click
• Landing page
• Special offer
• Application form
Conversion
• Sign up
• Ask for more detail
…but no two journeys are the same…
[Figure: the impression, click and conversion stages of the typical journey, with separate paths for Person 1, Person 2 and Person 3]
…so how to attribute the value?
Attribution models
• Last click only
• All events weighted evenly
• Rule based
• Statistical modelling
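A minimal sketch of how the first two attribution models listed above split the credit for a conversion. The event log, the campaign names and the value of one unit per conversion are all hypothetical, chosen only to make the difference visible.

```r
# Toy event log: one row per ad interaction, for two converting cookies.
# All names and values are hypothetical.
events <- data.frame(
  id       = c("a", "a", "a", "b", "b"),
  campaign = c("Banner", "Search", "Banner", "Tweet", "Search"),
  time     = c(1, 2, 3, 1, 2),
  stringsAsFactors = FALSE
)
events <- events[order(events$id, events$time), ]
value_per_conversion <- 1

# Last click only: the final event of each journey gets all the credit.
is_last <- !duplicated(events$id, fromLast = TRUE)
last_click <- sapply(split(value_per_conversion * is_last, events$campaign), sum)

# All events even: each event gets 1/n of the credit for its journey.
n_events <- ave(events$time, events$id, FUN = length)
even <- sapply(split(value_per_conversion / n_events, events$campaign), sum)

attribution <- rbind(last_click, even)
print(round(attribution, 3))
```

Both rules distribute the same total value; they differ only in which campaigns receive it. Here the sponsored tweet gets no credit under last click, although it opened one of the two journeys.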
Attribution with statistical modelling
Regression
• In many cases, log data is available only for conversions
• And when non-conversion data is available, these people may still convert in the near future
Survival analysis
• Use time to conversion as the dependent variable
• Can use each interaction (view or click) as an observation
• Can include censored (incomplete) data
• No need to flatten the data
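As an illustration of the points on survival analysis, the sketch below builds a survival response from a small interaction log. The data frame and its column names are hypothetical; `Surv()` is the constructor from the survival package that the later slides use.

```r
library(survival)

# Hypothetical interaction log: one row per ad interaction (view or click).
# `days` is the time from the interaction to conversion, or to the end of the
# observation window for a cookie that has not converted yet (censored).
interactions <- data.frame(
  id        = c(1, 1, 2, 3, 3),
  type      = c("View", "Click", "View", "View", "Click"),
  days      = c(5, 2, 30, 12, 1),
  converted = c(1, 1, 0, 1, 1)
)

# Surv() pairs each time with its event indicator.
# Censored observations are printed with a trailing "+".
y <- with(interactions, Surv(days, converted))
print(y)
```

The unconverted cookie stays in the data as a censored row instead of being dropped or treated as a definite non-conversion, and each interaction remains its own observation.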
Agenda
Digital marketing attribution
Using Survival models
At scale, on big data
Survival models
Kaplan-Meier survivor function
$$\hat{S}_{KM}(t) = \prod_{t_i < t} \frac{r(t_i) - d(t_i)}{r(t_i)}$$
> library(survival)
> Surv(…)
Cox proportional hazards model
Hazard function:
$$\lambda(t; Z_i) = \lambda_0(t)\, r_i(t)$$
Risk score:
$$r_i(t) = e^{\beta Z_i(t)}$$
Likelihood that individual i dies:
$$L_i(\beta) = \frac{r_i(t^*)}{\sum_j Y_j(t^*)\, r_j(t^*)}$$
Partial likelihood:
$$L(\beta) = \prod_i L_i(\beta)$$
> library(survival)
> coxph( Surv(…) ~ …)
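The Kaplan-Meier product can be checked by hand on a small example. The sketch below assumes a hypothetical set of seven conversion times, with r taken as the number still at risk and d as the number of events at each distinct event time, and compares the running product with `survfit()` from the survival package. The values are those of the curve immediately after each event time.

```r
library(survival)

# Hypothetical toy data: time to conversion (days) and event indicator
# (1 = converted, 0 = censored).
time   <- c(2, 3, 3, 5, 8, 8, 9)
status <- c(1, 1, 0, 1, 1, 1, 0)

# Kaplan-Meier by hand: at each distinct event time t_i,
# r = number still at risk, d = number of events.
event_times <- sort(unique(time[status == 1]))
r <- sapply(event_times, function(t) sum(time >= t))
d <- sapply(event_times, function(t) sum(time == t & status == 1))
km_by_hand <- cumprod((r - d) / r)

# The same estimate from the survival package.
fit <- survfit(Surv(time, status) ~ 1)
km_survfit <- summary(fit, times = event_times)$surv

print(data.frame(event_times, r, d, km_by_hand, km_survfit))
```

The censored observations at days 3 and 9 never count as events, but they do stay in the risk set r until they drop out.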
What is death?
Medicine: actual death of patient
Engineering: failure of component
For attribution: cookie conversion
Worked example
Attribution of digital media for telecoms client
Read the data
> library(data.table)
> rdsFile <- "survival_data.rds"
> xd <- readRDS(rdsFile)
> class(xd)
[1] "data.table" "data.frame"
> nrow(xd)
[1] 775782
> ncol(xd)
[1] 31
What does the data look like?
> xd[1:25, 1:6, with=FALSE]
id Conversion.Time Event.Number Event.Time Event.Type Campaign
1: 10101:49721794 01/10/2012 00:05 1 01/10/2012 00:02 Click Free Sims
2: 10101:49721801 01/10/2012 00:05 1 29/09/2012 16:25 View BAU High Media
6: 10101:49721854 01/10/2012 00:07 3 17/09/2012 18:32 View BAU High Media
7: 10101:49721854 01/10/2012 00:07 4 17/09/2012 19:13 View BAU High Media
8: 10101:49721854 01/10/2012 00:07 5 17/09/2012 19:17 View BAU High Media
9: 10101:49721854 01/10/2012 00:07 6 17/09/2012 19:20 View BAU High Media
10: 10101:49721854 01/10/2012 00:07 7 17/09/2012 19:21 View BAU High Media
11: 10101:49721854 01/10/2012 00:07 8 17/09/2012 19:47 View BAU High Media
12: 10101:49721854 01/10/2012 00:07 9 17/09/2012 19:49 View BAU High Media
13: 10101:49721854 01/10/2012 00:07 10 17/09/2012 19:53 View BAU High Media
14: 10101:49721854 01/10/2012 00:07 11 17/09/2012 20:04 View BAU High Media
15: 10101:49721854 01/10/2012 00:07 12 18/09/2012 10:02 View BAU High Media
16: 10101:49721854 01/10/2012 00:07 13 18/09/2012 10:03 View BAU High Media
17: 10101:49721854 01/10/2012 00:07 14 18/09/2012 10:03 View BAU High Media
18: 10101:49721854 01/10/2012 00:07 15 18/09/2012 20:06 View BAU High Media
19: 10101:49721854 01/10/2012 00:07 16 18/09/2012 20:10 View BAU High Media
20: 10101:49721854 01/10/2012 00:07 17 19/09/2012 18:14 View BAU High Media
21: 10101:49721854 01/10/2012 00:07 18 19/09/2012 20:23 View BAU High Media
22: 10101:49721854 01/10/2012 00:07 19 20/09/2012 20:22 View BAU High Media
23: 10101:49721854 01/10/2012 00:07 20 22/09/2012 14:57 View BAU High Media
24: 10101:49721854 01/10/2012 00:07 21 22/09/2012 22:18 View BAU High Media
25: 10101:49721854 01/10/2012 00:07 22 23/09/2012 21:06 View BAU High Media
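The model fitted later uses a duration column called `times`, which is not among the six columns shown. The slides do not say how it was derived, so the following is only one plausible sketch: it assumes the duration is the gap between each event and the conversion, measured in days, and uses two pairs of timestamps copied from the extract above.

```r
# Two timestamp pairs copied from the extract above.
journey <- data.frame(
  Conversion.Time = c("01/10/2012 00:05", "01/10/2012 00:07"),
  Event.Time      = c("01/10/2012 00:02", "17/09/2012 18:32"),
  stringsAsFactors = FALSE
)

# Timestamps are day/month/year; parse in UTC to avoid daylight-saving shifts.
fmt  <- "%d/%m/%Y %H:%M"
conv <- as.POSIXct(journey$Conversion.Time, format = fmt, tz = "UTC")
evt  <- as.POSIXct(journey$Event.Time, format = fmt, tz = "UTC")

# Assumed definition (not confirmed by the slides):
# survival time = days from the event to the conversion.
journey$times <- as.numeric(difftime(conv, evt, units = "days"))
print(journey)
```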
Histogram of cookie lifetime
[Figure: histogram titled "Impressions and clicks in customer journey". x-axis: cookie duration (days), 0 to 30; y-axis: events (impressions and clicks)]
Fitting the model
> library(survival)
> fitp <- coxph(
Surv(times, event=Converted) ~ Type +
Event.Type +
Supplier +
PrevClicks +
AdFormat,
data=xd)
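The client data is not available, so the sketch below fits the same kind of model to simulated data and extracts the exponentiated coefficients. The data, the effect sizes and the 30-day censoring window are assumptions made for illustration; only the column names `Event.Type`, `PrevClicks`, `times` and `Converted` are borrowed from the call above.

```r
library(survival)
set.seed(42)

# Simulated stand-in for the event log (values and effects are hypothetical).
n <- 500
sim <- data.frame(
  Event.Type = factor(sample(c("View", "Click"), n, replace = TRUE),
                      levels = c("View", "Click")),
  PrevClicks = rpois(n, 1)
)

# Assumed true hazard: clicks convert 3x faster than views,
# and each previous click raises the hazard by 30%.
rate   <- 0.05 * ifelse(sim$Event.Type == "Click", 3, 1) * 1.3^sim$PrevClicks
t_conv <- rexp(n, rate)
sim$Converted <- as.integer(t_conv <= 30)  # censor at a 30-day window
sim$times     <- pmin(t_conv, 30)

fit <- coxph(Surv(times, Converted) ~ Event.Type + PrevClicks, data = sim)

# Exponentiated coefficients are hazard ratios relative to the baseline level.
hr <- exp(coef(fit))
print(round(cbind(hr, exp(confint(fit))), 2))
```

A hazard ratio above 1 means the conversion hazard is higher than for the reference level (here, a view), and a ratio below 1 means it is lower.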
What does the data say?
[Figure: exponentiated coefficients of the fitted model (y-axis, 0 to 2), in panels AdFormat, Event.Type, PrevClicks, Supplier and Type. Levels shown: Super Sky - 160x600, Leaderboard - 728x90, MPU - 300x250; View, Click; PrevClicks; MexAd, Specific Media, ValueClick, Ebay, AOL Network, Gumtree, Display Network, Facebook API; Pay monthly SIM, Contract Phone, SIM only]
• Advertise the right product!
• Some suppliers are better at generating conversion
• But note the data wasn’t an unbiased experiment!
Estimated hazard function
> library(ggplot2)
> x <- survfit(fitp)
> xx <- with(x, data.frame(time, surv, upper, lower))
> ggplot(xx, aes(time, surv)) + geom_step() …
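For a Cox fit, `survfit()` returns the estimated survival curve with its confidence band; the cumulative hazard follows from it as H(t) = -log S(t). The sketch below shows both on simulated data and uses base graphics instead of ggplot2 to stay self-contained. The data and the single covariate `x` are hypothetical.

```r
library(survival)
set.seed(1)

# Hypothetical data: exponential conversion times censored at 30 days.
n <- 200
x <- rbinom(n, 1, 0.5)
t_conv <- rexp(n, 0.05 * exp(0.7 * x))
d <- data.frame(times     = pmin(t_conv, 30),
                Converted = as.integer(t_conv <= 30),
                x         = x)

fit <- coxph(Surv(times, Converted) ~ x, data = d)
sf  <- survfit(fit)  # curve evaluated at the mean of the covariates

curve <- data.frame(time = sf$time, surv = sf$surv,
                    lower = sf$lower, upper = sf$upper)
curve$cumhaz <- -log(curve$surv)  # cumulative hazard H(t) = -log S(t)

# Step plot of the survival curve with its confidence band.
plot(curve$time, curve$surv, type = "s", ylim = c(0, 1),
     xlab = "Days since event", ylab = "Estimated survival")
lines(curve$time, curve$lower, type = "s", lty = 2)
lines(curve$time, curve$upper, type = "s", lty = 2)
```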
Agenda
Digital marketing attribution
Using Survival models
At scale, on big data
Where Revolution helps
Import
• Text formats
• SAS
• High-speed database
• Hadoop
Pre-process
• DataStep
• Clean
• Refactor
• Sort
• Merge
Analyse
• Cube
• Summarise
• Parallelise (rxExec)
Model
• Regression
• GLM
• Tweedie
• Clustering
• Decision trees
Score
• Predict
Deploy
• Web services
Revolution R Enterprise
Parallel external memory algorithms (PEMAs)
Case study: DataSong
Profile:
Multi-channel marketing analytics
Software developer and service provider
Growing, innovative, cost-conscious
Technology:
Modeling the Baseline Hazard
Capture nonlinear trends in baseline, while overlaying marketing treatment variables as well as other customer attributes
Revolution R package used:
• RevoScaleR
Revolution R functions used:
• rxImport()
• rxSummary()
• rxCube()
• rxLogit()
• rxPredict()
• rxRoc()
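The slides list `rxLogit()` but do not show how the DataSong model was structured. One common way to model a baseline hazard with logistic regression is a discrete-time survival model on person-period data, sketched below with open-source `glm()` standing in for `rxLogit()`. The weekly periods, the `email` treatment flag, the spline basis and all effect sizes are assumptions made for illustration.

```r
library(splines)
set.seed(7)

# Hypothetical customers with one marketing treatment flag.
n <- 400
cust <- data.frame(id = seq_len(n), email = rbinom(n, 1, 0.5))

# Person-period format: one row per customer per week (cross join).
pp <- merge(cust, data.frame(week = 1:12))

# Assumed true weekly hazard: nonlinear in week, raised by the email treatment.
pp$p     <- plogis(-3 + 0.8 * sin(pp$week / 2) + 0.8 * pp$email)
pp$event <- rbinom(nrow(pp), 1, pp$p)

# Keep each customer's rows up to and including the first conversion.
pp    <- pp[order(pp$id, pp$week), ]
prior <- ave(pp$event, pp$id, FUN = function(e) cumsum(e) - e)
pp    <- pp[prior == 0, ]

# Logistic regression on person-period rows is a discrete-time hazard model;
# the spline in `week` acts as a flexible baseline hazard.
fit <- glm(event ~ ns(week, df = 3) + email, family = binomial, data = pp)
print(round(coef(summary(fit)), 3))
```

In this layout the spline terms capture the nonlinear trend in the baseline, and the `email` coefficient is the treatment effect overlaid on it, on the log-odds scale.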
Transformations
[Figure panels: Catalog, Email]
Outcome
Massively scalable infrastructure
Attribution and optimization at individual customer level for clients such as Williams-Sonoma
Client saved $250K in one campaign
Rapid deployment of customer-specific models
Innovative techniques, e.g. GAM Survival models
Performance improvement
Experienced 4x performance improvement on 50 million records
www.revolutionanalytics.com Twitter: @RevolutionR
The leading commercial provider of software and support for the popular open source R statistics language.