Doing Customer Intelligence with R
By Jim Porzak, Director of Analytics, Loyalty Matrix, Inc.
October 19, 2004
Presented at the SDForum Business Intelligence SIG, Palo Alto, CA
Loyalty Matrix, Inc., 580 Market St., Suite 600, San Francisco, CA 94104, (415) 296-1141
www.LoyaltyMatrix.com, R.LoyaltyMatrix.com




• Introduction to Customer Intelligence at Loyalty Matrix
• Introduction to R
• R for Exploratory Data Analysis (EDA)
• R for Statistics & Data Mining
• Summary and Q&A


What is Customer Intelligence (CI)?

• Twenty years ago a mentor in the Valley told me: “Jim, take care of your customers. They will take care of you.”
  – The fundamentals have not changed
  – But the data, tools and techniques have become much richer
• Definition: “Doing Customer Intelligence is analyzing customer behavioral and motivational data to build actionable business insights that can change marketing strategy and tactics to better align the organization with its customers’ needs and expectations.”
• Customer Intelligence is multifaceted
  – It is quantitative: insights must be based on facts
  – It is perceptive: insights are business interpretation of the numbers
  – It is actionable: insights must be usable for the business
  – It is iterative: application generates more data & more insights


Loyalty Matrix, Inc.

• A privately held marketing services company based on technology
• A combination of seasoned marketers, techies and proprietary technology providing our clients with unparalleled customer intelligence solutions focused on:
  – Customer Acquisition
  – Customer Retention
  – Product Cross-sell and Up-sell
• Founded in 2001
• A San Francisco based firm with offices in Dallas, Chicago, and (soon) New York
• Key Goal: Transform Customer Data into Actionable Insights!


Customer Intelligence (CI) in Action

• Case study: Apple .Mac Subscribers
  – Business Challenge: Maximize renewal rates
  – CI Insight: Early usage drives loyalty
  – Business Impact: Shift marketing effort to drive new subscriber usage (depth then breadth)
• Case study: St. Regis Hotel Guests
  – Business Challenge: Reduce high customer attrition
  – CI Insight: Decline in loyal guests masked by boom-economy single-visit guests
  – Business Impact: Focus on regaining former loyal guests


The MatrixOptimizer®: Data → Insights → Action

[Diagram: the data that describes the relationship between Customer and Company (transactional, research, survey, 3rd party and web site usage data) feeds the MatrixOptimizer®. The results cover pricing, product, distribution, data strategies & collection, communication, and loyalty program design & evaluation.]


The CI Framework at Loyalty Matrix


Inside the CI solution, the MatrixOptimizer®

• Data Mart in Microsoft SQL Server 2000
  – Holds “ready to report” data in a dimensional schema
  – Major effort to load client data, run quality checks, and cleanse
  – Client data staged in SQL
• OLAP cubes built with Microsoft Analysis Services
  – Built off of the Data Mart
  – Each cube focuses on a single set of business problems
• OLAP presentation layer built with eBlocks
  – Web access to standard reports
  – Allows slice and dice
• Version 3.0 released October, 2004


How can R help build Customer Intelligence?

Challenges with Classical CI

• Aggregations limited to counts, sums & means
• Nature of customer data
  – Events with related # and/or $ values
  – #’s, $’s & intervals all have highly right-skewed distributions
  – Data quality often suspect, especially in dimensions (factors)
• No rigorous tests
• No advanced methods

R to the Rescue!

• Visualization
  – Take first look at raw data
  – EDA on “clean” data
• Classical stats
  – Differences really significant?
  – Efficacy of marketing efforts?
• Prediction & Modeling
  – Classification
  – Meaningful customer attributes



Evolution of R from S

• R is the free (GNU), open source version of S
  – S was developed by John Chambers and colleagues while at Bell Labs in the ’80s
  – For “data analysis and graphics”
  – Version 4 defined by the “Green Book”, Programming with Data, 1998
  – S-Plus now owned and developed by Insightful Corp., Seattle, WA
• R was initially written by Robert Gentleman and Ross Ihaka
  – In the early 1990s
  – Statistics Department of the University of Auckland
• GNU GPL release in 1995
• Since 1997 a core group of 17 developers has had write access to the source
• V1.0 released in February, 2000
• New 0.1-level release about every 6 months


Current state of R

• V2.0 released October, 2004
• Windows, Mac OS, Linux & Unix ports
• Over 400 submitted packages from “abind” to “zoo”
• 12th newsletter (Volume 4/2) published September 2004
• The first useR! R User Conference held in Vienna, May 2004
• ~400 R-help newsgroup messages per week
• About a dozen texts specifically on R or with R examples and code
• R language generally accepted to be more powerful than S-Plus
• Some interesting GUI work in progress


R Resources

• R Homepage: http://www.r-project.org/
  – The official site of R
• R Foundation: http://www.r-project.org/foundation/
  – Central reference point for R development community
  – Holds copyright of R software and documentation
  – Support it!
• Local CRAN:
  – Mirror site
  – Current Binaries
  – Current Documentation
  – Link to related projects and sites
• R.LoyaltyMatrix.com blog



Visualization is Key to EDA

Getting information from a table is like extracting sunlight from a cucumber. – Farquhar & Farquhar, 1891

If I can’t picture it, I can’t understand it. – Albert Einstein

You can see a lot, just by looking. – Yogi Berra

Thanks to Michael Friendly of VCD fame.


Our Favorite Visualization Methods

• For first look and exploration
  – Frank Harrell’s datadensity for quick look, quality checks, …
  – Scatter Plots for patterns, outliers, …
  – Box Plots for median, range, outliers
• To understand customer behavior
  – Interval Histograms for time between visits, purchases, …
  – Distance Histograms for customer to store travel
  – Geographical Maps for customer to store travel
• Exploration for correlations and associations
  – Scatterplot Matrices
  – Mosaic Plots for categorical associations


Basic Exploratory Data Analysis (EDA)


Everything We Know About the Fact Table – Example 1


Everything We Know About the Fact Table – Example 2


Scatter Plot Shows Correlation & Outlier Suspects

[Figure: scatter plot of fees against aum, 2002 only (aum axis 0 to 7e+07, fees axis 0 to 200000). Labeled outlier suspects include A.G. Edwards & Sons Inc., RBC Dain Rauscher Inc., Merrill Lynch Pierce Fenner & Sm, Salomon Smith Barney, and UBS Financial Services, Inc.]


Box Plots for Skewed Data like $’s & #’s


Boxplots for Strongly Skewed $ Fees by #’s of Accounts

[Figure: box plots, “Fees by Number of New Account Opens in 2002”. X-axis: # New Accounts Opened in 2002 (0 to 20). Y-axis: Fees (0 to 200000).]
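A minimal sketch of this kind of plot, using simulated data rather than the client data behind the slide. The names `n_accounts` and `fees` and the lognormal parameters are assumptions for illustration only. It shows why the median and a log scale are more informative than the mean when dollar amounts are strongly right skewed.

```r
# Simulated, hypothetical data: fees grow with the number of new accounts
set.seed(2002)
n_accounts <- sample(0:5, 500, replace = TRUE)
fees <- rlnorm(500, meanlog = 6 + 0.3 * n_accounts, sdlog = 1)

# For right-skewed data the mean sits well above the median
skew_summary <- c(mean = mean(fees), median = median(fees))
print(skew_summary)

# A log-scaled y axis keeps the boxes readable despite the long right tail
boxplot(fees ~ n_accounts, log = "y",
        xlab = "# New Accounts Opened",
        ylab = "Fees (log scale)",
        main = "Simulated fees by number of new accounts")
```

Dropping `log = "y"` from the call shows how a few large values compress every box toward the bottom of the plot.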


Boxplots for Restaurant Group $ Sales over Time

[Figure: weekly box plots, “Total $'s of Presented Tickets by Week (Full Term Restaurants Only)”. X-axis: Week (15Jun02 to 03Jan04). Y-axis: $ Amount (0 to 50000).]


Customer Purchase Intervals


Visit Interval Histogram for Restaurant/Bar/Arcade

[Figure: histogram, “Days between 1st and Last use of a Credit Card for PowerCard Transactions”. X-axis: DeltaDays (0 to 140). Y-axis: Frequency (0 to 3000).]


Purchase Interval for Vehicles Shows Key Customer Types

[Figure: “Time from Prior Purchase Segments by Months from Prior Purchase for GM Repeat Households”, N = 95793. X-axis: Months from Prior Purchase (0 to 150). Y-axis: Probability of Purchase (0.000 to 0.035). Annotated segments: Buy-A-Pair, Keep as Long as Reliable, 3-Year Cycle, 50% of Purchases.]
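One way to derive purchase intervals like these in R, sketched under assumptions: the data frame `purchases` and its columns `cust` and `date` are hypothetical stand-ins for a transaction table, and the values are made up.

```r
# Hypothetical transaction table: one row per purchase
purchases <- data.frame(
  cust = c("A", "A", "A", "B", "B", "C"),
  date = as.Date(c("2002-01-10", "2002-03-11", "2002-03-25",
                   "2002-02-01", "2002-08-01", "2002-05-05"))
)

# Sort by customer and date, then take day differences within each customer
purchases <- purchases[order(purchases$cust, purchases$date), ]
gaps_by_cust <- lapply(split(as.numeric(purchases$date), purchases$cust), diff)

# Customers with a single purchase contribute no interval
gaps <- unlist(gaps_by_cust, use.names = FALSE)
print(gaps)

hist(gaps, xlab = "Days between purchases",
     main = "Purchase intervals (made-up data)")
```

On real data the histogram, or a density estimate of the same intervals, is where distinct customer types would show up as separate peaks.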


Distance Traveled and Geography


Distance Traveled by Customer to Restaurant

[Figure: four histograms of # Visits by Miles Traveled for iDine member diners, July '02 - September '03.
SCHMIDT'S RESTAURANT UND SAUSA (52279): all diners (0 to 3000 miles) and local diners (first 50 miles); 1777 visits up to 50 miles.
STRADA WORLD CUISINE (27327): all diners (0 to 3000 miles) and local diners (first 50 miles); 1566 visits up to 50 miles.]
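The slides do not say how miles traveled were computed. A common approach, shown here only as an assumption, is the great-circle (haversine) distance between the customer's home ZIP centroid and the restaurant. The function name `haversine_miles` and all coordinates below are hypothetical.

```r
# Great-circle distance in miles between points given in decimal degrees
haversine_miles <- function(lat1, lon1, lat2, lon2, radius = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Made-up restaurant location and customer home ZIP centroids
rest_lat <- 39.95
rest_lon <- -83.00
home_lat <- c(39.96, 40.10, 41.50, 34.05)
home_lon <- c(-82.99, -83.10, -81.70, -118.24)

miles <- haversine_miles(home_lat, home_lon, rest_lat, rest_lon)
print(round(miles, 1))

# Split into "all diners" and "local diners" views as on the slide
local_miles <- miles[miles <= 50]
hist(local_miles, xlab = "Miles Traveled (First 50)", main = "Local diners")
```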


Map Home ZIP’s for Customers Dining at Restaurants

[Figure: two maps of customer home ZIP codes, each with a 0 to 60 mi scale bar: “SCHMIDT'S RESTAURANT UND SAUSA (52279) Visits in Last 18 Months” and “STRADA WORLD CUISINE (27327) Visits in Last 18 Months”.]


Geographic Density of Outlets Relative to Population


Geographic Density with Ethnic Targeted Coverage


Understanding Categorical Variables


Mosaic Plots Show Automotive Brand Loyalty

[Figure: mosaic plot, “Last Division by Prior Division Same Consumer (N = 5653)”. Axes: Division Last Pch by Division Prior Pch, with abbreviated division labels (Chevy, P, B, C, O, GMC, ?, S, G). Shading by standardized residuals: <-4, -4:-2, -2:0, 0:2, 2:4, >4.]
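A mosaic plot with residual shading can be drawn with base R's `mosaicplot()`. The sketch below uses simulated purchases, not the data behind the slide: the division names and the 60% repeat rate are assumptions.

```r
# Simulate prior and last purchase division with a built-in loyalty effect
set.seed(42)
divisions <- c("DivA", "DivB", "DivC")
prior <- sample(divisions, 600, replace = TRUE)
stay <- runif(600) < 0.6                       # assumed 60% repeat rate
last <- ifelse(stay, prior, sample(divisions, 600, replace = TRUE))

tab <- table(Prior = prior, Last = last)
print(tab)

# shade = TRUE colors cells by residual size, with cut points at 2 and 4
mosaicplot(tab, shade = TRUE,
           main = "Last division by prior division (simulated)")

# The same signal numerically: large positive residuals on the diagonal
std_res <- chisq.test(tab)$stdres
print(round(std_res, 1))
```

Loyal buyers appear as over-represented diagonal cells, and switchers as under-represented off-diagonal cells.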


Mosaic Pairs Show Categorical Responses in Survey

[Figure: matrix of pairwise mosaic plots, “BR Survey - Categorical Answers”. Variables and level labels as extracted: Mode (Mail, Web), Gender (F, M, 1), Age (1, 2, 3, 4+, *), Last.Visit (W, M, 3, 6, 6+, *), Quality (B, A, MP, *).]


Mosaic Shows Mailed vs. Control Purchases are Alike

[Figure: mosaic plot of MorC (C, M) by Class (CLASS A, CLASS C, FIFTH WHEEL, FOLDING TRAILER, TRAVEL TRAILER, TRUCK CAMPER). Shading by standardized residuals: <-4, -4:-2, -2:0, 0:2, 2:4, >4.]


Customer Segment Migration


Customer Segment Migration



Classical Statistics Add Rigor

• Background
  – Business analysts mostly use Excel
  – Great for exploration and building presentations
  – Tests of significance are at best an afterthought
• Example: Test Direct Mail Campaign Effectiveness
  – Offer: Test drive “RV” at your local dealer. Get a goodie.
    – ~$100,000 is not an impulse purchase!
    – Low conversion rates (proportions)
  – Control group is “hold out” from each list.
  – Are there incremental sales? How many? $ value?


fisher.test Measures if Mail Campaign Was Effective

> PA.Group <- c(64443, 12060)  # Size of Mailed, Control groups
> PA.Sales <- c(139, 15)       # Sales to Mailed, Control groups
> fisher.test(matrix(c(PA.Sales, PA.Group - PA.Sales), 2),
+             alternative = "greater")

        Fisher's Exact Test for Count Data

data:  matrix(c(PA.Sales, PA.Group - PA.Sales), 2)
p-value = 0.02122
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 1.095079      Inf
sample estimates:
odds ratio
  1.735754

We will be wrapping the Fisher test with an easy-to-use interface and result visualization as a Campaign Effectiveness module.
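As a rough illustration of what such a wrapper might look like (this is not the Loyalty Matrix module; the function name `campaign_effect` and its return fields are hypothetical), the sketch below packages the same `fisher.test` call with the conversion rates and an estimate of incremental sales.

```r
# Hypothetical wrapper around fisher.test for a mailed vs. control comparison
campaign_effect <- function(n_mailed, n_control, sales_mailed, sales_control) {
  counts <- matrix(
    c(sales_mailed, sales_control,
      n_mailed - sales_mailed, n_control - sales_control),
    nrow = 2,
    dimnames = list(Group = c("Mailed", "Control"),
                    Outcome = c("Sale", "NoSale"))
  )
  test <- fisher.test(counts, alternative = "greater")
  rate_mailed <- sales_mailed / n_mailed
  rate_control <- sales_control / n_control
  list(
    rate_mailed = rate_mailed,
    rate_control = rate_control,
    # Sales beyond what the control conversion rate would have produced
    incremental_sales = (rate_mailed - rate_control) * n_mailed,
    p_value = test$p.value,
    odds_ratio = unname(test$estimate)
  )
}

# Same counts as the slide
res <- campaign_effect(64443, 12060, 139, 15)
str(res)
```

With the slide's counts, the control rate applied to the mailed group implies about 80 sales, so roughly 59 of the 139 mailed-group sales would be incremental under this simple estimate.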


All Mail Lists for a Campaign


Marketing Campaign Visualization


Subscription Survival Analysis

• Probability of surviving a certain number of days?
• What is the half-life of a typical subscriber?
• Hazard Ratio = (# Expired) / (# at risk) at each time interval
  – $H(t) = N_e(t) / N_r(t)$
• Probability of Survival = product of (1 – H(t)) up to time of interest
  – $P_s(t) = \prod_{i=1}^{t} \left(1 - H(i)\right)$
• Need to know
  – Number of days in subscription
  – Expired or censored?
• See
  – Gordon Linoff’s article in Intelligent Enterprise, August 2004
  – Tableman & Kim, Survival Analysis Using S, Chapman & Hall, 2004
  – R library: survival and more…
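The two formulas can be checked by hand on a toy data set. The ten subscriptions below are made up for illustration. `days` is the number of days each subscription lasted, and `expired` is FALSE for censored (still active) subscriptions.

```r
# Made-up data: subscription length in days and whether it expired
days    <- c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4)
expired <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

t_grid <- 1:max(days)

# N_e(t): expirations on day t; N_r(t): subscriptions still at risk on day t
n_expired <- sapply(t_grid, function(t) sum(days == t & expired))
n_at_risk <- sapply(t_grid, function(t) sum(days >= t))

hazard   <- n_expired / n_at_risk   # H(t)
survival <- cumprod(1 - hazard)     # P_s(t)

print(data.frame(day = t_grid, n_expired, n_at_risk, hazard, survival))
```

Censored subscriptions leave the at-risk count without counting as expirations, which is why the expired-or-censored flag is needed. The `survival` package's `survfit(Surv(days, expired) ~ 1)` should give the same product-limit estimate on real data.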


Subscription Survival and Hazard

[Figure: “Monthly Subscription”. X-axis: Days into Subscription (0 to 800). Left y-axis: Probability of Survival (0.0 to 1.0). Right y-axis: Hazard Ratio (0.00 to 0.06).]


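Some R Code – Read data set & plot survival curve

library(RODBC)

# Connect and read the subscription data; DSN, uid and pwd are placeholders
channel <- odbcConnect("MyDataSet", uid = "me", pwd = "xx")
sSQL <- paste(readLines("qCustomerSubscriptions4R.sql", n = -1), collapse = "\n")
PBSub <- sqlQuery(channel, sSQL, na.strings = c("", " "))
odbcClose(channel)

# NumDaysInSub and SubStat are assumed to be columns of the query result
NumDaysInSub <- PBSub$NumDaysInSub
SubStat <- PBSub$SubStat

nDays <- 900
Days <- factor(NumDaysInSub, levels = 1:nDays)    # one table cell per day
NumSubs <- length(SubStat)
NumSuccumbed <- table(Days[SubStat == "Expire"])  # expirations on each day
# At risk on a day = all subscriptions minus those that ended on an earlier day
NumAtRisk <- NumSubs - cumsum(table(Days)) + table(Days)
HazardProb <- NumSuccumbed / NumAtRisk
SurvivalProb <- cumprod(1 - as.vector(HazardProb))

plot(SurvivalProb, type = "s", ylim = c(0, 1),
     xlab = "Days into Subscription",
     ylab = "", main = "Subscription")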


Survival by Product Type for Monthly Subscriptions

[Figure: “Monthly Subscription” survival curves by product. X-axis: Days into Subscription (0 to 800). Y-axis: Probability of Survival (0.0 to 1.0). Legend: Playboy Cyber Club (N = 80905, $k = 6187); PlayboyNet (N = 16339, $k = 976); PlayboyTV Jukebox (N = 829, $k = 156); SpiceNet (N = 7495, $k = 322); SpiceTV Club (N = 1822, $k = 186).]


Survival by Product Type for Yearly Subscriptions

[Figure: “Yearly Subscription” survival curves by product. X-axis: Days into Subscription (0 to 800). Y-axis: Probability of Survival (0.0 to 1.0). Legend: Playboy Cyber Club (N = 109943, $k = 8888); PlayboyNet (N = 8910, $k = 619); PlayboyTV Jukebox (N = 481, $k = 64); SpiceNet (N = 1785, $k = 102); SpiceTV Club (N = 937, $k = 158).]


Advanced Methods Add CI Insight

• Background
  – High-end “data mining” usually done with SAS, SPSS, …
  – Not generally available to front line business analysts
• Example: Customers purchasing different types of vehicles
  – Two distinct product groups
    – High end RV’s
    – Entry level towables
  – How do customer profiles differ?
    – Continuous & categorical data
    – Sales data + 3rd party demographic matching
  – What can we learn for product design? Marketing?


randomForest Differentiates Between Customer Groups

[Figure: variable importance plot, MeanDecreaseAccuracy (0.0 to 0.7). Variables in extraction order: UnitSize, OwnMCycle, IPhoto, AIntel, IBoatSale, AmLkML, OwnRV, ISail, IHuntShoot, FamPos, IBike, Gender, DwelType, Region, MSA.Type, IFish, HHComp, AEnviron, LOR, ATradit, PNE.LSG, SaleMonth, PNE.SG, EstIncome, HHCldrn, CMSA, HouseVal, MBGrp, MBSeg, AgeRange.]

> FWE.rf

Call:
 randomForest.default(x = FWEi, y = Type, mtry = 6, importance = TRUE,
     proximity = TRUE, outscale = TRUE)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 6

        OOB estimate of error rate: 28.95%
Confusion matrix:
           LSV Analog TT Entry class.err
LSV Analog        622      222   0.26303
TT Entry          295      647   0.31316

randomForest package by Andy Liaw and Matthew Wiener, based on original Fortran code by Leo Breiman and Adele Cutler.
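For readers who want to try a similar fit, here is a small sketch on simulated customers. The column values, the rule that generates `Type`, and the group names are all invented for illustration. The modelling step is skipped if the randomForest package is not installed.

```r
# Simulated, hypothetical customer profile data
set.seed(1)
n <- 400
customers <- data.frame(
  AgeRange  = factor(sample(c("25-34", "35-44", "45-54", "55+"), n, replace = TRUE)),
  EstIncome = round(rlnorm(n, meanlog = 11, sdlog = 0.4)),
  OwnRV     = factor(sample(c("Y", "N"), n, replace = TRUE))
)

# Invented rule: older, higher-income customers lean toward the high-end group
score <- (customers$AgeRange == "55+") + (customers$EstIncome > 60000) +
  rnorm(n, sd = 0.7)
customers$Type <- factor(ifelse(score > 1, "HighEnd", "EntryLevel"))

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(Type ~ ., data = customers,
                                   ntree = 500, importance = TRUE,
                                   proximity = TRUE)
  print(rf)                            # OOB error and confusion matrix
  print(randomForest::importance(rf))  # per-class and overall importance
  randomForest::varImpPlot(rf)         # importance dot chart
}
```

In this simulation `OwnRV` plays no role in the outcome, so it should rank lowest in importance, which is a quick sanity check on the plot.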


randomForest Shows Variable Importance by Group

[Figure: two variable importance panels (Importance, 0.0 to 0.8), variables in extraction order.
LSV Analog: OwnComp, IPhysFit, AIntel, UnitSize, OwnCell, AmLkML, IPhoto, OwnMCycle, MSA.Type, LOR, IMCycle, FamPos, IBike, PNE.LSG, IFish, Gender, OwnRV, IHuntShoot, HHComp, EstIncome, Region, PNE.SG, SaleMonth, DwelType, CMSA, HouseVal, HHCldrn, MBSeg, MBGrp, AgeRange.
TT Entry: DwelType, IPhoto, UnitSize, OwnComp, Region, AmLkML, IPwrBoat, HHIM, AIntel, Gender, FamPos, IBike, IBoatSale, HHComp, ISail, IFish, MSA.Type, HHCldrn, PNE.SG, SaleMonth, PNE.LSG, LOR, EstIncome, ATradit, AEnviron, CMSA, MBGrp, MBSeg, HouseVal, AgeRange.]


Multi-dimensional Scaling Plot of rF Proximities

[Figure: scatterplot matrix of MDS dimensions Dim 1 through Dim 5, computed from the random forest proximities.]


Association Analysis

• AKA “Shopping Basket” analysis
• Apriori algorithm by Christian Borgelt
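The measures that association analysis reports (support, confidence, lift) are simple to compute by hand. The sketch below does so in base R on five made-up baskets. It is not the Apriori algorithm itself, which searches efficiently for all rules above given thresholds. The helper names `support` and `rule_stats` are hypothetical.

```r
# Five made-up shopping baskets
baskets <- list(
  c("bread", "milk"),
  c("bread", "diapers", "beer"),
  c("milk", "diapers", "beer"),
  c("bread", "milk", "diapers"),
  c("bread", "milk", "beer")
)

# Share of baskets containing every item in `items`
support <- function(items) {
  mean(sapply(baskets, function(b) all(items %in% b)))
}

# Support, confidence and lift for the rule lhs => rhs
rule_stats <- function(lhs, rhs) {
  s <- support(c(lhs, rhs))
  conf <- s / support(lhs)
  c(support = s, confidence = conf, lift = conf / support(rhs))
}

stats <- rule_stats("diapers", "beer")
print(round(stats, 3))
```

A lift above 1 means the two items appear together more often than independence would predict. For real basket data, the arules package provides an `apriori()` function for mining such rules at scale.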


Demographic Associations for High Spenders


R Lessons Learned at Loyalty Matrix

• R is a flexible addition to “classical CI” methods
• Yes, there is a learning curve
• Very responsive R user community and support
• No problem with client acceptance


Questions, Comments?

• Now would be the time
• My email is [email protected]
• Our R Blog is R.LoyaltyMatrix.com