predictive modelling survival analysis

Past and current customer behavior is the best predictor of future customer behavior.

Customer Retention Models:

Data Needed: Sales data linked to each individual customer

Retention rate = Number of current customers / Number of all customers

We look at every customer who has made at least two purchases and calculate the number of days between the first and second purchases. This number is called "latency": the number of days between two customer events. Let the mean latency across these customers be x days.
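
The latency calculation above can be sketched as follows. This is a minimal illustration, not the original analysis: the `purchases` records and the function name `first_to_second_latency` are hypothetical, and x is taken to be the mean latency across customers with at least two purchases.

```python
from datetime import date
from statistics import mean

# Hypothetical sales records linked to customers: (customer_id, purchase_date).
purchases = [
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 11)),
    ("A", date(2023, 2, 1)),
    ("B", date(2023, 1, 5)),
    ("B", date(2023, 1, 25)),
    ("C", date(2023, 1, 9)),  # only one purchase, so no latency is defined
]


def first_to_second_latency(records):
    """Return {customer_id: days between the first and second purchase}."""
    dates_by_customer = {}
    for customer_id, purchase_date in records:
        dates_by_customer.setdefault(customer_id, []).append(purchase_date)

    latencies = {}
    for customer_id, dates in dates_by_customer.items():
        if len(dates) >= 2:  # keep only customers with at least two purchases
            ordered = sorted(dates)
            latencies[customer_id] = (ordered[1] - ordered[0]).days
    return latencies


latencies = first_to_second_latency(purchases)
x = mean(latencies.values())  # mean latency in days
print(latencies, x)
```

Customer C is dropped because a latency needs two events; only A and B contribute to x.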

We can cluster the customers into three groups based on latency: Current, At Risk and Lost.

We can find the standard deviation of latency from the customer data; call it d. Writing l for a customer's average latency in a month:

If l < x + d, the customer is a current customer.

If x + d < l < x + 3d, the customer is at risk.

If l > x + 3d, the customer is lost.
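
As a sketch, the three rules and the retention rate defined earlier could be applied like this. The values of x and d and the per-customer latencies are made up for illustration, and `classify_customer` is a hypothetical helper. The rules leave the exact boundaries (l = x + d and l = x + 3d) unassigned, so the code assumes a boundary value falls into the less severe group.

```python
def classify_customer(avg_latency, x, d):
    """Map a customer's average latency to 'current', 'at risk' or 'lost'.

    x is the mean latency and d its standard deviation. Assumption: a value
    exactly on a threshold is placed in the less severe group.
    """
    if avg_latency > x + 3 * d:
        return "lost"
    if avg_latency > x + d:
        return "at risk"
    return "current"


# Hypothetical inputs: mean latency 15 days, standard deviation 5 days.
x, d = 15.0, 5.0
avg_latency_by_customer = {"A": 12.0, "B": 25.0, "C": 40.0, "D": 18.0}

status = {
    customer_id: classify_customer(latency, x, d)
    for customer_id, latency in avg_latency_by_customer.items()
}

# Retention rate = number of current customers / number of all customers.
retention_rate = sum(s == "current" for s in status.values()) / len(status)
print(status, retention_rate)
```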

The latency clusters are then mapped onto the three customer segments: segment 1 (loyal to a few products), segment 2 (variety seekers) and segment 3 (customers with a longer relationship duration). At-risk customers in segment 2 require the most attention, followed by at-risk customers in segment 1. This identification lets McDonalds direct promotional offers only to the customers who are of maximum value to it.

For example, assume McDonalds has 1,000 customers and an annual promotional budget of $1,000. It can spend $1 on each customer each year, and for that $1 it gets back $1.10, an ROI of 10%. Now suppose it knew that spending $2 each year on a certain 50% of customers (identified above) would bring back $8 per customer. Measured the same way, that is a 300% ROI (a fourfold return on the money spent). So instead of spending $1 on every customer, McDonalds can spend $2 on the 50% of customers who are most profitable: it spends the same $1,000 in total and gets back 500 (half the customers) x $8 = $4,000.
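
The arithmetic of this example can be checked with a short sketch. It assumes ROI means net return, that is (amount returned - amount spent) / amount spent, which is the definition that gives 10% for the $1 to $1.10 case. Under that definition, $2 returning $8 is a 300% ROI, or a 4x gross return. The function name `roi` is illustrative.

```python
def roi(spend, returned):
    """Net return on investment: (amount returned - amount spent) / amount spent."""
    return (returned - spend) / spend


CUSTOMERS = 1_000

# Blanket campaign: $1 on every customer, $1.10 back per customer.
blanket_spend = CUSTOMERS * 1.00
blanket_return = CUSTOMERS * 1.10

# Targeted campaign: $2 on half the customers, $8 back per targeted customer.
targeted_customers = CUSTOMERS // 2
targeted_spend = targeted_customers * 2.00
targeted_return = targeted_customers * 8.00

for name, spend, returned in [
    ("blanket", blanket_spend, blanket_return),
    ("targeted", targeted_spend, targeted_return),
]:
    print(f"{name}: spend ${spend:,.0f}, return ${returned:,.0f}, ROI {roi(spend, returned):.0%}")
```

Both campaigns use the same $1,000 budget; only the allocation differs.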

Also, the clusters of current, at-risk and lost customers can be further analyzed by demographics to identify trends and suggest promotions to McDonalds.

Survival Analysis:

We can build a survival analysis model to measure the effect of the number of purchases made and the revenue from a customer on that customer's retention.

To see how the retention probability changes, two categorical variables, latency and revenue per month, are built from the available customer data.

latency = 0 if calculated latency > X + 3d (X = mean latency, d = SD of latency)

latency = 1 if X + d < calculated latency < X + 3d

latency = 2 if calculated latency < X + d

revenue per month = 0 if calculated revenue < X + d (X = mean revenue, d = SD of revenue)

revenue per month = 1 if X + d < calculated revenue < X + 3d

revenue per month = 2 if calculated revenue > X + 3d
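
A minimal sketch of this coding is given below. The function names `code_latency` and `code_revenue` and the sample numbers are hypothetical. As the definitions do not say where a value exactly equal to X + d or X + 3d belongs, the code assumes such a value takes the code of the band nearer the mean.

```python
def code_latency(latency, mean_latency, sd_latency):
    """Code latency as 0 (longest gaps), 1 (middle band) or 2 (shortest gaps)."""
    if latency > mean_latency + 3 * sd_latency:
        return 0
    if latency > mean_latency + sd_latency:
        return 1
    return 2


def code_revenue(revenue, mean_revenue, sd_revenue):
    """Code revenue per month as 0 (lowest), 1 (middle band) or 2 (highest)."""
    if revenue > mean_revenue + 3 * sd_revenue:
        return 2
    if revenue > mean_revenue + sd_revenue:
        return 1
    return 0


# Hypothetical customer: latency of 25 days against mean 15 and SD 5,
# revenue of 100 per month against mean 50 and SD 10.
print(code_latency(25, 15, 5), code_revenue(100, 50, 10))
```

In both variables, code 2 marks the customers expected to be retained most often, so the two codings point in the same direction.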

From the following graphs, we can see that:

Lower latencies are associated with higher probabilities of retention.

Higher revenues are associated with higher probabilities of retention.
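
The text does not say which survival model produced the graphs. One common way to obtain retention curves per category is the Kaplan-Meier estimator, sketched below in plain Python as an assumption rather than as the method actually used. The durations and defection flags are invented to show the computation and are not results. Measuring the size of an effect would typically need a regression model such as Cox proportional hazards.

```python
def kaplan_meier(durations, defected):
    """Kaplan-Meier retention curve as a list of (month, survival probability).

    durations: months each customer was observed.
    defected:  True if the customer defected at that month, False if the
               customer was still active when observation ended (censored).
    """
    event_times = sorted({t for t, e in zip(durations, defected) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, defected) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical groups: latency code 2 (short gaps) and latency code 0 (long gaps).
low_latency_curve = kaplan_meier([2, 8, 12, 12], [True, False, False, False])
high_latency_curve = kaplan_meier([1, 2, 3, 12], [True, True, True, False])

print("latency code 2:", low_latency_curve)
print("latency code 0:", high_latency_curve)
```

Plotting one curve per latency or revenue code gives the kind of comparison described above.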

We can also measure by how much the retention probability varies with increasing revenues or decreasing latencies. By measuring the CLV (Customer Lifetime Value), McDonalds can decide how much it should spend on promotions to retain the customers who have higher probabilities of defection.

CLV can be calculated from the following formula:

$$\mathrm{CLV} = MM \times \sum_{i=1}^{T} \frac{p_i}{(1 + r/12)^{i-1}}$$

where

MM is the monthly margin for the last three months for existing customers, or the last month's monthly margin for a newly acquired customer,

T is the number of months,

r is the annual discount rate (applied monthly as r/12),

p_i is the series of customer survival probabilities (the customer survival curve) from month 1 through month T, where p_1 = 1.
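
The formula can be evaluated directly, as in the sketch below. The function name `clv` and the input values are hypothetical. The survival probabilities are passed as a list whose first element is p_1 = 1, and T is taken to be the length of that list.

```python
def clv(monthly_margin, survival_probs, annual_discount_rate):
    """CLV = MM * sum over i = 1..T of p_i / (1 + r/12)**(i - 1)."""
    monthly_rate = annual_discount_rate / 12
    return monthly_margin * sum(
        p / (1 + monthly_rate) ** (i - 1)
        for i, p in enumerate(survival_probs, start=1)
    )


# Hypothetical customer: margin of 10 per month, a three-month survival
# curve starting at p_1 = 1, and a 12% annual discount rate.
value = clv(10.0, [1.0, 0.9, 0.8], 0.12)
print(round(value, 2))
```

Because the exponent is i - 1, the first month is not discounted, and each later month is weighted by both its survival probability and its discount factor.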
