deliverablepdf

6
Medill School of Journalism, Media and Integrated Marketing Communication Team 14: Di Liu, Yang Liu, Yudi Gao, Yifan Xu, Fatima M Zaidi In this report, a customer-churning model is built to help QWE Inc. in retaining their customers. Recommendations are would be made according to the result of the churning detection. Customer Churn Prediction for QWE Inc.

Upload: yudi-gao

Post on 09-Feb-2017

23 views

Category:

Documents


2 download

TRANSCRIPT

M e d i l l S c h o o l o f J o u r n a l i s m , M e d i a a n d I n t e g r a t e d M a r k e t i n g C o m m u n i c a t i o n

Team 14: Di Liu, Yang Liu, Yudi Gao, Yifan Xu, Fatima M Zaidi In this report, a customer-churning model is built to help QWE Inc. in retaining their customers. Recommendations are would be made according to the result of the churning detection.

CustomerChurnPredictionforQWEInc.

EXECUTIVESUMMARY In this paper, we attempt to find out a customer-churning method, in which QWE Inc. can proactively approach to customers who are likely to suspend their services. In this way, we are able to develop a better customer retention plan and build long-term relationships with our customers. Decision tree and logistic regression are used as two methods that we used to build our churning-prediction model. Finally, we make recommendations according to the results and insights we conclude from our prediction model.

MR.WALL’SPREDICTION Mr. Wall, the CEO of QWE Inc. gave us an insight that customers with an age between 6-14 months have the highest churn rate. We run the method of compare means and get our result on the right. From the result we can say that consumers with age between 6 months to 14 months have the highest churn rate, comparing with other two age groups, which aligns with Wall’s belief. However, more matrixes are needed if we want to get an overall profile of what kind of customers have a high probability to leave and what are the three most important reasons contribute to their leaving.

METHODOLOGIESA–SINGLEDECISIONTREE First, we use the decision tree matrix to find out what are the main variables that differentiate our customers in the dataset. The result shows that variable Days Since Last Login is the most critical factor in prediction, because it appears on the top of the decision tree. Thus, we run a single variable decision tree using Days Since Last Login to predict the customer churn rate. Exhibit 1 shows the output of the single variable decision tree. According to Exhibit 1, if the value of Days Since Last Login 0-1 is less than 17.5, then the customer belongs to a group of 5624 people. The probability that this customer would quit is 3.88%. According to our data, 323 customers out of 6347 left. The probabilities for churn Exhibit 1

in both groups are less than 50%, we predict that all customers who have a value of Days Since Last Login 0-1 less than 17.5 is staying with us.

However, like we’ve mentioned above, it is very unreliable to use only one variable for prediction. We test the sensitivity of our single decision tree model, which only gives us 0. In this company, only 323 out of 6347 customer terminated their services, which gives us a 5% actual churn rate. In the industry with a naturally low turn rate, this

model has no actual value. It can’t spot out any leaving customer.

METHODOLOGIESB–MULTIPLEDECISIONTREE

To improve the sensitivity of the model, we add three other most important variables to the model, which are Logins 0-1, Customer Age and Views, and make a multiple decision tree model. The result is shown in Exhibit 2. In this model, we spot Group 4 and Group 7 have a very high churn rate – 80% and 66.7% comparing to other groups. In this model, we decide customer who ever falls in groups 4 and 7, which have a more than 50% churn rate, is going to leave. Picking customer ID number 354, 672 and 5203 for examples. According to Exhibit 2 and 3, all of the three customers fall into the group 1, so we predict them to stay with us because the probability is close to 0. Although Customer 5203 will enter the “danger zone” in February, taking all the other variables into account, we still believe they will keep using our service.

Multiple variables decision tree: a) day since last login(Days0.1) b) differences in logins times compare to last month( Login0.) c)

ages of being our customer in month (age). These three variables jointly give us a prediction of if a

customer is going to leave.

Exhibit 2: Multiple Decision Tree Model

Exhibit 3: Data of Customer ID 354, 672 and 5203

By looking at the groups 4 and 7, we are able to get a list of customers who are very likely to leave but haven’t dropped yet: Customer 825, 3371 from group 4 and Customer 279, 334, 404, 475, 488, 530, 554, 573, 2499, 3152, 3177 from group 7. We say Customer 825, 3371 from group 4 have 80% chance to leave and those who from group 7 have 66.7% to leave, if we don’t take any action immediately. To measure this improved model, we calculate the sensitivity again, and find out that result is 9.29%, which is better than that of the previous model.

METHODOLOGIESC–LOGISTICREGRESSION To further improve our prediction model, we use another method - logistic regression. First, we run a logistic regression models for 12 variables and three most important variables with the lowest AIC are chosen to be used These variables are CHI 0, CHI 0-1 and Days Since Last Login. And then we use these three variables to run a multiple logistic regression as our prediction model. The equation is listed as Exhibit 4. From the equation, we can get a Prediction Value when putting CHI Score Month 0, CHI Score Month0-1 and Days Since Last Login 0-1 into the formula. To get a benchmark to decide whether a customer has a strong tendency to stop using our service, we calculated the average Prediction Value for both who quit the service and the group who stayed, which is 0.9304 and 0.9484. To avoid the worst cases (which we assume that the customers will stay but in reality they actually leave), we decided to approach the more conservative benchmark: 0.9304, which is the average value of the model of the customers who quit the service. Thus if a customer has a prediction value lower than 0.9304, we should approach them and improve their using experience to prevent them from leaving.

Pro: Good tool to spot out groups of customer who are in the edge of leaving. Con: Low sensitivity: We found this is a good method to classify customers. It is a desire tool to find out the most risky customers, which is a very small group but with a very high churn rate. But not good for figuring out those who are unsatisfied but aren’t angry enough to leave.

Variables in Logistic Regression: a) Customer Happiness Index (CHI Month 0) b) Differences in Customer Happiness Index compare to last month (CHI Month 0-1) c) Days Since Last Login 0-1. These three variables jointly give us a prediction of if a customer is going to leave.

Exhibit 4: Logistic Regression Equation

Test on previous results:

Looking back at the decision tree results, we predicted that these customers mentioned above in Exhibit 5 will leave. Using Logistic regression model to cross check the results we got from decision tree, we can say that all these 13 customers have a lower predicted value than the benchmark (0.93). As a result, we can conclude that the 13 customers we picked are the customers with highest churn probabilities. To compare this method with the previous decision tree, we calculate the sensitivity for the logistic regression model. According to the results we get, we can identify that there are 135 customers that we predict correctly on their leaving, and there are 188 customers that we predict them to stay but actually left. Thus, the sensitivity for this logistic regression model is 0.4180, which is much higher than the decision tree model. As a result, we believe the logistic regression model is a good

prediction model for us to fix up the disadvantages in Decision tree model.

CONCLUSION: The best way to detect churning customers is using the Logistic regression model and the decision tree model jointly. If customers have a low prediction value, it means they are unsatisfied so we want to put a red flag on them, we say this as a level one alert. For those who also fall into the dangerous groups (group 4 and 7) in the decision tree model, that means they are in the edge of leaving, which we specify as level two alert. By flagging potential churning customer into two level of alert, leaders in QWE Inc. can better distribute their budget on customer retain.

Insights: 1. Non-frequent users who have a day since

last login greater than 18, login 2.5 times less than last month, and shown a sudden drop more than 140 in views is highly likely to drop out (group 4).

2. Non-frequent users who have a day since last login greater than 18, login 2.5 times less than last month, and is close to the end of their one-year contract is very likely to drop out (group 7).

3. The most important reason contribute to customer churning is Customer Happiness Index. User experiences are essential if we want to have a long-term relationship with our customers.

Exhibit 5

RECOMMENDATIONS:

The two models give managers flexibility to adjust the benchmark of consumer churning according to the company budget in customer retention. Different strategies can be executed respect to different level of alert of customer churn.

Level One Alert:

For customers who reach level one alert, QWE Inc. should mark the customer down or pull out their data to do a further analysis. The goal is to increase the customer satisfaction/happiness level, using low cost approaches.

○ Sending out emails with the new contents that they might be interested in.

○ E-Birthday card and E-holiday card. ○ Making us more available to them through all ways.

Level Two Alert: For customers who reach level two alert, QWE Inc. should use higher cost approaches to keep their businesses.

1. Actively reaching out to the non-frequent customers who showed a sudden drop in Views. ○ Recommending list of customized contents on their home page. Offer them

incentives or discount to attract them. ○ Calling the customers to show how we care about their business. Solve the problems

they have.

2. Contacting the non-frequent customers who are close to the end of their one year subscribe contracts (age of 12 month).

○ All customers are going to receive thank-you notes from us and make them feel how we care about their business.

○ Providing discounts or additional services to those non-frequent users. ○ Giving out additional free one/two-month service with the next one-year contract.

Putting this joint customer prediction model into execution, QWE Inc. would be able to detect potential churning customers, retain them with better services and enhance their user experience and hence achieve a better financial outcome.