double down on your data

Upload: qualex-asia

Post on 28-Mar-2016

DESCRIPTION

Chapter two (Predictive Analytics) of the book Double Down On Your Data: How Analytics is Revolutionizing the Casino and Hospitality Industry.

TRANSCRIPT

Page 1: Double Down On Your Data

Qualex Consulting Services, Inc, 4300 Biscayne Bay, Suite 203, Miami, FL 33137.

www.qlx.com

Copyright © 2012 Clive J. Pearson. All rights reserved.

No part of this book may be reproduced, stored in a retrieval system, or transmitted by any means without the written permission of the author, excepting brief quotes used in reviews.

First published by Qualex Consulting Services on 4/15/12.

First Printing: April 15, 2012

Printed in the United States of America.

Cataloging data may be obtained from the Library of Congress

ISBN 978-0-557-73420-7

CHAPTER TWO

PREDICTIVE ANALYTICS

“In business, as in baseball, the question isn't whether or not you'll jump into analytics. The question is when. Do you want to ride the analytics horse to profitability...or follow it with a shovel?”

~ Rob Neyer, ESPN

Overview

Predictive analytics refers to a variety of statistical techniques that

analyze current and historical facts to make predictions about future

events. Using such techniques as predictive modeling, machine

learning, data mining and game theory, predictive analytics can build

models that exploit patterns found in historical and transactional data,

patterns that can identify business risks and potential opportunities.

Predictive analytics is not a new technology; it is a decades-old,

proven technology that encompasses such disciplines as statistics

and data mining. A forward-looking technology that uses past events to

predict future activity, predictive analytics arose out of the management

information systems and standard reporting world of the 1970s. The

drill-down technology of the 1980s led to the data warehousing and

OLAP cube systems of the 1990s, which allowed for more complex ad

hoc data querying. The analytics and Business Intelligence solutions of

the 2000s evolved into the more complex world of predictive modeling

and optimization of today. These systems can do more than report; they

can actually help predict future business activity. The Predictive

Analytics Pocket Guide[4] (2009) defines predictive analytics as follows:

Unlike Business Intelligence applications, which

merely present summaries of historical data, predictive

analytics focuses instead on the prediction of future

outcome of events not yet observed in the data. These

predictions are made by creating a model from the

observed data using statistical techniques. These

models can range in complexity from simple linear

equations to powerful techniques such as neural

networks.[5]

Predictive analytics can be broken down into three different

types of models:

1. Predictive: these analyze past performance to predict the

likelihood that an individual customer will exhibit a specific

behavior in the future.

2. Descriptive: these identify different relationships between

customers to group or segment them for marketing or other

purposes.

3. Decision: these predict outcomes of complex decisions,

relationships, products and/or processes.

Predictive analytics extracts information from data sets and

uses it to anticipate future trends and behavior patterns based on

statistics and data mining (Ramakrishnan and Madure, 2008). The most

[4] Available at www.predictivesource.com.

[5] A predictive analytics technology that can learn the relationship between inputs and output through training.

important element of predictive analytics is the predictor, “a variable

that can be measured for an individual or other entity to foresee future

behavior” (Ramakrishnan and Madure, 2008). The real trick is to find

the predictive model best suited for the outcome one is trying to study

(Ramakrishnan and Madure, 2008) and this is no easy feat.

“Predictive analytics also encompasses models that seek out

subtle data patterns to answer questions about customer performance,

such as churn prediction, fraud detection and propensity to buy

additional products and services” (Ramakrishnan and Madure, 2008).

Predictive analytics solutions include SAS's suite of analytics products,

IBM's SPSS, EMC's Greenplum and Revolution Analytics' open source R

product. Whichever solution is used, predictive analytics can enhance

customer acquisition and retention, identify cross-sell and up-sell

opportunities, estimate customer lifetime value, detect fraud,

determine the life cycle of a slot machine and help direct and improve

marketing campaigns. Predictive analytics can even “perform

calculations during live transactions to guide a decision”

(Ramakrishnan and Madure, 2008), but without data mining, predictive

analytics would be useless.

Data Mining: An In-House Goldmine

Data mining – the process whereby hidden patterns within data sets are

discovered – is a component of predictive analytics that entails an

analysis of data to identify trends and patterns of relationships among

data sets (Ramakrishnan and Madure, 2008). To put it simply, data

mining helps transform raw data into usable information. In their article

Neural Networks in Data Mining, Singh and Chauhan (2009) state that

data mining is the:

business of answering questions that you've not

asked yet. Data mining reaches deep into databases.

Data mining tasks can be classified into two

categories: Descriptive and predictive data mining.

Descriptive data mining provides information to

understand what is happening inside the data without

a predetermined idea. Predictive data mining allows

the user to submit records with unknown field values,

and the system will guess the unknown values based

on previous patterns discovered in the database.

By employing automated predictive analytics to sift through a

casino operator’s customer database, data mining can discover hidden

opportunities and connections that might otherwise be missed. Many

casino operators have terabytes and terabytes of data – everything from

customer player card information to information about a customer’s

room preference – and sifting through this information to discover

meaningful connections would be an impossible task without data

mining.

Data mining and predictive analytics aim to identify valid,

novel, potentially useful and understandable correlations and patterns in

datasets (Chung & Gray, 1999) by combing through copious amounts of

data to sniff out patterns and relationships that are too subtle or complex

for humans to detect (Kreuze, 2001). Data must be gathered from

disparate sources and then seamlessly integrated into a data warehouse

that can then cleanse it and make it ready for consumption. Trends that

surface from the data mining process can help in monetization, as well

as in future advertising and marketing campaigns.

For casinos, data mining can cull through data from such

disparate sources and departments as sales and marketing, thereby

allowing users to measure patron behavior on more than a hundred

different attributes, which is a far cry from the three or four different

attributes that statistical modeling used to offer.

Unlike traditional statistical analysis, which relies heavily on

hypothesis testing, data mining tries to identify relationships and

interdependencies that affect a marketing-related opportunity or

problem (Thelen, et al., 2004). While traditional multiple regression

methods can only use a limited number of complexity levels, neural

networks and decision trees can easily handle up to 200 predictor

variables (Thelen, et al., 2004), allowing them to do much more

complicated computations.

Normally, with statistical modeling, an analyst poses a simple

question such as: “Are higher-income people prone to be more loyal to

a casino player card than those with lower income levels?” The

hypothesis would elicit one of two responses, either “yes” or “no.” Data

mining, however, can reveal factors that contribute to casino loyalty,

factors that the analyst might never have thought to test for. According

to Thelen, et al. (2004), the data mining process is as follows:

1. Identify the business opportunity

2. Cleanse the data

3. Transform the data into meaningful information

4. Confirm the model

5. Tweak and perfect the model

Since data mining systems are inherently reliant on so many

departments, they can be difficult and complicated to implement.

Marketing managers, corporate strategists, statisticians and IT directors

are all required to add their input. Casino operators should keep in

mind, however, that data mining will only be successful if their casino

patrons are willing to provide information on themselves (Thelen, et al.,

2004). Although player cards provide a wealth of information, if the

patron doesn’t trust the casino with information beyond what is gleaned

from the player cards, the casino will only have an incomplete view of

that individual patron (Thelen, et al., 2004), which severely limits the

predictive analytics capabilities.

Data Mining Techniques

Regression models: Regression analysis is the process of predicting

a continuous dependent variable from a number of independent

variables. It attempts to find a function which models the data with the

least error. Regression analysis can be used on data which is either

continuous or dichotomous, but cannot be used to determine a causal

relationship. Regression analysis focuses on establishing a

mathematical equation as a model to represent the interactions between

the different variables under consideration. Regression models are

particularly effective for estimating patron worth because the model can be used

to score historical data to predict an unknown outcome (Sutton, 2011).

Multiple regression models “utilize a variety of predictors and the

relationships between those predictors to predict future worth,” Sutton

(2011) states. As an example, Sutton (2011) explains that “a model built

to predict future gaming trip worth might be generated based on historical

information about theoretical win, actual win, credit line, time on device,

nights stayed, and average bet.”

Linear regression models: These analyze the relationship between the

response or dependent variable and a set of independent or predictor

variables. This relationship is expressed as an equation that predicts the

response variable as a linear function of the parameters. These

parameters are adjusted so that a measure of fit is optimized. Much of

the effort in model fitting is focused on minimizing the size of the

residual, as well as ensuring that it is randomly distributed with respect

to the model predictions. An important assumption of regression

analysis is linearity, which defines a straight-line relationship between

independent variables and dependent variables. For example, in Figure

1, we could make the assessment that an increase in ad spend also

increases sales and, using the straight line, we could predict how much

the sales would be affected.
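
The straight-line idea can be sketched in a few lines of Python. This is a minimal illustration, not the method behind Figure 1: the function name fit_line and the ad spend and sales figures are invented for the example, and the fit uses ordinary least squares, which is one common "measure of fit."

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical monthly figures in thousands of dollars (invented data).
ad_spend = [10, 20, 30, 40, 50]
sales = [125, 185, 265, 325, 400]

intercept, slope = fit_line(ad_spend, sales)

# Predicted sales if ad spend were raised to 60.
predicted = intercept + slope * 60

# Residuals: the part of each observation the line does not explain.
residuals = [y - (intercept + slope * x) for x, y in zip(ad_spend, sales)]
```

The slope estimates how much sales change for each additional unit of ad spend, and the residuals are the quantity the fitting process tries to keep small and patternless.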

For the casino and hospitality industry, regression models can

be used to predict a patron's future worth (Sutton, 2011).

Figure 1: Linear Regression chart

Neural networks: Artificial neural networks (ANNs), often just

called “neural networks,” are non-linear statistical data modeling tools

that are used when the exact nature of a relationship between input and

output is unknown. In their article Neural Networks in Data Mining,

Singh and Chauhan (2009) claim that a neural network is:

a mathematical model or computational model based

on biological neural networks, in other words, is an

emulation of biological neural system. It consists of

an interconnected group of artificial neurons and

processes information using a connectionist

approach to computation. In most cases an ANN is

an adaptive system that changes its structure based

on external or internal information that flows

through the network during the learning phase.

They can be used to find patterns in data. A key feature of neural

networks is that they learn the relationship between inputs and output

through training.
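
To make "learning through training" concrete, the sketch below trains a single artificial neuron with the classic perceptron learning rule. It is a deliberately tiny stand-in for a full network, and the names train_neuron and predict, the learning rate and the toy data set (the logical AND of two inputs) are all assumptions made for illustration.

```python
def train_neuron(samples, epochs=20, rate=1.0):
    """Train one artificial neuron using the perceptron learning rule."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = target - output  # zero when the neuron is already right
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of inputs exceeds zero, else stay off (0)."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


# Supervised training set: each pair is (inputs, known correct output).
training_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(training_data)
```

The neuron starts with no knowledge (all weights zero) and adjusts its weights only when it makes a mistake. A single neuron can learn only linearly separable patterns; practical networks stack many such units in layers to capture non-linear relationships.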

There are three types of training in neural networks:

reinforcement learning, supervised training and unsupervised training, with

supervised being the most common one. Neural Networks (see Figure

2) are data processing systems whose structure and functioning are

inspired by biological neural networks. Their fundamental

characteristics include parallel processing, distributed memory and

adaptability to their surroundings.

For casino and hospitality marketing purposes, neural

networks can be used to classify a consumer's spending pattern, analyze

a new product, identify a patron's characteristics as well as forecast

sales (Singh and Chauhan, 2009). The advantages of neural networks

include high accuracy, high noise tolerance and ease of use as they can

be updated with fresh data, which makes them useful for dynamic

environments (Singh and Chauhan, 2009).

Logistic regression: This method transforms information about a

binary dependent variable into an unbounded continuous variable and

estimates a regular multivariate model. Logistic regression (see Figure

3) is a generalized linear model. It is used mainly to predict binary

variables (with values such as yes/no or 0/1). Thus, logistic regression

techniques may be used to classify a new observation whose group is

unknown into one of the groups, based on the values of the predictor

variables.
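
The transformation described above is the log-odds (logit) function, and its inverse turns a linear score back into a probability. The sketch below shows both, plus a classification step. The coefficients, the intercept and the two predictors (visits per month and average bet) are hypothetical values chosen for illustration, not estimates from real data.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to an unbounded log-odds value."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit: map any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, coefficients, intercept, threshold=0.5):
    """Score one observation on the log-odds scale, then assign a group."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    probability = sigmoid(z)
    return probability, int(probability >= threshold)


# Hypothetical model for "will respond to an offer" (invented numbers).
# Predictors: visits per month, average bet in dollars.
coefficients = [0.8, 0.02]
intercept = -3.0

probability, label = classify([2, 50], coefficients, intercept)
frequent_probability, frequent_label = classify([4, 50], coefficients, intercept)
```

In practice the coefficients would be estimated from historical data, typically by maximum likelihood; only the scoring step is shown here.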

Figure 2: A Neural network

A/B testing: Also known as split testing or bucket testing, A/B testing

is a method of marketing testing by which a baseline control sample is

compared to a variety of single-variable test samples in order to

improve response rates.

A classic direct mail tactic, this method has recently been

adopted within the interactive space to test tactics such as banner ads,

emails and landing pages. For casino marketers, A/B testing is the most

effective way to identify the best available marketing offer (Sutton,

2011). It can test “two different offers against one another in order to

identify the offer that drives the highest response and the most

revenue/profit” (Sutton, 2011).
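
One common way to judge whether the difference between two offers is real or just noise is a two-proportion z-test on the response rates. The sketch below is one possible approach rather than the procedure Sutton describes; the function name and the mailing counts are invented for illustration.

```python
import math


def two_proportion_z_test(responses_a, size_a, responses_b, size_b):
    """Compare the response rates of a control offer (A) and a test offer (B)."""
    rate_a = responses_a / size_a
    rate_b = responses_b / size_b
    # Pooled rate under the assumption that both offers perform the same.
    pooled = (responses_a + responses_b) / (size_a + size_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / size_a + 1 / size_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical mailing: 1,000 patrons per offer (invented counts).
rate_a, rate_b, z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value suggests the gap in response rates is unlikely to be chance alone. Response rate is only half of the picture, since the quotation above also asks which offer drives the most revenue or profit.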

Decision trees: A decision tree is used to identify the strategy that is most likely to reach

a goal. It is a decision support tool that uses a graph or model of

decisions and their possible consequences, including chance event

outcomes, resource costs, and utility. Decision trees are sequential

partitions of a set of data that maximize the differences of a dependent

variable (response or output variable). They offer a concise way of

defining groups that are consistent in their attributes, but which vary in

terms of the dependent variable.

Figure 3: Logistic Regression chart

The construction of a decision tree is based on the principle of

“divide and conquer”: through a supervised learning algorithm,

successive divisions of the multivariable space are carried out in order

to maximize the distance between groups in each division (that is, carry

out partitions that discriminate). The division process ends when all

of the entries of a branch have the same value in the output variable,

giving rise to the complete model. The further down the input variables

are in the tree, the less important they are in the output classification

(and the less generalization they allow, due to the decrease in the

number of inputs in the descending branches). Figure 4 shows a

decision tree for responses to a marketing campaign using age and zip

code as the variables.

Figure 4: Decision Tree

For the casino and hospitality industry, decision trees can be

used “to identify patron characteristics that can predict the likelihood of

a patron (or segment of patrons) to abuse an offer” (Sutton, 2011).
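
A single "divide" step of a decision tree can be sketched as a search for the cut point that best separates responders from non-responders. The example below uses Gini impurity as the separation measure, which is one common choice and an assumption here; the ages and responses are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Find the threshold on one numeric variable with the lowest weighted impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values))[1:]:
        left = [l for v, l in zip(values, labels) if v < threshold]
        right = [l for v, l in zip(values, labels) if v >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical campaign data: patron age and whether they responded (1) or not (0).
ages = [23, 31, 35, 42, 51, 58, 64, 70]
responded = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(ages, responded)
```

A full tree repeats this search inside each resulting branch, over every available variable, until the branches are pure or too small to split further.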

Time series model: A time series is an ordered sequence of values of a

variable at uniformly spaced time intervals. According to the

Engineering Statistics Handbook, time series models can be used to:

• Obtain an understanding of the underlying forces and structure

that produced the observed data;

• Fit a model and proceed to forecasting, monitoring or even

feedback and feedforward control.[6]

A time series model (see Figure 5) can be used to predict or

forecast the future behavior of a variable. These models account for the

fact that data points taken over time may have an internal structure

(such as autocorrelation, trend or seasonal variation) that should be

accounted for. For the casino and hospitality industry, a Time Series

Analysis can be used to forecast sales, project yields and workloads as

well as analyze budgets.
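
Two of the ideas above, internal structure and forecasting, can be sketched with standard-library Python. The example measures lag-1 autocorrelation and produces a one-step forecast by simple exponential smoothing. The smoothing constant and the weekly revenue figures are invented, and simple exponential smoothing is only one of many time series models; it ignores trend and seasonality.

```python
def autocorrelation(series, lag):
    """Correlation of a series with itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    The last level serves as the one-step-ahead forecast.
    """
    level = series[0]
    levels = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels


# Hypothetical weekly revenue in thousands of dollars (invented data).
revenue = [100, 110, 105, 120, 130, 125, 140, 150]

lag1 = autocorrelation(revenue, 1)
forecast = exponential_smoothing(revenue, alpha=0.5)[-1]
```

A clearly positive lag-1 autocorrelation indicates that neighboring weeks resemble each other, which is exactly the kind of structure a time series model is meant to exploit.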

Figure 5: Time Series Model

[6] http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc41.htm

Nearest Neighbor Method: Initially introduced by J. G. Skellam, the

Nearest Neighbor Method is a technique that is based on the concept of

similarity where “the expected and observed mean value of the nearest

neighbor distances is used to determine if a data set is clustered”

(Skellam, 1952). This method constructs a classification system

without making assumptions concerning the shape of the function that

relates the dependent variable with the independent variables. The aim

is to identify in a dynamic way observations in the training data that are

similar to a new observation that we want to classify. This method

does not impose a priori any assumptions about the distribution from

which the modeling sample is drawn. It involves a training set with

both positive and negative values.
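
The classification use of this idea is usually called k-nearest neighbors: a new observation takes the label held by the majority of its closest training observations. The sketch below assumes Euclidean distance and k = 3, and the patron records and labels are invented for illustration.

```python
import math
from collections import Counter


def knn_classify(training, new_point, k=3):
    """Label a new point by majority vote among its k closest training points.

    training: list of (features, label) pairs.
    """
    by_distance = sorted(training, key=lambda row: math.dist(row[0], new_point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical patrons: (visits per month, average bet in dollars) and a tier.
training = [
    ((1, 10), "casual"),
    ((2, 15), "casual"),
    ((1, 20), "casual"),
    ((8, 200), "high value"),
    ((10, 250), "high value"),
    ((9, 180), "high value"),
]

tier_a = knn_classify(training, (7, 190))
tier_b = knn_classify(training, (2, 12))
```

Because distance is computed on raw values, a variable measured in large units (here, average bet) dominates the result. Real applications normally rescale the variables first.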

Discriminant Analysis: Discriminant or discriminant function analysis

is a method used to determine which weightings of quantitative

variables or predictors best discriminate between two or more

groups of cases and do so better than chance (Cramer, 2003). It is a

method used in statistics, pattern recognition and machine learning to

find a linear combination of features which characterizes or separates

two or more classes of objects or events.

Because of its ability to classify individuals or experimental

units into two or more uniquely defined populations, discriminant

analysis can be used for market segmentation and the prediction of

group membership. The discriminant score can be the basis on which a

prediction about group membership is made. The

discriminant weights of each predictive variable (age, sex, income, etc.)

indicate the relative importance of each variable. For example, if age

has a low discriminant weight then it is less important than the other

variables. For a casino and hospitality marketing department, use of

discriminant analysis can help predict why a patron frequents one

casino over another. Discriminant analysis is specifically useful in

product research, perception/image research, advertising research and

direct marketing.
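
For two groups and two predictors, the discriminant weights can be computed by hand. The sketch below uses Fisher's linear discriminant, which is an assumed choice of method, and the patron records (age, income in thousands) for two hypothetical casinos are invented.

```python
def mean_vector(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]


def scatter(rows, mean):
    """2x2 within-group scatter matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_weights(group_a, group_b):
    """Weights that best separate two groups, plus a midpoint cutoff score."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    s = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    weights = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    cutoff = sum(w * (a + b) / 2 for w, a, b in zip(weights, ma, mb))
    return weights, cutoff


def score(weights, row):
    """Discriminant score: the weighted sum of a patron's variables."""
    return sum(w * x for w, x in zip(weights, row))


# Hypothetical patrons of two casinos: (age, income in thousands of dollars).
casino_a = [(25, 40), (30, 42), (35, 38), (40, 41)]
casino_b = [(27, 80), (33, 85), (36, 78), (41, 82)]

weights, cutoff = fisher_weights(casino_a, casino_b)
```

A patron whose score falls above the cutoff is assigned to the second group. In this invented data set the age weight is close to zero because the two groups have similar ages. Weights are only directly comparable when the variables share a scale, so standardizing them first is the usual practice.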

Figure 6 shows a generic Discriminant Analysis Model.

Figure 6: Discriminant Analysis Model

Survival or duration analysis: A branch of statistics that deals with

death in biological organisms and failure in mechanical systems (see

Figure 7). It involves the modeling of time-to-event data. In the

survival analysis literature, death or failure is considered an “event”;

traditionally only a single event occurs, after which

the organism or mechanism is dead or broken. Survival analysis is the

study of lifetimes and their distributions. It usually involves one or

more of the following objectives:

• To explore the behavior of the distribution of a lifetime.

• To model the distribution of a lifetime.

• To test for differences between the distributions of two or

more lifetimes.

• To model the impact of one or more explanatory variables

on a lifetime distribution.

By applying survival analysis to revenue management models,

casino operators can gain a truer picture of their table games revenue

(Peister, 2007).
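
The first objective in the list above, exploring the distribution of a lifetime, is often approached with the Kaplan-Meier estimator. The sketch below is a minimal version of that estimator and is not drawn from Peister's revenue management work. The example treats "the patron stopped visiting" as the event, which is an assumption, and the durations are invented.

```python
def kaplan_meier(durations, observed):
    """Estimate the survival curve from possibly censored lifetimes.

    durations: time until the event, or until observation stopped.
    observed:  True if the event happened, False if the record is censored.
    Returns a list of (time, estimated probability of surviving past time).
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical months until a patron stopped visiting (False = still active).
durations = [2, 3, 3, 5, 8, 8, 12, 12]
observed = [True, True, False, True, True, False, False, False]

curve = kaplan_meier(durations, observed)
```

Censored records (patrons who were still active when the data was pulled) count toward the number at risk without counting as events, which is what lets survival analysis use incomplete lifetimes instead of discarding them.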

Figure 7: Survival Analysis Model

There are several other data mining techniques that can be

used but the ones listed above are the most commonly used ones in the

industry and much of what you will need to glean from your data can

be discovered by using them. Once the data has been mined, a business

intelligence solution can tell you what's going on in your data while a

predictive analytics program can actually analyze current and historical

trends to make predictions about future events.

Predictive Analytics: Actionable Intelligence

Customer analytics has evolved from simply reporting patron

behavior to segmenting customers based on profitability, to predicting

that profitability, to improving those predictions (because of the

inclusion of new data), to actually manipulating customer behavior

with target-specific promotional offers and marketing campaigns.

Predictive analytics can graph a customer’s value over time as

well as anticipate that customer’s behavior. From this analysis, a casino

operator can tailor highly specific, laser-focused marketing campaigns

to each customer in the casino’s patron database. By consolidating the

various patron touchpoint systems throughout the casino property, the

casino operator can create a full view of each patron.

Drawing on data from casino player cards, predictive models

can set budgets and calendars for the casino's gamblers, calculating

their predicted lifetime value in the process. If a gambler wagers less

than usual, perhaps because they skipped a monthly visit, the casino

can intervene with a letter or phone call offering a free meal, a show

ticket or gaming comps. Without these customer analytics, casino

operators might not notice what could be a slight, almost imperceptible

change in customer behavior that portends problems. For example, if a

long-time customer decides to cash in all their player card points,

perhaps it’s because they are dissatisfied with their last experience at

the casino. Predictive analytics can quickly spot these trends and alert

casino management to the issue so that they can approach the

individual to find out if there is a problem. This kind of personalized

attention can go a long way in appeasing disgruntled customers, which

might be the difference between retaining and losing them as customers.

Predictive analytics can glean data from a variety of disparate

sources, including:

• Data integrated throughout the casino's gaming systems.

• Feedback information derived from post-visit surveys.

• Web data mined from customers’ individual online

behavior.

• Social media websites.

With predictive analytics, gaming organizations can easily

segment their customers and coordinate marketing campaigns to

effectively target each segment across each outbound channel. For

example, if a casino customer is scheduled to receive all of his or her

event promotions via e-mail, the predictive analytics solution will

automatically remove that customer from concurrent campaigns being run

through other channels. This ensures consistency and also improves

customer satisfaction, since the organization respects the customer’s

contact preference and doesn’t inundate him or her with multiple offers.

Moreover, a predictive analytics solution monitors channel capacity

and usage to eliminate overload, while distributing campaigns equally

across the various channels. If one channel is at risk of overload, the

solution automatically shifts the remainder of a campaign to a different

channel to ensure completion. This enables organizations to maximize

the capacity and value of each channel without resorting to time-

consuming manual monitoring.

Manipulating Customer Behavior

Successful marketing is about reaching a consumer with an interesting

offer when he or she is primed to accept that offer. Knowing what

might interest a patron is half the battle of making a sale and this is

where customer intelligence and predictive analytics come in.

Customer analytics has evolved from simply reporting customer

behavior to segmenting customers based on their profitability, to

predicting that profitability, to improving those predictions because of

the inclusion of new data, to actually manipulating customer behavior

with target-specific promotional offers and marketing campaigns.

Predictive analytics can deconstruct a casino’s massive data

warehouse, making the information held within these databases more

meaningful. It can extrapolate trends, invent and validate a hypothesis,

as well as predict future activity. Predictive analytics can be used in a

myriad of ways, but mostly for cross-selling/up-selling, campaign

management, customer acquisition, budgeting, forecasting and

attrition/churn/retention, amongst other things (see Figure 8).

Figure 8: Applications for Predictive Analytics

Casino operators can enhance their customer relationships by

cross-selling and up-selling items that the customer might actually be

interested in, rather than offering them products they are likely to

reject.

Predictive analytics can also enable call center personnel to act

on inbound calls by providing offers that are likely to be attractive to

certain caller profiles. Conversely, telemarketers can listen for such

trigger phrases as “Baccarat tournament” or “seafood buffet” or “hotel

room” to help casino marketers come up with the most enticing offer

for an individual patron. In addition, automated systems like kiosks and

customer service agents on the casino floor can use predictive analytics

to provide customers with appropriate offers during other interactions.

In their article “Knowing What to Sell, When, and to Whom,”

authors V. Kumar, R. Venkatesan, and W. Reinartz (2006) showed how,

by simply understanding and tweaking behavioral patterns, they could

increase the hit rate for offers and promotions to consumers, which then

had an immediate impact on revenue.

By applying statistical models based on the work of Nobel Prize-winning

economist Daniel McFadden, researchers accurately predicted

not only a specific person’s purchasing habits, but also the specific time

of the purchase to an accuracy of 80% (Kumar, Venkatesan and Reinartz,

2006). The potential to market to an individual when he or she is

primed to accept the advertising is advantageous for both parties

involved; marketers don’t waste time advertising to consumers when

they aren’t primed to accept the advertisements, but do market to

consumers when and where they might want to use the advertisements.

Predictive modeling is only useful if it is deployed and it

creates an action. Taking advantage of the more powerful, statistically

based segmentation methods, customers can be segmented not only by

dollar values but also on all known information, which can include

behavioral information gleaned from resort activities, as well as the

patron’s simple demographic information. This more detailed

segmentation allows for more targeted and customer-focused marketing

campaigns.

Models can be evaluated and reports generated on multiple

statistical measures, such as neural networks, decision trees, genetic

algorithms, the nearest neighbor method, rule induction, and lift and

gains charts.[1] Once built, scores can be generated in a variety of ways

to facilitate quick and easy implementation. The projects themselves

can be re-used and shared to facilitate faster model development and

knowledge transfer.
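
The numbers behind a cumulative gains and lift chart are simple to compute once a model has scored each patron. In the sketch below, the function name, the five-bucket layout, the scores and the responses are all invented for illustration, and the patron count is assumed to divide evenly into the buckets.

```python
def cumulative_gains(scores, responded, buckets=5):
    """Rank patrons by model score and measure how quickly responders are found.

    Returns one row per bucket: (share of patrons contacted,
    share of all responders captured, lift over random selection).
    """
    ranked = [
        r for _, r in sorted(
            zip(scores, responded), key=lambda pair: pair[0], reverse=True
        )
    ]
    total_responders = sum(ranked)
    size = len(ranked) // buckets
    rows = []
    for b in range(1, buckets + 1):
        contacted = b * size
        captured = sum(ranked[:contacted]) / total_responders
        share = contacted / len(ranked)
        rows.append((share, captured, captured / share))
    return rows


# Hypothetical model scores and actual responses (1 = responded) for ten patrons.
scores = [0.95, 0.90, 0.80, 0.70, 0.65, 0.50, 0.40, 0.30, 0.20, 0.10]
responded = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

rows = cumulative_gains(scores, responded)
```

A lift above 1 in the top buckets means that contacting the highest-scored patrons finds responders faster than contacting patrons at random. The lift always returns to 1 once everyone has been contacted.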

In his paper Predictive Analytics, Wayne Eckerson (2007)

advises creating predictive models by using the following six steps:

1. Define the business objectives and desired outcomes for the

project and then translate them into predictive analytic

objectives and tasks.

2. Explore and analyze the source data to determine the most

appropriate data and model building approach and then

scope the effort.

3. Prepare the data by selecting, extracting, and transforming

the data, which will be the basis for the models.

4. Build the models, as well as test and validate them.

5. Deploy the models by applying them to the business

decisions and processes.

6. Manage and update the models accordingly.

By utilizing data from past campaigns and measures generated

by the predictive modeling process, casino operators can track actual

campaign responses versus expected campaign responses, which can

[1] Cumulative gains and lift charts are visual aids for measuring model performance.

often prove wildly divergent. Additionally, casino operators can

generate upper and lower “control” limits that can be used to

automatically alert campaign managers when a campaign is over- or

underperforming, letting them focus on campaigns that specifically

require attention.
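
One simple way to build such limits is to place them a fixed number of standard deviations on either side of the historical mean response rate. The sketch below assumes that approach with a width of three standard deviations; the function names and the past response rates are invented, and a production system might instead center the limits on the model's expected response.

```python
import statistics


def control_limits(history, width=3.0):
    """Lower and upper limits: mean plus or minus `width` standard deviations."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return mean - width * spread, mean + width * spread


def check_campaign(rate, limits):
    """Flag a campaign whose response rate falls outside the control limits."""
    lower, upper = limits
    if rate < lower:
        return "underperforming"
    if rate > upper:
        return "overperforming"
    return "within limits"


# Hypothetical response rates (percent) from past campaigns.
past_rates = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0]
limits = control_limits(past_rates)

status_low = check_campaign(3.9, limits)
status_ok = check_campaign(5.1, limits)
status_high = check_campaign(6.1, limits)
```

Only campaigns flagged as under- or overperforming need a manager's attention; the rest can run unattended.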

One of the benefits of automating campaigns is that offers

based on either stated or inferred preferences of patrons can be

developed. Analysis can identify which customers may be more

responsive to a food/beverage offer, a room offer, and/or a free chip

offer. The result: more individualized offers are sent out to the casino's

patrons and, because these offers tap into a customer’s wants, desires,

needs and expectations, they are more likely to be used; more offers

used means more successful campaigns.

By understanding what type of patron is on its property, why

they are there, and what they like to do while they are there, a casino

operator can individualize its marketing campaigns so that they are

more effective, thereby increasing the casino property's ROI.

With predictive analytics, casino operators can even predict

which low-tier and mid-tier customers are likely to become the next

high rollers. In so doing, casinos can afford to be more generous in

their offers as they know that there is a high likelihood that these

customers will appreciate the personalized attention and therefore

become long term – and, hopefully, highly profitable – patrons.

ROI: Predictive Modeling in the Real World

It is hard to get an exact Return on Investment (ROI) figure with

predictive analytic solutions because many companies that have

implemented these solutions haven’t conducted formal ROI studies.

The very nature of an ROI study can be rather nebulous as it isn't

always easily quantifiable. However, according to Wayne Eckerson

(2007), “companies with high-value analytic programs that have

calculated ROI invest on average $1.36 million and receive a payback

within 11.2 months. It should be noted, however, that this study was

based on responses from only 37 survey respondents.”

Although it is not in the casino industry, the National Geographic

Society, according to a July 7, 2009 press release about its use of the

SAS Analytics suite of tools, saw “a return on investment of

200 to 300 percent, with the best-performing customer segments

realizing 50 percent overall campaign performance improvements”

(SAS, 2009).

Some other real-world examples of how predictive analytics

has helped companies improve customer service as well as drive

profits straight to the bottom line include:

• To test a marketing campaign hypothesis, Harrah's chose

two similar groups of frequent slot players from Jackson,

Miss. Members of the control group were offered a typical

casino-marketing package worth $125 that included a free

room, two steak meals and $30 of free chips at the Tunica

casino. Members of the test group were offered $60 in

chips. The more modest offer generated far more gambling,

suggesting that Harrah's had been wasting money giving

customers free rooms. Thereafter, profits from the revamped

promotion nearly doubled to $60 per person per trip

(Binkley, 2000).

• When Pearl River Resort initiated a marketing campaign

using SAS PVO, they were surprised by the results, which

showed that not all high-value guests were the same; some

actually weren’t very big gamblers but their spend in other

parts of the resort more than made up for their lack of

gambling. This information was put to good use during

times when the casino was hosting poker or blackjack

tournaments, a time when the casino property knew that the

tables would be crowded. Pearl River Resort was able to fill

the resort with people the company knew had little intention

of venturing onto the gaming floor.⁷

• By using predictive analytics, Harrah’s was able to identify

a small group of customers who accounted for about 30% of

its overall gamblers (Binkley, 2000). These customers

spent between $100 and $499 per trip but actually

accounted for about 80% of the casino’s revenue and nearly

100% of the casino’s profits (Binkley, 2000).

Predicting a Patron's Future Worth

Modern casino analytics and patron management systems have

provided the gaming industry with an enormous amount of highly

detailed data about when, where, how often and how much patrons are

playing at a casino (Sutton, 2011). This information can be used

to better segment customers, predict future behavior, and

improve marketing outcomes (Sutton, 2011). “Patron analytics are

essential for maximizing revenue driven by mass market marketing

campaigns,” Sutton (2011) argues in his paper Patron Analytics in the

Casino and Gaming Industry: How the House Always Wins. Sutton

(2011) claims that, when it comes to casino patron analytics, casino

operators must seek answers to the following questions:

• How much is a patron worth, how much can we expect a

patron to lose in the future, and who are the most valuable

patrons?

⁷ http://www.sas.com/news/feature/02jun05/pr.html (Accessed April 2, 2012)

• Which patrons come together?

• Which patrons are most likely to abuse an offer?

• Which patrons are the most and least likely to respond to an

offer?

• Which offers perform the best?

The single most important thing patron analytics must

determine is the patron's worth to the casino property (Sutton, 2011).

Predicting a patron's future behavior is not an easy thing to do as it can

be affected by a number of variables, including total income,

expendable income, ethnicity, and reasons for a trip (convention vs.

vacation), among many others (Sutton, 2011). However,

“although that information is often available to append through third

parties, there is still a plentiful amount of information found with in-

house data and this data can be used to build models and metrics to

predict a patron's future worth” (Sutton, 2011). Once worth has been

determined, patrons can be “segmented into groups based on other

behaviors and effective marketing campaigns can be developed around

those behaviors” (Sutton, 2011).

Determining a patron's worth is imperative because it helps

reveal how valuable that patron is to the casino, and this is information

that can dictate how much should be reinvested in the patron in the

future (Sutton, 2011). Sutton (2011) argues that, “There are two main

components of worth – the financial sources of worth (i.e., gaming or

hotel) and the unit of time to which it refers (daily, weekly, monthly,

etc.). Additionally, worth can refer to historical worth, which is already

known, or future worth, which is unknown.”

Most revenue sources are fairly straightforward: room

revenue refers to how much the patron paid for the room, and restaurant

revenue is how much he or she paid for food and/or drinks

(Sutton, 2011). Gaming revenue, however, isn't as simple because

probability is involved and there are “two important measures used to

assess a patron's gaming worth – actual and theoretical loss” (Sutton,

2011). Actual loss refers to how much money the patron actually lost

(or won), “whereas theoretical loss usually refers to the amount of

money a patron is expected to lose based on the amount of money

wagered, the time spent playing, and the probability associated with the

type of games played” (Sutton, 2011). Whereas actual loss is generally

used to measure campaign performance and profitability, theoretical

loss relies more heavily upon predictive analytics and is a much

stronger predictor of future behavior (Sutton, 2011). For Sutton (2011),

the formulas used to calculate the theoretical loss for table games and

slots are as follows:

• Table Theoretical Loss = Average Bet x Time Played x

Speed of Game x House Advantage.

• Slot Theoretical Loss = Coin in x Hold Percentage.
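
The two formulas above translate directly into code. The sketch below is illustrative only: it assumes that time played is measured in hours and speed of game in decisions (hands) per hour, and the function names and sample figures are hypothetical.

```python
def table_theoretical_loss(average_bet, hours_played, hands_per_hour, house_advantage):
    """Average Bet x Time Played x Speed of Game x House Advantage."""
    return average_bet * hours_played * hands_per_hour * house_advantage

def slot_theoretical_loss(coin_in, hold_percentage):
    """Coin In x Hold Percentage."""
    return coin_in * hold_percentage

# Hypothetical table session: $25 average bet, 2 hours,
# 60 hands per hour, 1.5% house advantage.
table_theo = table_theoretical_loss(25, 2, 60, 0.015)

# Hypothetical slot session: $1,500 coin-in on a machine holding 8%.
slot_theo = slot_theoretical_loss(1500, 0.08)

print(round(table_theo, 2), round(slot_theo, 2))
```

With these made-up inputs the table session carries a theoretical loss of about $45 and the slot session about $120, regardless of what the patron actually won or lost.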

Once patron worth has been defined, data mining and

modeling techniques can be used to estimate predicted worth in the

future (Sutton, 2011). “Simple metrics based on historical behavior,

such as Average Daily Theoretical Loss or Average Trip Theoretical

Loss, will produce fairly accurate predictions of future worth,” Sutton

(2011) notes. “However, advanced predictive models are able to predict

worth with more accuracy and power by accounting for both patterns in

behavior over time and relationships between predictive inputs that

exist within casino data,” Sutton (2011) argues.
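
The simple historical metrics mentioned above can be sketched as follows. This assumes a hypothetical record layout of one (trip identifier, theoretical loss) pair per gaming day; the names and figures are made up for illustration.

```python
from collections import defaultdict

def average_daily_theoretical(daily_theo):
    """Mean theoretical loss per gaming day."""
    return sum(daily_theo) / len(daily_theo)

def average_trip_theoretical(records):
    """Mean theoretical loss per trip.

    records holds (trip_id, theo) pairs, one pair per gaming day.
    """
    per_trip = defaultdict(float)
    for trip_id, theo in records:
        per_trip[trip_id] += theo
    return sum(per_trip.values()) / len(per_trip)

# Hypothetical history: two trips covering three gaming days.
history = [("trip-1", 200.0), ("trip-1", 300.0), ("trip-2", 100.0)]
adt = average_daily_theoretical([theo for _, theo in history])
att = average_trip_theoretical(history)
print(adt, att)
```

Either figure can serve as a naive forecast of what the patron is worth on the next gaming day or the next trip.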

There are many different techniques that can be used to

develop models to predict future worth, the most common of which are

regression models (Sutton, 2011). Multiple regression models “utilize a

variety of predictors and the relationships between those predictors to

predict future worth,” Sutton (2011) states. As an example, Sutton

(2011) explains that, “a model built to predict future gaming trip worth

might be generated based on historical information about theoretical

win, actual win, credit line, time on device, nights stayed, and average

bet.”

Regression models could also be “built using such categorical

variables predictors as gender, ethnicity, age range, or demographic

variables” (Sutton, 2011). “Regression models are particularly

effective,” Sutton (2011) concludes, “because the model can be used to

score historical data to predict an unknown outcome, which is worth in

this case, with a certain degree of confidence.”
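
To make the idea of a multiple regression model concrete, the following sketch fits an ordinary least squares model in plain Python and uses it to score a patron. It is an illustration under stated assumptions, not Sutton's model: the two predictors (historical theoretical win per trip and nights stayed), the toy data and the function names are all hypothetical, and a production model would normally be built in a statistics package that also reports confidence intervals.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (intercept added)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefficients, row):
    """Score one patron: intercept plus weighted predictors."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))

# Hypothetical training data: (historical theo per trip, nights stayed)
# and the worth observed on the following trip.
rows = [(100, 1), (200, 2), (300, 1), (400, 3), (250, 2)]
next_trip_worth = [150, 250, 310, 430, 290]

model = fit_linear_regression(rows, next_trip_worth)
forecast = predict(model, (350, 2))
print(round(forecast, 2))
```

The fitted coefficients are then applied to each patron's historical record to produce a predicted worth for the next trip.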

Identifying the Casino's Most Valuable Patron

One way to determine who the casino's best patrons are is to try to

separate the skilled gamblers from the unskilled gamblers (Sutton,

2011). Most casino databases don't have a very good measure of skill;

however, it is possible to look at whether a patron is usually a loser or a

winner (Sutton, 2011). “A quick and easy way to evaluate a player's

skill is by calculating the percentage of trips where the player actually

lost money,” Sutton (2011) explains. For example, as Sutton (2011)

states:

Did a player with five trips lose money on all five of

those trips? Although this might just be an indicator

that the patron will play until he is out of money or

time, it is also a fairly simple way to identify the

patrons that do not come away as winners very often.

This is an instance where actual loss might be a good

predictor of worth, as we would rather have these

patrons in the casino.
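
The percentage-of-losing-trips measure described above is simple to compute. The sketch below assumes a hypothetical list of net results per trip, where a negative number means the player lost money; the function name and the figures are illustrative.

```python
def losing_trip_share(trip_results):
    """Fraction of trips on which the player lost money.

    trip_results holds the player's net result per trip:
    negative = player lost, positive = player won.
    """
    if not trip_results:
        return None  # no trips, no basis for a measure
    losses = sum(1 for result in trip_results if result < 0)
    return losses / len(trip_results)

# Hypothetical net results over five trips for two players.
always_loses = losing_trip_share([-200, -150, -320, -80, -410])
sometimes_wins = losing_trip_share([-200, 350, -120, 90, 600])
print(always_loses, sometimes_wins)
```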

Although slot machines are not really skill-based and losses

can be attributed more to luck, a differentiation can be made between

patrons by looking at behaviors and strategies of slot players (Sutton,

2011). “Since casinos have to pay a certain percentage of win or handle

to the slot manufacturer for participation games, patrons that primarily

play non-participation games are slightly more valuable to the casino”

(Sutton, 2011). Casino operators should keep in mind how much play a

slot patron has on participation machines compared to machines they

own outright.

Another thing to look at for slot players is their average bet

relative to the maximum bet on the games they play (Sutton, 2011). In

most cases, a patron has to play the maximum bet in order to be eligible

for jackpots and progressives (Sutton, 2011). Given two patrons of

comparable theoretical worth, the one who plays closer to the

maximum bet allowed is more likely to hit a jackpot than the other

(Sutton, 2011). Sutton (2011) argues, rather counterintuitively, that “the

patron with the higher average bet would seem to be more valuable, but

since the lower bet patron is less likely to hit a jackpot, the lower bet

patron might be a lower risk.” Although this metric is useful on

its own, casino management could also “use it as either a predictor in a

model for future worth or a decision tree predicting whether a patron

will respond” to an offer (Sutton, 2011).
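
Both slot metrics discussed in this section, the share of play on participation games and the average bet relative to the maximum bet, reduce to simple ratios. The sketch below uses hypothetical field names and figures and is meant only to show how the two measures might be derived before being fed into a model.

```python
def bet_ratio(average_bet, max_bet):
    """Average bet as a fraction of the maximum bet on the games played."""
    return average_bet / max_bet

def participation_share(participation_coin_in, total_coin_in):
    """Share of a patron's slot play on participation machines."""
    if not total_coin_in:
        return 0.0
    return participation_coin_in / total_coin_in

# Two hypothetical patrons playing games with a $3.00 maximum bet.
ratio_a = bet_ratio(2.70, 3.00)  # plays close to max bet
ratio_b = bet_ratio(0.75, 3.00)  # plays well below max bet

# Hypothetical patron with $200 of $1,000 coin-in on participation games.
share = participation_share(200, 1000)
print(round(ratio_a, 2), round(ratio_b, 2), round(share, 2))
```

A patron with a high bet ratio is more exposed to jackpots and progressives, while a high participation share means more of the patron's play is shared with the slot manufacturer.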

Identifying Patrons Who Come Together

Beyond a patron's worth is the combined worth of a household, which

“refers to the combined worth of multiple patrons that tend to make

their trips together” (Sutton, 2011). This can be difficult to identify

because patrons of the same household might stay in separate rooms,

take trips separately, or one patron might only come when accompanied

by another patron (Sutton, 2011). Although tricky, identifying

household worth can pay huge dividends by helping to “account for

revenue that looks like two separate individuals but can be combined

into one 'household'” (Sutton, 2011).

Although many patron management systems allow the linking

of accounts so that patrons who come together can be easily identified,

data mining can help to identify groups of patrons who come together

without linked accounts (Sutton, 2011). To do this, the casino must first

identify patrons who make their trips at the same time as one another.

Then a combination of various data points is studied to identify the

“households” (Sutton, 2011). For Sutton (2011), these include:

• Last name: this can identify relatives who come together.

• Address: this reveals roommates or patrons living together

with separate last names.

• Room or floor: patrons in a group tend to request rooms

that are near each other.

• City and State: this could reveal friends and/or relatives

who are from the same area.

• Time of day that games are played: this will reveal

players who are together on the casino floor, which may

indicate friends or family members.

• Game type: this reveals patrons who are playing in the

same location on the casino floor, or at least close to one

another.

• Restaurant and retail charges: this reveals patrons who

have charges from the same outlet on the same day.
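
One simple way to combine several of the data points listed above is to look at every pair of patrons whose stays overlap and count how many signals they share. The sketch below is a hypothetical illustration: the record fields, the choice of four signals and the threshold of two shared signals are all assumptions, and a real implementation would weigh the signals and add the gaming and retail data as well.

```python
from itertools import combinations

def shared_signals(a, b):
    """Count how many household signals two patron records share."""
    signals = 0
    if a["last_name"] == b["last_name"]:
        signals += 1
    if a["address"] == b["address"]:
        signals += 1
    if a["floor"] == b["floor"]:
        signals += 1
    if (a["city"], a["state"]) == (b["city"], b["state"]):
        signals += 1
    return signals

def trips_overlap(a, b):
    """True when two stays share at least one day (ISO date strings)."""
    return a["arrive"] <= b["depart"] and b["arrive"] <= a["depart"]

def candidate_households(records, min_signals=2):
    """Pairs of patrons with overlapping trips and enough shared signals."""
    pairs = []
    for a, b in combinations(records, 2):
        if trips_overlap(a, b) and shared_signals(a, b) >= min_signals:
            pairs.append((a["patron_id"], b["patron_id"]))
    return pairs

# Hypothetical stay records.
records = [
    {"patron_id": "P1", "last_name": "Smith", "address": "12 Oak St",
     "floor": 7, "city": "Jackson", "state": "MS",
     "arrive": "2012-03-01", "depart": "2012-03-03"},
    {"patron_id": "P2", "last_name": "Smith", "address": "12 Oak St",
     "floor": 7, "city": "Jackson", "state": "MS",
     "arrive": "2012-03-01", "depart": "2012-03-03"},
    {"patron_id": "P3", "last_name": "Jones", "address": "9 Elm Ave",
     "floor": 3, "city": "Tunica", "state": "MS",
     "arrive": "2012-03-02", "depart": "2012-03-04"},
]
print(candidate_households(records))
```

Pairs that surface repeatedly across trips are stronger household candidates than pairs that coincide once.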

Grouping patrons by “household worth” can reveal, for example, a group

of four patrons who might be of modest worth individually but who “come

together and stay in the same room every time and thus are worth more as a

group” (Sutton, 2011). Armed with this grouping information, a casino

can adjust its marketing effort and send an offer that is based on the

patrons' combined worth rather than on their individual worth (Sutton,

2011). It is a distinction that could be the difference between the offer's

acceptance or its rejection.

Minimizing Patron Abuse

Predictive models of worth should take into account the likelihood of a

guest playing on a future trip (Sutton, 2011). It is also advisable to

build a separate model that identifies patrons who are likely to use a

future offer but not play in the casino, thereby taking advantage of the

property (Sutton, 2011). “Since many offers in the casino industry tend

to be for complimentary rooms that are given to patrons upfront,

patrons that redeem offers and do not play have a considerable impact

on campaign success and profitability,” Sutton (2011) points out. For

marketing campaigns to be successful over the long term, it is

important to not only identify the patrons who are abusing the system,

but also to adjust the offers they receive (Sutton, 2011). According to

Sutton (2011) “Decision trees and logistic regression are common

statistical methods used to identify patron characteristics that predict

the likelihood of a patron (or segment of patrons) to abuse an offer.”
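
As a minimal sketch of the logistic regression approach mentioned above, the code below fits a model by gradient descent in plain Python. Everything specific in it is an assumption made for illustration: the two predictors (the share of past offers redeemed without play and a flag for a reported bad experience), the toy training data and the training settings. A real model would be trained on the casino's own history with a statistics package.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            features = [1.0] + list(row)
            score = sum(w * x for w, x in zip(weights, features))
            error = sigmoid(score) - label
            for i, x in enumerate(features):
                gradient[i] += error * x
        weights = [w - learning_rate * g / n
                   for w, g in zip(weights, gradient)]
    return weights

def abuse_probability(weights, row):
    """Predicted probability that a patron redeems an offer without playing."""
    features = [1.0] + list(row)
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

# Hypothetical history: (share of past offers redeemed without play,
# bad-experience flag) and whether the next offer was abused (1) or not (0).
rows = [(0.9, 1), (0.8, 0), (0.7, 1), (0.6, 1),
        (0.1, 0), (0.2, 0), (0.3, 1), (0.0, 0)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights = fit_logistic(rows, labels)
high_risk = abuse_probability(weights, (0.9, 1))
low_risk = abuse_probability(weights, (0.1, 0))
print(round(high_risk, 2), round(low_risk, 2))
```

Patrons whose predicted probability exceeds a threshold chosen by the marketing team could then be routed to an adjusted offer.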

Factors such as age, gender and history of abuse are likely

predictors of abuse (Sutton, 2011). Survey data from post-visit follow-

up surveys can be used to identify these predictors (Sutton, 2011). “If a

patron had a bad experience in the past, they might take an offer for a

free room as revenge for that bad experience,” Sutton (2011) warns.

But by identifying those patrons at risk of abusing offers, the casino

property can decide how best to market to these individuals (Sutton,

2011). Instead of receiving the general offer of a free room, the patron

would be sent an offer that requires them to play up to a certain level or

they would be required to pay for their room (Sutton, 2011).

Campaign Optimization

In addition to predicting the future worth of patrons, casino marketers

must know which marketing campaigns are the most effective at

driving up response rates, as well as which campaigns will increase

gambling revenues and property profits (Sutton, 2011). Understanding

a patron's probable future worth is critical in determining the eligible

reinvestment levels that make financial sense for the casino property

(Sutton, 2011).

“A patron's behavior and interests can be used to identify the

offer(s) that will be most appealing to each patron and generate the

most profitable response,” argues Sutton (2011). Although offers of free

rooms and free gaming play are historically the strongest drivers of

response, the cost associated with them can be detrimental to the

property (Sutton, 2011). Such offers don't always make good business

sense because “not every patron who is eligible for a free room has to

be offered a free room to respond – some might be willing to pay for a

discounted room or even a full price room” (Sutton, 2011). “By

analyzing the likelihood that a patron will respond to a certain offer or

offers, casino executives can optimize the offer that each patron is

given in order to maximize the amount of revenue and profit driven by

marketing campaigns as a whole,” concludes Sutton (2011).
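
The offer optimization described above can be pictured as an expected-profit comparison: for each candidate offer, multiply the patron's likelihood of responding by the profit the casino would earn if he or she does. The sketch below is a simplified, hypothetical illustration; the offers, costs, response probabilities and predicted trip worth are invented, and in practice the probabilities would come from a response model.

```python
def expected_profit(offer, response_probability, predicted_trip_worth):
    """Expected profit of sending one offer to one patron."""
    revenue = predicted_trip_worth + offer["room_revenue"]
    return response_probability * (revenue - offer["cost"])

def best_offer(offers, response_probabilities, predicted_trip_worth):
    """Pick the offer with the highest expected profit for a patron."""
    return max(
        offers,
        key=lambda o: expected_profit(
            o, response_probabilities[o["name"]], predicted_trip_worth
        ),
    )

# Hypothetical offers: cost to the property and room revenue collected.
offers = [
    {"name": "free room", "cost": 120.0, "room_revenue": 0.0},
    {"name": "discounted room", "cost": 0.0, "room_revenue": 60.0},
    {"name": "full-price room", "cost": 0.0, "room_revenue": 120.0},
]

# Hypothetical response probabilities for one patron.
response_probabilities = {
    "free room": 0.30,
    "discounted room": 0.22,
    "full-price room": 0.05,
}

choice = best_offer(offers, response_probabilities, predicted_trip_worth=250.0)
print(choice["name"])
```

In this made-up case the free room draws the highest response, but the discounted room yields the highest expected profit, which is the point Sutton makes about patrons who do not need a free room to respond.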

A/B testing is the most effective way to identify the best

available offer (Sutton, 2011). A/B testing “involves testing two

different offers against one another in order to identify the offer that

drives the highest response and the most revenue/profit” (Sutton, 2011).

Logistic regression, decision trees, and discriminant analysis can also

be used to cull through a casino's historical data and uncover factors

that are related to whether a patron responds to an offer or not (Sutton,

2011). “These factors can then be used to assess the likelihood of

response based on the similarity of a patron profile to that of

responders,” explains Sutton (2011). Obviously, “to build accurate and

predictive response models, historical data about response is required”

(Sutton, 2011). “The likelihood of response might be a broad measure

of response that refers to the likelihood that a patron will respond to

any offer, or it might be specific to the likelihood of response to a

specific type of offer,” warns Sutton (2011). Effective response models

have a dual purpose: they can help identify which patrons are most

likely to respond to an offer, as well as reveal which offer patrons are

most likely to respond to (Sutton, 2011). According to Sutton (2011),

“There are at least three main uses of response modeling that can

improve marketing results:

• Identify the likelihood of patrons to respond to the offer.

• Identify the offer(s) to which patrons are most likely to

respond.

• Predict when a patron is likely to return.”
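
When two offers are tested against one another in an A/B test, the casino also needs to know whether the difference in response is larger than chance alone would produce. One standard check, shown here as an assumption rather than as Sutton's method, is a two-proportion z-test on the response rates. The mailing sizes and response counts are hypothetical, and the sketch compares response only, not revenue or profit.

```python
import math

def two_proportion_z_test(responders_a, mailed_a, responders_b, mailed_b):
    """Return (z, two-sided p-value) for the difference in response rates."""
    rate_a = responders_a / mailed_a
    rate_b = responders_b / mailed_b
    pooled = (responders_a + responders_b) / (mailed_a + mailed_b)
    std_error = math.sqrt(
        pooled * (1 - pooled) * (1 / mailed_a + 1 / mailed_b)
    )
    z = (rate_b - rate_a) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical mailing: offer A and offer B each sent to 2,000 patrons.
z, p_value = two_proportion_z_test(120, 2000, 170, 2000)
print(round(z, 2), round(p_value, 4))
```

A small p-value suggests that offer B's higher response rate is unlikely to be a fluke of the sample; the same comparison can be repeated on revenue or profit per patron.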

Determining which offers a patron is most likely to respond to

is only half the battle. It is also important to know exactly when that

patron is planning to make his or her trip (Sutton, 2011).

Although it's almost impossible to know exactly when a patron is likely

to return (without them making a reservation, of course), there are a

variety of predictive analytics methods that can help take out the

guesswork (Sutton, 2011). Frequency analysis, regression, and survival

analysis can all be used to assess when a patron is likely to return to a

casino (Sutton, 2011).

Knowing when a patron is likely to return can help identify

patrons who haven't been at the property in a while and might be at risk

of leaving for good (Sutton, 2011). To identify these patrons, the casino

should discover the average or median time between a patron's trips

(Sutton, 2011). This could be segmented by geography, worth, or even

historical frequency, such as trips made weekly, monthly, quarterly,

annually, bi-annually, and so forth (Sutton, 2011). Patrons who “have

not made a trip within the decided amount of time for their segment are

subsequently flagged and dealt with appropriately” (Sutton, 2011).

Sutton (2011) believes a casino's marketing department should have

two primary goals: generate trips sooner than expected and convert

patrons into more frequent visitors.
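
The flagging of overdue patrons described above can be sketched with the median time between trips. This is an illustration under stated assumptions: it works per patron rather than per segment, and the tolerance of one and a half times the median gap, the function names and the trip dates are all hypothetical.

```python
from datetime import date
from statistics import median

def median_gap_days(trip_dates):
    """Median number of days between consecutive trips."""
    ordered = sorted(trip_dates)
    gaps = [(later - earlier).days
            for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps)

def is_overdue(trip_dates, today, tolerance=1.5):
    """Flag a patron whose absence exceeds tolerance x their median gap."""
    if len(trip_dates) < 2:
        return False  # not enough history to estimate a rhythm
    days_since_last = (today - max(trip_dates)).days
    return days_since_last > tolerance * median_gap_days(trip_dates)

# Hypothetical patron who has visited roughly once a month.
trips = [date(2011, 1, 15), date(2011, 2, 14),
         date(2011, 3, 16), date(2011, 4, 15)]

print(is_overdue(trips, date(2011, 5, 1)))
print(is_overdue(trips, date(2011, 7, 1)))
```

Flagged patrons are the natural audience for the retention offers discussed next.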

A lesser goal would be identifying patrons who are at risk of

leaving the property for good (Sutton, 2011). In cases such as these,

sending the patron an offer using “last chance” or “we miss you”

language could help retain them (Sutton, 2011). The offer contained

within should probably be better than any the patron has received in the

past to really catch their attention (Sutton, 2011). By knowing when a

patron is likely to return, a casino “can adjust marketing strategies

appropriately in order to save money on mail costs, retain guests and

increase loyalty” (Sutton, 2011). These are all important strategies that

should help drive profits straight to the casino's bottom line.

Conclusion

I started this chapter with a quote that asked the question, “Do you

want to ride the analytics horse to profitability...or follow it with a

shovel?” I believe the return on investment of implementing a

predictive analytics solution in the casino business would be

substantial. Casino patrons the world over like the same things – free

rooms, free gaming comps or free meals – all delivered within a

marketing campaign that taps into their own personal wants, desires,

needs and expectations. This can only be done with predictive

analytics.

Modern casino analytics and patron management systems

contain enormous amounts of highly detailed data about when, where,

how often and how much a casino patron is spending at a casino property

(Sutton, 2011). This information can be used to better segment customers,

to predict future behavior, and to improve marketing outcomes (Sutton,

2011). Sutton (2011) claims that, when it comes to casino patron analytics,

casino operators must seek answers to the following questions:

• How much is a patron worth, how much can we expect a

patron to lose in the future, and who are the casino's most

valuable patrons?

• Which patrons come together?

• Which patrons are most likely to abuse an offer?

• Which patrons are the most and least likely to respond to an

offer?

• Which offers perform the best?

Answers to all of these questions can be found by utilizing

predictive analytics, a variety of statistical techniques that analyze

current and historical facts to make predictions about future events.

Using such techniques as predictive modeling, machine learning, data

mining and game theory (the study of mathematical models of conflict and

cooperation between intelligent rational decision-makers; Myerson, 1991),

predictive analytics can build models that exploit patterns found in

historical and transactional data to identify

business risks and opportunities. Predictive analytics can be broken

down into three different types of models: predictive, descriptive and

decision. By combing through copious amounts of data to discover

patterns, trends and relationships that are too subtle or complex for

humans to detect, data mining, along with predictive analytics, can

identify potentially useful and understandable correlations and patterns

in a property's datasets.

For the casino industry, data mining can cull through copious

amounts of data coming from such disparate sources and

departments as sales, credit, and marketing, and then measure patron

behavior against more than a hundred different attributes. This is a far cry

from the three or four different attributes that statistical modeling used to

offer. The most common data mining techniques are linear regressions,

neural networks, logistic regressions, decision trees and A/B testing.

Predictive analytics can graph a customer’s value over time as

well as anticipate that customer’s behavior. From this analysis, a casino

operator can tailor highly specific, laser-focused marketing campaigns to

each customer in the casino’s patron database. Drawing on data from

casino player cards, predictive models can set budgets and calendars for

the casino's gamblers, calculating their predicted lifetime value in the

process. Armed with this information, casino properties can use predictive

analytics to:

• Create direct mailing campaigns.

• Create seasonal promotions.

• Plan the timing and placement of advertising campaigns.

• Create personalized advertisements.

• Define which market segments are growing most rapidly.

• Determine the number of rooms to reserve for wholesale

customers and business travelers.

Patron worth is the single most important thing patron analytics

must determine (Sutton, 2011). Once it has been determined, patrons can

be “segmented into groups based on other behaviors and effective

marketing campaigns can be developed around those behaviors” (Sutton,

2011). Determining a patron's worth is imperative because it helps reveal

how valuable that patron is to the casino, and this is information that can

dictate how much should be reinvested in the patron in the future (Sutton,

2011). Once patron worth has been defined, data mining and modeling

techniques can be used to estimate predicted worth into the future (Sutton,

2011).

Regression models are particularly effective for estimating patron worth

because the model can be used to score historical data to predict an

unknown outcome (Sutton, 2011). Multiple regression models “utilize a

variety of predictors and the relationships between those predictors to

predict future worth” (Sutton, 2011).

In addition to predicting the future worth of a patron, casino

marketers must know which of the marketing campaigns they are running are

the most effective at driving up response rates (Sutton, 2011).

Understanding a patron's probable future worth helps determine the

eligible reinvestment levels that make financial sense for the casino

property (Sutton, 2011).

A/B testing is one of the most effective ways to identify the best

available offer to be made to a patron, but logistic regression, decision

trees, and discriminant analysis can also be used successfully (Sutton,

2011). Effective response models have a dual purpose: they can help

identify which patrons are most likely to respond to an offer, as well as

reveal which offer patrons are most likely to respond to (Sutton, 2011).

Casino operators should keep in mind that data mining will only

be successful if their casino patrons are willing to provide information on

themselves. Privacy is a big issue and will always remain so in the mobile

age. Casino properties that can honor a patron's privacy demands will find

patron loyalty comes with voluminous amounts of priceless patron data.

This is data that can be used to create marketing campaigns that should

prove highly effective. By understanding what type of patron is on its

property, why they are there, and what they like to do while they are there,

casino operators can individualize their marketing campaigns so that these

campaigns are more effective, thereby increasing the casino property's

ROI.
