discriminant analysis group no. 4

DISCRIMINANT ANALYSIS

CAN WE ACCURATELY CLASSIFY CONSUMERS FROM SURVEY RESULTS?

NAMES PRN

AMIT 12020841123

ADVAIT 12020841116

ANANT 12020841119

SANGRAM 12020841120

SHARAD 12020841117

GROUP NO. 4

DISCRIMINANT ANALYSIS

• Discriminant analysis is a statistical procedure which allows us to classify cases in separate categories to which they belong on the basis of a set of characteristic independent variables called predictors or discriminant variables.

• The target variable (the one determining allocation into groups) is a qualitative (nominal or ordinal) one, while the characteristics are measured by quantitative variables.

• DA looks at the discrimination between two groups

• Multiple discriminant analysis (MDA) allows for classification into three or more groups.

APPLICATIONS OF DA

DA is especially useful for understanding the differences and factors that lead consumers to make different choices, allowing marketers to develop marketing strategies which take proper account of the role of the predictors.

Examples:

• Determinants of customer loyalty

• Shopper profiling and segmentation

• Determinants of purchase and non-purchase

EXAMPLE ON THE TRUST DATA-SET

• Purchasers of chicken at the butcher’s shop

• Respondents may belong to one of two groups

  • Those who purchase chicken at the butcher’s shop

  • Those who do not

• Discrimination between these groups through a set of consumer characteristics

  • Expenditure on chicken in a standard week

  • Age of the respondent

  • Whether respondents agree (on a seven-point ranking scale) that butchers sell safe chicken

  • Trust (on a seven-point ranking scale) towards supermarkets

• Does a linear combination of these four characteristics allow one to discriminate between those who buy chicken at the butcher’s and those who do not?

DISCRIMINANT ANALYSIS (DA)

• Two groups only, thus a single discriminating value (discriminating score)

• For each respondent a score is computed using the estimated linear combination of the predictors (the discriminant function)

• Respondents with a score above the discriminating value are expected to belong to one group, those below to the other group.

• When the discriminant score is standardized to have zero mean and unit variance, it is called a Z score

• DA also provides information about the discriminating power of each of the original predictors

MULTIPLE DISCRIMINANT ANALYSIS (MDA) (1)

Discriminant analysis may involve more than two groups, in which case it is termed multiple discriminant analysis (MDA).

Example from the Trust data-set• Dependent variable: Type of chicken purchased ‘in a typical week’, choosing among four categories: value (good value for money), standard, organic and luxury

• Predictors: age , stated relevance of taste , value for money and animal welfare , plus an indicator of income

MULTIPLE DISCRIMINANT ANALYSIS (2)

• In this case there will be more than one discriminant function.

• The exact number of discriminant functions is equal to either (g-1), where g is the number of categories in classification or to k, the number of independent variables, whichever is the smaller

• Trust example: four groups and five explanatory variables, the number of discriminant functions is three (that is g-1 which is smaller than k=5).
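The counting rule above can be written as a one-line helper. This is a minimal sketch; the function name `n_discriminant_functions` is illustrative and is not part of SPSS or any library.

```python
def n_discriminant_functions(g, k):
    """Number of discriminant functions: the smaller of (g - 1) and k.

    g: number of groups in the target variable
    k: number of predictors
    """
    return min(g - 1, k)


# Trust example: four chicken types, five predictors
trust_mda = n_discriminant_functions(4, 5)   # 3
# Two-group butcher example: two groups, four predictors
trust_da = n_discriminant_functions(2, 4)    # 1
```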

THE OUTPUT OF MDA

Similarities with factor (principal component) analysis:

• The first discriminant function is the most relevant for discriminating across groups, the second is the second most relevant, etc.

• the discriminant functions are also independent, which means that the resulting scores are non-correlated.

• Once the coefficients of the discriminant functions are estimated and standardized, they are interpreted in a similar fashion to the factor loadings.

• The larger the standardised coefficients (in absolute terms), the more relevant the respective variables to discriminating between groups

There is no single discriminant score in MDA:

• Group means (centroids) are computed for each of the discriminant functions to have a clearer view of the classification rule

RUNNING DISCRIMINANT ANALYSIS (2 GROUPS)

Discriminant function (target variable: purchasers of chicken at the butcher’s shop):

$$z = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4$$

Here $z$ is the discriminant score and $x_1, \dots, x_4$ are the predictors:

• weekly expenditure on chicken

• age

• safety of butcher’s chicken

• trust in supermarkets

The discriminant coefficients $\alpha_0, \dots, \alpha_4$ need to be estimated

FISHER’S LINEAR DISCRIMINANT ANALYSIS

The discriminant function is the starting point.

Two key assumptions behind linear DA:

(a) the predictors are normally distributed;

(b) the covariance matrices for the predictors within each of the groups are equal.

Departure from condition (a) should suggest use of alternative methods.

Departure from condition (b) requires the use of different discriminant techniques (usually quadratic discriminant functions).

In most empirical cases, the use of linear DA is appropriate.

ESTIMATION

The first step is the estimation of the coefficients, also termed discriminant coefficients or weights.

Estimation is similar to factor analysis or PCA, as the coefficients are those which maximize the variability between groups

In MDA, the first discriminating function is the one with the highest between-group variability, the second discriminating function is independent from the first and maximizes the remaining between-group variability and so on
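To make the estimation idea concrete, here is a minimal sketch of Fisher's rule for the two-group case with two predictors, in plain Python. The toy data, the function names and the equal-priors midpoint cut-off are all assumptions made for illustration; the weights are left unscaled, so they will not match the scaling SPSS applies to its canonical coefficients.

```python
def mean_vector(rows):
    """Column means of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, centre):
    """2x2 sum of squares and cross-products around the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - centre[0], r[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_weights(group0, group1):
    """Weights w = S^-1 (m1 - m0), with S the pooled within-group covariance."""
    m0, m1 = mean_vector(group0), mean_vector(group1)
    s0, s1 = scatter(group0, m0), scatter(group1, m1)
    dof = len(group0) + len(group1) - 2
    pooled = [[(s0[i][j] + s1[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Multiply the explicit inverse of the 2x2 pooled matrix by the mean difference
    w = [(pooled[1][1] * diff[0] - pooled[0][1] * diff[1]) / det,
         (pooled[0][0] * diff[1] - pooled[1][0] * diff[0]) / det]
    # With equal priors the cut-off is the score of the midpoint between the means
    midpoint = [(m0[j] + m1[j]) / 2 for j in range(2)]
    cutoff = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, cutoff


# Toy data (hypothetical): two predictors, three cases per group
group0 = [(1, 2), (2, 1), (3, 3)]
group1 = [(4, 5), (5, 4), (6, 6)]
weights, cutoff = fisher_weights(group0, group1)


def score(x):
    """Discriminant score of one case (no constant term in this sketch)."""
    return weights[0] * x[0] + weights[1] * x[1]
```

A case is then allocated to the second group when `score(x)` exceeds `cutoff`, and to the first group otherwise.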

SPSS – TWO GROUPS CASE

1. Choose the target variable

2. Define the range of the dependent variable

3. Select the predictors

COEFFICIENT ESTIMATES

Fisher’s and standardized estimates of the discriminant function coefficients need to be requested

Additional statistics and diagnostics can be requested in the same dialog

CLASSIFICATION OPTIONS

Decide whether prior probabilities are equal across groups or group sizes reflect different allocation probabilities

The display options are diagnostic indicators to evaluate how well the discriminant function predicts the groups

SAVE CLASSIFICATION


Create new variables in the data-set, containing the predicted group membership and/or the discriminant score for each case and each function

OUTPUT – COEFFICIENT ESTIMATES

Canonical Discriminant Function Coefficients (unstandardized coefficients)

Predictor                                               Function 1
In a typical week how much do you spend on
fresh or frozen chicken (Euro)?                               .095
From the butcher                                              .454
Supermarkets                                                 -.297
Age                                                           .025
(Constant)                                                  -2.515

Standardized Canonical Discriminant Function Coefficients

Predictor                                               Function 1
In a typical week how much do you spend on
fresh or frozen chicken (Euro)?                               .378
From the butcher                                              .748
Supermarkets                                                 -.453
Age                                                           .394

• Unstandardized coefficients depend on the measurement unit

• Standardized coefficients do not depend on the measurement unit

• Most important predictor: ‘From the butcher’ (safety of butcher’s chicken), which has the largest standardized coefficient in absolute terms (.748)

• Trust in supermarkets has a negative sign (thus it reduces the discriminant score)
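As an illustration of how the unstandardized coefficients are used, the sketch below computes the discriminant score for one respondent. The coefficients are those reported in the output above; the respondent's answers and the dictionary keys are hypothetical and chosen only for the example.

```python
# Unstandardized coefficients from the SPSS output
constant = -2.515
coefficients = {
    "expenditure": 0.095,         # weekly spend on chicken (Euro)
    "butcher_safe": 0.454,        # agreement that butchers sell safe chicken (1-7)
    "trust_supermarket": -0.297,  # trust in supermarkets (1-7)
    "age": 0.025,                 # age in years
}

# Hypothetical respondent (values invented for illustration)
respondent = {
    "expenditure": 10,
    "butcher_safe": 6,
    "trust_supermarket": 3,
    "age": 40,
}

# Discriminant score: constant plus the weighted sum of the predictors
z = constant + sum(coefficients[name] * respondent[name] for name in coefficients)
# z is about 1.268 for this respondent
```

Raising the respondent's trust in supermarkets by one point would lower the score by 0.297, which is what the negative sign of that coefficient expresses.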

CENTROIDS

Prior Probabilities for Groups

                     Cases Used in Analysis
Butcher    Prior     Unweighted    Weighted
no          .660            277     277.000
yes         .340            143     143.000
Total      1.000            420     420.000

Functions at Group Centroids

Butcher    Function 1
no              -.307
yes              .594

Unstandardized canonical discriminant functions evaluated at group means

These are the means of the discriminant score for each of the two groups

Thus, the group of those not purchasing chicken at the butcher’s shop has a negative centroid

With two groups, the discriminating score is zero

This can be computed by weighting the centroids with the prior probabilities

From these prior probabilities it follows that the discriminating score is -0.307 x 0.66 + 0.594 x 0.34 ≈ 0 (the small remainder, about -0.001, is due to rounding)
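The weighting of the centroids can be checked with a few lines of Python. This is a sketch of the slide's cut-off rule only; SPSS itself allocates cases through posterior probabilities, so the helper `predicted_group` is a simplification introduced here for illustration.

```python
priors = {"no": 0.66, "yes": 0.34}
centroids = {"no": -0.307, "yes": 0.594}

# Centroids weighted by the prior probabilities: close to zero by construction
weighted_mean = sum(priors[g] * centroids[g] for g in priors)


def predicted_group(z, cutoff=0.0):
    """Allocate a case by comparing its discriminant score with the cut-off."""
    return "yes" if z > cutoff else "no"
```

Because the 'yes' centroid is positive, scores above the cut-off are allocated to the group that buys chicken at the butcher's shop.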

OUTPUT – CLASSIFICATION SUCCESS

Classification Results (a)

                                     Predicted Group Membership
Original    Butcher                  no        yes        Total
Count       no                      244         33          277
            yes                      88         55          143
            Ungrouped cases           1          1            2
%           no                     88.1       11.9        100.0
            yes                    61.5       38.5        100.0
            Ungrouped cases        50.0       50.0        100.0

a. 71.2% of original grouped cases correctly classified.

Using the discriminant function, it is possible to correctly classify 71.2% of original cases: (244 no-no + 55 yes-yes)/420
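The percentage of correct predictions follows directly from the counts in the classification table. The short sketch below reproduces it; the variable names are illustrative.

```python
# Counts from the classification table: actual group -> predicted group
confusion = {
    "no":  {"no": 244, "yes": 33},
    "yes": {"no": 88,  "yes": 55},
}

n_cases = sum(sum(row.values()) for row in confusion.values())   # 420
n_correct = sum(confusion[g][g] for g in confusion)              # 299
hit_ratio = 100 * n_correct / n_cases                            # about 71.2

# Share of each actual group that is classified correctly
correct_by_group = {
    g: 100 * confusion[g][g] / sum(confusion[g].values()) for g in confusion
}
```

The per-group figures show that the function recognises non-purchasers (88.1%) far better than purchasers (38.5%), which the overall 71.2% hides.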

DIAGNOSTICS (1)

• Box’s M test. This tests whether covariances are equal across groups

• Wilks’ Lambda (or U statistic) tests discrimination between groups. It is related to analysis of variance.

• Individual Wilks’ Lambda for each of the predictors in a discriminant function: a univariate ANOVA (are there significant differences in the predictor’s means between the groups?), with the p-value taken from the F distribution.

• Wilks’ Lambda for the function as a whole: are there significant differences in the group means for the discriminant function? The p-value is taken from the Chi-square distribution.

• The overall Wilks’ Lambda is especially helpful in multiple discriminant analysis as it allows one to discard those functions which do not contribute towards explaining differences between groups.

DIAGNOSTICS (2)

DA returns one eigenvalue (or more eigenvalues for MDA) of the discriminant function. These can be interpreted as in principal component analysis

In MDA (more than one discriminant function) eigenvalues are exploited to compute how each function contributes to explain variability

The canonical correlation measures the intensity of the relationship between the groups and the single discriminant function

TRUST EXAMPLE: DIAGNOSTICS

Statistic                    Value    P-value   Interpretation
Box's M statistic             37.3      0.000   Covariance matrices are not equal
Overall Wilks' Lambda         0.85      0.000   The overall discriminating power of the DF is good
Wilks' Lambda for:                              All of the predictors are relevant to discriminating between the two groups
  Expenditure                 0.98      0.002
  Age                         0.97      0.001
  Safer for Butcher           0.91      0.000
  Trust in Supermarket        0.98      0.002
Eigenvalue                    0.18              Ratio of between-group to within-group variance (the larger the better)
Canonical correlation         0.39              Square root of the ratio of between-group to total variability
% of correct predictions     71.2%
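With a single discriminant function, the eigenvalue, the canonical correlation and the overall Wilks' Lambda are tied together by standard identities. The sketch below applies them to the reported eigenvalue of 0.18; results agree with the table only up to rounding, since the eigenvalue itself is rounded.

```python
import math

eigenvalue = 0.18  # between-group / within-group variability of the scores

# Canonical correlation: square root of between-group over total variability
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))

# Wilks' Lambda: within-group over total variability
wilks_lambda = 1 / (1 + eigenvalue)
```

Both values round to the figures in the table (0.39 and 0.85), which is a quick consistency check on the output.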

MDA


To run MDA in SPSS the only difference is that the range has more than two categories

PREDICTORS

Box's M Test Results (tests the null hypothesis of equal population covariance matrices)

Box's M           65.212
Approx. F          1.382
df1                   45
df2            53286.386
Sig.                .045

The null hypothesis of equal covariance matrices is rejected at the 95% confidence level, but not at the 99% confidence level.

Tests of Equality of Group Means

Predictor                           Wilks' Lambda        F    df1    df2    Sig.
Age                                          .981    1.798      3    282    .148
Tasty food                                   .971    2.761      3    282    .042
Value for money                              .960    3.878      3    282    .010
Animal welfare                               .982    1.679      3    282    .172
Gross annual household income range          .919    8.272      3    282    .000

Only three predictors (tasty food, value for money and income) appear to be relevant in discriminating among preferred types of chicken

DISCRIMINANT FUNCTIONS

Eigenvalues

Function    Eigenvalue    % of Variance    Cumulative %    Canonical Correlation
1              .102 (a)            61.0            61.0                     .304
2              .051 (a)            30.8            91.8                     .221
3              .014 (a)             8.2           100.0                     .116

a. First 3 canonical discriminant functions were used in the analysis.

Three discriminant functions (four groups minus one) can be estimated

Wilks' Lambda

Test of Function(s)    Wilks' Lambda    Chi-square    df    Sig.
1 through 3                     .851        45.098    15    .000
2 through 3                     .938        17.904     8    .022
3                               .986         3.818     3    .282

The first two discriminant functions have a significant discriminating power.
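The columns of the two tables above can be recovered from the three eigenvalues alone, using the standard relationships between eigenvalues, canonical correlations and Wilks' Lambda. This is a sketch for checking the output, not a description of how SPSS computes it; small differences from the table arise because the eigenvalues are rounded to three decimals.

```python
import math

eigenvalues = [0.102, 0.051, 0.014]  # functions 1, 2, 3

# Share of between-group variability explained by each function
total = sum(eigenvalues)
pct_variance = [100 * ev / total for ev in eigenvalues]

# Canonical correlation of each function
canonical_corr = [math.sqrt(ev / (1 + ev)) for ev in eigenvalues]


def wilks_from(first):
    """Wilks' Lambda for functions first..last: product of 1 / (1 + eigenvalue)."""
    value = 1.0
    for ev in eigenvalues[first:]:
        value /= (1 + ev)
    return value


# Tests '1 through 3', '2 through 3' and '3'
wilks = [wilks_from(i) for i in range(len(eigenvalues))]
```

The sequential Lambdas explain why the third function can be discarded: once the first two are removed, the remaining Lambda is close to one.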

COEFFICIENTS

Discriminant functions’ coefficients

                                          Unstandardized        Standardized
Predictor                                    1         2          1         2
Value for money                          -.043      .603      -.053      .746
Age                                      -.009     -.013      -.148     -.208
Tasty food                                .169      .416       .152      .374
Animal welfare                            .186     -.132       .313     -.222
Gross annual household income range       .652     -.033       .870     -.044
(Constant)                              -2.298    -4.868

Income is very relevant for the first function

Value for money is very relevant for the second function

STRUCTURE MATRIX

Structure Matrix

                                   Function
Predictor                   1          2          3
Income                  .929*      -.021       .078
Animal welfare          .390*      -.206       .125
Value for money         -.010      .891*       .168
Tasty food               .241      .660*       .273
Age                     -.217      -.204      .944*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.

*. Largest absolute correlation between each variable and any discriminant function

The values in the structure matrix are the correlations between the individual predictors and the scores computed on the discriminant functions.

For example, the income variable has a strong correlation with the scores of the first function

The structure matrix helps in interpreting the functions:

• Function 1: income

• Function 2: value and taste

• Function 3: age

CENTROIDS

Functions at Group Centroids

In a typical week, what type of fresh or
frozen chicken do you buy for your                Function
household's home consumption?                 1         2         3
'Value' chicken                           -.673     -.262     -.040
'Standard' chicken                         .058      .156     -.065
'Organic' chicken                          .525     -.470     -.030
'Luxury' chicken                           .003      .052      .242

Unstandardized canonical discriminant functions evaluated at group means

The first function discriminates well between value and organic (income matters to organic buyers)

The second allows some discrimination standard-organic, value-standard, organic-luxury (taste and value matter)
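One way to read the centroid table is as a rough allocation rule: a case is closest to the group whose centroid lies nearest to its scores. The sketch below implements that nearest-centroid idea as a simplifying assumption; it ignores prior probabilities, which SPSS does take into account, and the example scores are hypothetical.

```python
# Group centroids on the three discriminant functions (from the table above)
centroids = {
    "value":    (-0.673, -0.262, -0.040),
    "standard": (0.058, 0.156, -0.065),
    "organic":  (0.525, -0.470, -0.030),
    "luxury":   (0.003, 0.052, 0.242),
}


def squared_distance(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(scores):
    """Name of the group whose centroid is closest to the given scores."""
    return min(centroids, key=lambda g: squared_distance(scores, centroids[g]))


# Hypothetical case: high on function 1 (income), low on function 2
example_scores = (0.6, -0.4, 0.0)
example_group = nearest_centroid(example_scores)
```

For the hypothetical scores above the nearest centroid is the organic one, in line with the reading that income drives the first function.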

PLOT OF TWO FUNCTIONS


The ‘territorial map’ shows the scores for the first two functions considering all groups

Tick ‘separate-groups’ to show graphs of the first two functions for each individual group

PLOTS: INDIVIDUAL GROUPS

Example: organic chicken

Most cases tend to be relatively high on function 1 (income)

PLOTS – ALL GROUPS

PREDICTION RESULTS

Classification Results (a)

In a typical week, what type of fresh or frozen chicken do you buy for your household's home consumption?

                                          Predicted Group Membership
Original    Actual group            'Value'   'Standard'   'Organic'   'Luxury'    Total
Count       'Value' chicken               3           38           0          0       41
            'Standard' chicken            2          154           1          0      157
            'Organic' chicken             1           30           4          0       35
            'Luxury' chicken              1           51           1          0       53
            Ungrouped cases               0           51           3          0       54
%           'Value' chicken             7.3         92.7          .0         .0    100.0
            'Standard' chicken          1.3         98.1          .6         .0    100.0
            'Organic' chicken           2.9         85.7        11.4         .0    100.0
            'Luxury' chicken            1.9         96.2         1.9         .0    100.0
            Ungrouped cases              .0         94.4         5.6         .0    100.0

a. 56.3% of original grouped cases correctly classified.

The functions do not predict well; most units are allocated to standard chicken, and overall only 56.3% of the cases are allocated correctly
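The weakness of these predictions can be quantified from the counts in the table. The sketch below recomputes the hit ratio and, as an informal benchmark that is not part of the SPSS output, compares it with the share of cases that actually buy standard chicken.

```python
groups = ["value", "standard", "organic", "luxury"]

# Rows: actual group; columns: predicted group, in the order of `groups`
counts = {
    "value":    [3, 38, 0, 0],
    "standard": [2, 154, 1, 0],
    "organic":  [1, 30, 4, 0],
    "luxury":   [1, 51, 1, 0],
}

n_cases = sum(sum(row) for row in counts.values())            # 286 grouped cases
n_correct = sum(counts[g][i] for i, g in enumerate(groups))   # diagonal: 161
hit_ratio = 100 * n_correct / n_cases

# Share of cases the functions allocate to 'standard'
predicted_standard = 100 * sum(row[1] for row in counts.values()) / n_cases

# Share of cases that actually belong to 'standard'
actual_standard = 100 * sum(counts["standard"]) / n_cases
```

About 95% of cases are predicted as standard, and simply assigning everyone to standard would already be right for about 55% of cases, so the 56.3% hit ratio is a very small improvement.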

STEPWISE DISCRIMINANT ANALYSIS

As for linear regression, it is possible to decide whether all predictors should appear in the equation regardless of their role in discriminating (the Enter option) or a sub-set of predictors is chosen on the basis of their contribution to discriminating between groups (the Stepwise method)

THE STEP-WISE METHOD

1. A one-way ANOVA test is run on each of the predictors, where the target grouping variable determines the treatment levels. The ANOVA test provides a criterion value and a test statistic (usually the Wilks Lambda). According to the criterion value, it is possible to identify the predictor which is most relevant in discriminating between the groups

2. The predictor with the lowest Wilks Lambda (or which meets an alternative optimality criterion) enters the discriminating function, provided the p-value is below the set threshold (for example 5%).

3. An ANCOVA test is run on each of the remaining predictors, where the target grouping variable is the factor and the predictors that have already entered the model are the covariates. The Wilks Lambda is computed for each of these ANCOVA models.

4. Again, the criteria and the p-value determine which variable (if any) enters the discriminating function (and possibly whether some of the entered variables should leave the model).

5. The procedure goes back to step 3 and continues until none of the excluded variables has a p-value below the threshold and none of the entered variables has a p-value above the threshold (the stopping rule is met).

ALTERNATIVE CRITERIA

• Unexplained variance

• Smallest F ratio

• Mahalanobis distance

• Rao’s V

IN SPSS

The step-wise method allows selection of relevant predictors

OUTPUT OF THE STEP-WISE METHOD

Variables in the Analysis

Step    Variable            Tolerance    F to Remove    Wilks' Lambda
1       Income                  1.000          8.272
2       Income                  1.000          8.241             .960
        Value for money         1.000          3.863             .919

Variables Not in the Analysis

Step    Variable            Tolerance    Min. Tolerance    F to Enter    Wilks' Lambda
0       Age                     1.000             1.000         1.798             .981
        Tasty food              1.000             1.000         2.761             .971
        Value for money         1.000             1.000         3.878             .960
        Animal welfare          1.000             1.000         1.679             .982
        Income                  1.000             1.000         8.272             .919
1       Age                      .988              .988         1.507             .905
        Tasty food               .991              .991         2.437             .896
        Value for money         1.000             1.000         3.863             .883
        Animal welfare           .992              .992         1.052             .909
2       Age                      .987              .987         1.549             .868
        Tasty food               .821              .821          .793             .875
        Animal welfare           .992              .992         1.057             .873

Only two predictors (income and value for money) are kept in the model
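The 'F to Enter' values can be approximated from the Wilks' Lambdas in the table with the usual partial-F formula for adding one variable. The sketch below assumes n = 286 grouped cases and g = 4 groups (consistent with the df2 of 282 reported earlier); the function name is illustrative, and the results differ slightly from the SPSS figures because the Lambdas are rounded to three decimals.

```python
def f_to_enter(lambda_before, lambda_after, n, g, p):
    """Partial F for adding one predictor to a model that already holds p predictors.

    lambda_before: Wilks' Lambda of the current model (1.0 if it is empty)
    lambda_after:  Wilks' Lambda once the candidate predictor is added
    n: number of cases, g: number of groups
    """
    partial = lambda_after / lambda_before
    return ((n - g - p) / (g - 1)) * (1 - partial) / partial


n_cases, n_groups = 286, 4

# Step 0: income enters an empty model (Lambda goes from 1 to .919)
f_income = f_to_enter(1.0, 0.919, n_cases, n_groups, p=0)

# Step 1: value for money is added to the model with income (.919 -> .883)
f_value = f_to_enter(0.919, 0.883, n_cases, n_groups, p=1)
```

The two results are close to the reported 8.272 and 3.863, which shows how each step's entry statistic is driven by the drop in Wilks' Lambda.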

APPLICATIONS IN MARKETING:

Having covered the technical aspects of this useful technique, we can conclude that DA has the following applications in the field of marketing:

• Market segmentation – Discriminant analysis, a multivariate technique for predicting group membership, is often used for market segmentation because of its ability to classify individuals or experimental units into two or more uniquely defined populations.

• Product research – Distinguish between heavy, medium, and light users of a product in terms of their consumption habits and lifestyles.

• Perception/Image research – Distinguish between customers who exhibit favorable perceptions of a store or company and those who do not.

• Advertising research – Identify how market segments differ in media consumption habits.

• Direct marketing – Identify the characteristics of consumers who will respond to a direct marketing campaign and those who will not.

THANK YOU
