factor analysis

Factor Analysis N P Singh Professor

Upload: chinnu

Post on 23-Dec-2015


TRANSCRIPT

Page 1: Factor Analysis

Factor Analysis
N P Singh, Professor

Page 2: Factor Analysis

Factor analysis was invented by psychologist Charles Spearman

History

Page 3: Factor Analysis

Combination of original variables

What is a factor?

Page 4: Factor Analysis

Grades in:

Student No.   Finance (Y1)   Marketing (Y2)   Policy (Y3)
1             3              6                5
2             7              3                3
3             10             9                8
4             3              9                7
5             10             6                5

Example
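A natural first look at these grades is their correlation matrix, since factor analysis works from the correlations among the observed variables. The following is a minimal sketch using NumPy; the variable names are illustrative and not part of the original slides.

```python
import numpy as np

# Grades of the five students from the table above
# (columns: Finance Y1, Marketing Y2, Policy Y3).
grades = np.array([
    [3, 6, 5],
    [7, 3, 3],
    [10, 9, 8],
    [3, 9, 7],
    [10, 6, 5],
], dtype=float)

# rowvar=False tells NumPy that each column is a variable
# and each row is an observation (a student).
R = np.corrcoef(grades, rowvar=False)
print(np.round(R, 2))
```

In this small sample, Marketing and Policy are strongly correlated with each other and nearly uncorrelated with Finance, which is the kind of pattern a two-factor explanation would try to account for.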

Page 5: Factor Analysis

It has been suggested that these grades are functions of two underlying factors, F1 and F2, tentatively described as quantitative ability and verbal ability, respectively.

It is assumed that each Y variable is linearly related to the two factors, as given in next slide.

Examples

Page 6: Factor Analysis

Factors

Page 7: Factor Analysis

The error terms e1, e2, and e3 serve to indicate that the hypothesized relationships are not exact.

The parameters βij are referred to as loadings. For example, β12 is called the loading of variable Y1 on factor F2.

What are these Error Terms?
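For reference, the linear relationships described on these slides can be written out as follows. This is a sketch of the usual form of the model; the symbol β for the loadings and the intercept terms β_{i0} are notational assumptions, since the slide with the equations did not survive in the transcript.

```latex
\begin{aligned}
Y_1 &= \beta_{10} + \beta_{11} F_1 + \beta_{12} F_2 + e_1 \\
Y_2 &= \beta_{20} + \beta_{21} F_1 + \beta_{22} F_2 + e_2 \\
Y_3 &= \beta_{30} + \beta_{31} F_1 + \beta_{32} F_2 + e_3
\end{aligned}
```

Here \beta_{ij} is the loading of variable Y_i on factor F_j, and e_i is the error term of Y_i.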

Page 8: Factor Analysis

Factor Analysis

A data reduction technique designed to represent a wide range of attributes on a smaller number of dimensions.

Page 9: Factor Analysis

In this MBA program, finance is highly quantitative, while marketing and policy have a strong qualitative orientation.

Quantitative skills should help a student in finance, but not in marketing or policy.

Verbal skills should be helpful in marketing or policy but not in finance.

In other words, it is expected that the loadings have roughly the following structure:

Continued ...

Page 10: Factor Analysis

Continued: It is expected that the loadings have roughly the following structure:
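The structure itself is not preserved in the transcript. Based only on the reasoning stated above (quantitative ability F1 helps in finance but not in marketing or policy, and verbal ability F2 helps in marketing and policy but not in finance), the expected pattern can be sketched as below, where "+" stands for a sizeable positive loading and "0" for a loading near zero. The layout is an assumption.

```latex
\begin{array}{c|cc}
      & F_1 & F_2 \\ \hline
Y_1 \ (\text{Finance})   & + & 0 \\
Y_2 \ (\text{Marketing}) & 0 & + \\
Y_3 \ (\text{Policy})    & 0 & +
\end{array}
```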

Page 11: Factor Analysis

The Common Factor Model

Page 12: Factor Analysis

This model proposes that each observed response (measure 1 through measure 5) is influenced partially by underlying common factors (factor 1 and factor 2) and partially by underlying unique factors (E1 through E5).

The strength of the link between each factor and each measure varies, such that a given factor influences some measures more than others.

The Common Factor Model

Page 13: Factor Analysis

Factor Analysis

For example, suppose that a bank asked a large number of questions about a given branch. Consider how the following characteristics might be more parsimoniously represented by just a few constructs (factors).

Page 14: Factor Analysis

Factor Analysis

Page 15: Factor Analysis

Factor Analysis

- Benefits include: (1) a more concise representation of the marketing situation, which may enhance communication; (2) fewer questions may be required on future surveys; and (3) perceptual maps become feasible.

- Ideally, interval data (e.g., ratings on a 7-point scale) on consumers' perceptions of a number of features, such as those noted above for the bank, are gathered.

Page 16: Factor Analysis

personality.sav - a set of responses from a personality questionnaire.

SAQ.sav - fictional statistics anxiety questionnaire from Andy Field's textbook resources

Examples of Data

Page 17: Factor Analysis

The purpose of PCA is to derive a relatively small number of components that can account for the variability found in a relatively large number of measures.

This procedure, called data reduction, is typically performed when a researcher does not want to include all of the original measures in analyses but still wants to work with the information that they contain.

Principal Component Analysis

Page 18: Factor Analysis

PCA Model

Page 19: Factor Analysis

The first difference is that the direction of influence is reversed: EFA assumes that the measured responses are based on the underlying factors while in PCA the principal components are based on the measured responses.

The second difference is that EFA assumes that the variance in the measured variables can be decomposed into that accounted for by common factors and that accounted for by unique factors. The principal components are defined simply as linear combinations of the measurements, and so will contain both common and unique variance.

Difference
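The reversed direction of influence can be made concrete by writing the two models side by side. This is a sketch in generic notation (X_j for the measured responses, F_k for common factors, E_j for unique factors, C_k for principal components, and λ and w for weights); none of these symbols come from the original slides.

```latex
\begin{aligned}
\text{EFA:}\quad & X_j = \lambda_{j1} F_1 + \lambda_{j2} F_2 + \dots + \lambda_{jm} F_m + E_j \\
\text{PCA:}\quad & C_k = w_{k1} X_1 + w_{k2} X_2 + \dots + w_{kp} X_p
\end{aligned}
```

In the EFA line the measures sit on the left and are explained by the factors; in the PCA line the component sits on the left and is built from the measures, so it carries whatever common and unique variance those measures contain.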

Page 20: Factor Analysis

FA and PCA (principal components analysis) are methods of data reduction
◦ Take many variables and explain them with a few “factors” or “components”
◦ Correlated variables are grouped together and separated from other variables with low or no correlation

What is FA & PCA?

Page 21: Factor Analysis

FA and PCA are not much different from canonical correlation in terms of generating canonical variates from linear combinations of variables
◦ Although there are now no “sides” of the equation
◦ And you're not necessarily correlating the “factors”, “components”, “variates”, etc.

What is FA & PCA?

Page 22: Factor Analysis

FA produces factors; PCA produces components

Factors cause variables; components are aggregates of the variables

FA vs. PCA conceptually

Page 23: Factor Analysis

FA analyzes only the variance shared among the variables (common variance without error or unique variance); PCA analyzes all of the variance

FA: “What are the underlying processes that could produce these correlations?”; PCA: Just summarize empirical associations, very data driven

FA vs. PCA conceptually
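One way to see the "shared variance versus all variance" distinction is to look at the matrix each method analyzes. The sketch below assumes one common FA approach (principal axis factoring with squared multiple correlations as initial communality estimates); the slides do not say which extraction method is intended, and the correlation matrix is made up for illustration.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustrative values only).
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])

# PCA analyzes R as it stands: ones on the diagonal, i.e. all of the variance.
total_variance = np.trace(R)

# Principal axis factoring (assumed here) replaces the diagonal with
# communality estimates. A common initial estimate is the squared multiple
# correlation (SMC) of each variable with all the others.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)

# Only the variance shared among the variables remains to be analyzed.
common_variance = np.trace(R_reduced)

print("Variance analyzed by PCA:", total_variance)
print("Variance analyzed by FA (initial estimate):", round(common_variance, 3))
```

The off-diagonal correlations are identical in both matrices; only the diagonal changes, which is where unique and error variance are set aside.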

Page 24: Factor Analysis

Step 1: Selecting and Measuring a set of variables in a given domain

Step 2: Data screening in order to prepare the correlation matrix

Step 3: Factor Extraction

Step 4: Factor Rotation to increase interpretability

Step 5: Interpretation

Further Steps: Validation and Reliability of the measures

General Steps to FA

Page 25: Factor Analysis

A good factor:
◦ Makes sense
◦ Will be easy to interpret
◦ Simple structure
◦ Lacks complex loadings

“Good Factor”

Page 26: Factor Analysis
Page 27: Factor Analysis
Page 28: Factor Analysis
Page 29: Factor Analysis
Page 30: Factor Analysis

We are looking for an eigenvalue above 1.0.

Cumulative percent of variance explained.
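The eigenvalue rule and the cumulative percentage of variance can both be read off the eigenvalues of the correlation matrix. The following is a minimal sketch with NumPy on a made-up correlation matrix; the values and variable names are assumptions for illustration, not output from the slides.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustrative values only).
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])

# eigvalsh handles symmetric matrices and returns eigenvalues in ascending
# order, so reverse them to list the largest first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Retain components whose eigenvalue is above 1.0.
retained = int(np.sum(eigenvalues > 1.0))

# Each eigenvalue's share of the total variance, accumulated.
cumulative_pct = 100 * np.cumsum(eigenvalues) / eigenvalues.sum()

for value, pct in zip(eigenvalues, cumulative_pct):
    print(f"eigenvalue {value:.3f}  cumulative {pct:.1f}%")
print("components retained:", retained)
```

For a correlation matrix the eigenvalues sum to the number of variables, so an eigenvalue above 1.0 means the component accounts for more variance than a single original variable.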

Page 31: Factor Analysis
Page 32: Factor Analysis
Page 33: Factor Analysis
Page 34: Factor Analysis

Expensive

Exciting

Luxury

Distinctive

Not Conservative

Not Family

Not Basic

Appeals to Others

Attractive Looking

Trend Setting

Reliable

Latest Features

Trust

Page 35: Factor Analysis

What shall these components be called?

Page 36: Factor Analysis

EXCLUSIVE: Expensive, Exciting, Luxury, Distinctive, Not Conservative, Not Family, Not Basic

TRENDY: Appeals to Others, Attractive Looking, Trend Setting

RELIABLE: Reliable, Latest Features, Trust

Page 37: Factor Analysis

EXCLUSIVE = (Expensive + Exciting + Luxury + Distinctive – Conservative – Family – Basic)/7

TRENDY = (Appeals to Others + Attractive Looking + Trend Setting)/3

RELIABLE = (Reliable + Latest Features + Trust)/3

Calculate Component Scores
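The three formulas can be applied to one respondent's ratings as in the sketch below. The ratings are invented for illustration and the function name is hypothetical; only the formulas come from the slide.

```python
# Hypothetical ratings of one vehicle on the thirteen attributes.
ratings = {
    "Expensive": 6, "Exciting": 7, "Luxury": 5, "Distinctive": 6,
    "Conservative": 2, "Family": 1, "Basic": 2,
    "Appeals to Others": 6, "Attractive Looking": 7, "Trend Setting": 5,
    "Reliable": 6, "Latest Features": 7, "Trust": 5,
}


def component_scores(r):
    """Apply the slide's component formulas to a dict of attribute ratings."""
    # Conservative, Family and Basic enter with a minus sign because the
    # component is defined by "Not Conservative", "Not Family", "Not Basic".
    exclusive = (r["Expensive"] + r["Exciting"] + r["Luxury"] + r["Distinctive"]
                 - r["Conservative"] - r["Family"] - r["Basic"]) / 7
    trendy = (r["Appeals to Others"] + r["Attractive Looking"]
              + r["Trend Setting"]) / 3
    reliable = (r["Reliable"] + r["Latest Features"] + r["Trust"]) / 3
    return {"EXCLUSIVE": exclusive, "TRENDY": trendy, "RELIABLE": reliable}


scores = component_scores(ratings)
print(scores)
```

Because three items are subtracted, the EXCLUSIVE score can be negative, which is consistent with the negative values reported for some vehicles later in the deck.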

Page 38: Factor Analysis
Page 39: Factor Analysis
Page 40: Factor Analysis

Vehicle    Exclusive   Trendy   Reliable
Beetle     1.4         6.7      6.9
Hummer     3.9         6.2      6.7
Lotus      4.1         7.3      6.7
Minivan    -1.67       4.83     6.5
Pick-Up    -0.43       4.93     6.3

The vehicles differ little on the Reliable dimension.

Page 41: Factor Analysis


Page 42: Factor Analysis

[Figure: Vehicle by Component. Exclusive and Trendy component scores for Beetle, Hummer, Lotus, Minivan, and Pick-Up; score axis from -3 to 8.]

Page 43: Factor Analysis

Exploratory FA
◦ Summarizing data by grouping correlated variables
◦ Investigating sets of measured variables related to theoretical constructs
◦ Usually done near the onset of research
◦ The type of FA and PCA we are talking about here

Types of FA

Page 44: Factor Analysis

Confirmatory FA
◦ More advanced technique
◦ When factor structure is known or at least theorized
◦ Testing generalization of factor structure to new data, etc.
◦ This is tested through SEM

Types of FA

Page 45: Factor Analysis

Observed Correlation Matrix

Reproduced Correlation Matrix

Residual Correlation Matrix

Terminology
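The three matrices are related as follows: the reproduced matrix is what the retained factors imply, and the residual matrix is the observed matrix minus the reproduced one. The sketch below assumes principal-component extraction of two orthogonal factors from a made-up correlation matrix; the extraction method and all values are illustrative assumptions.

```python
import numpy as np

# Observed correlation matrix (hypothetical values for illustration).
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])

# Eigen-decomposition, sorted so the largest eigenvalue comes first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings for the first k factors: eigenvector scaled by sqrt(eigenvalue).
k = 2
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])

# Reproduced correlations implied by the k orthogonal factors.
reproduced = loadings @ loadings.T

# Residual correlations: what the k factors fail to reproduce.
residual = R - reproduced

print(np.round(reproduced, 3))
print(np.round(residual, 3))
```

Small off-diagonal residuals suggest that the retained factors reproduce the observed correlations well.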

Page 46: Factor Analysis

Orthogonal Rotation
◦ Loading Matrix – correlation between each variable and the factor

Oblique Rotation
◦ Factor Correlation Matrix – correlation between the factors
◦ Structure Matrix – correlation between factors and variables
◦ Pattern Matrix – unique relationship between each factor and variable uncontaminated by overlap between the factors

Terminology
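For an oblique solution, the three matrices are linked by a standard identity: the structure matrix equals the pattern matrix multiplied by the factor correlation matrix. The sketch below uses made-up values to show the relationship; the numbers are not from the slides.

```python
import numpy as np

# Hypothetical pattern matrix: unique contribution of each factor
# (columns) to each of four variables (rows).
pattern = np.array([
    [0.8, 0.0],
    [0.7, 0.1],
    [0.1, 0.7],
    [0.0, 0.8],
])

# Hypothetical factor correlation matrix: the two factors correlate 0.3.
phi = np.array([
    [1.0, 0.3],
    [0.3, 1.0],
])

# Structure matrix: correlations between variables and factors.
structure = pattern @ phi
print(np.round(structure, 2))

# With uncorrelated factors (identity matrix), structure and pattern
# coincide, which is why orthogonal rotation reports one loading matrix.
structure_orthogonal = pattern @ np.eye(2)
```

The first variable has a pattern loading of zero on the second factor but a non-zero structure correlation with it, purely because the two factors overlap.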

Page 47: Factor Analysis

Factor Coefficient matrix – coefficients used to calculate factor scores (like regression coefficients)

Terminology
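The regression-like role of the factor coefficient matrix can be sketched as follows. This example assumes the regression method of scoring (coefficients obtained as the inverse correlation matrix times the loadings) and principal-component loadings on simulated data; both choices, and every name in the code, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 200 subjects on four variables driven by two latent factors.
n = 200
latent = rng.normal(size=(n, 2))
true_loadings = np.array([
    [0.9, 0.0],
    [0.8, 0.1],
    [0.1, 0.8],
    [0.0, 0.9],
])
X = latent @ true_loadings.T + 0.4 * rng.normal(size=(n, 4))

# Standardize the variables and compute their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Loadings of the two largest principal components.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Factor coefficient matrix: solve R @ B = loadings, i.e. B = R^{-1} loadings.
B = np.linalg.solve(R, loadings)

# Factor scores: weighted sums of the standardized variables.
scores = Z @ B
print(np.round(B, 3))
print(scores[:3])
```

Each column of B plays the part of a set of regression weights: multiplying a subject's standardized responses by that column gives the subject's estimated score on the corresponding factor.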

Page 48: Factor Analysis

Three general goals: data reduction, describing relationships, and testing theories about relationships (next chapter)

How many interpretable factors exist in the data? or How many factors are needed to summarize the pattern of correlations?

Questions

Page 49: Factor Analysis

What does each factor mean? Interpretation?

What is the percentage of variance in the data accounted for by the factors?

Questions

Page 50: Factor Analysis

Which factors account for the most variance?

How well does the factor structure fit a given theory?

What would each subject’s score be if they could be measured directly on the factors?

Questions

Page 51: Factor Analysis

Hypotheses about factors believed to underlie a domain
◦ Should have 6 or more for stable solution

Include marker variables
◦ Pure variables – correlated with only one factor
◦ They define the factor clearly
◦ Complex variables load on more than one factor and muddy the water

Considerations (from Comrey and Lee, 1992)

Page 52: Factor Analysis

Make sure the sample chosen is spread out on possible scores on the variables and the factors being measured

Factors are known to change across samples and time points, so samples should be tested before being pooled together

Considerations (from Comrey and Lee, 1992)