exploratory factor analysis
Post on 16-Dec-2014
Exploratory factor analysis
Dr. M. Shakaib Akram
Note: Most of the material used in this lecture has been taken from “Discovering Statistics Using SPSS” by Andy Field, 3rd Ed.
What is factor analysis?
Factor analysis (and principal component analysis) is a technique for identifying groups or clusters of variables underlying a set of measures.
Those variables are called 'factors', or 'latent variables' since they are not directly observable, e.g., 'intelligence'.
A 'latent variable' is “a variable that cannot be directly measured, but is assumed to be related to several variables that can be measured.” (Glossary, p 736)
What is factor analysis used for?
Factor analysis has 3 main uses:
– To understand the structure of a set of variables, e.g., intelligence
– To construct a questionnaire to measure an underlying variable
– To reduce a large data set to a more manageable size
The most basic data basis: the R-matrix
An R-matrix is simply a correlation matrix with Pearson r-coefficients between pairs of variables as the off-diagonal elements. In factor analysis one tries to find latent variables that underlie clusters of correlations in such an R-matrix.
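As a minimal sketch of what an R-matrix looks like, the following Python snippet builds one with NumPy. The scores are made up for illustration and are not taken from the lecture; the names `data` and `r_matrix` are likewise only illustrative.

```python
import numpy as np

# Hypothetical scores of 6 people on 3 measured variables (one column each).
data = np.array([
    [4.0, 5.0, 1.0],
    [3.0, 4.0, 2.0],
    [5.0, 5.0, 1.0],
    [2.0, 1.0, 4.0],
    [1.0, 2.0, 5.0],
    [2.0, 2.0, 4.0],
])

# rowvar=False: columns are variables, rows are observations.
r_matrix = np.corrcoef(data, rowvar=False)

# Diagonal = 1 (each variable with itself); off-diagonal = Pearson r.
print(np.round(r_matrix, 3))
```

In this made-up example the first two variables correlate positively with each other and negatively with the third, which is the kind of cluster a factor analysis would try to explain.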
Example: What makes a person popular?
Six measures are taken for each person:
– Talk 1: Amount of time someone talks about the other person during a conversation
– Social Skills: How good are the person's social skills?
– Interest: How interesting does the other find that person?
– Talk 2: Amount of time someone talks about oneself during a conversation
– Selfish: How selfish is the person?
– Liar: How often does the person lie?
These measures all tap different aspects of the 'popularity' of a person. Are there a few underlying factors that can account for them?
Factor 1 = sociability
Factor 2 = consideration to others
Graphical representations of factors
Factors can be visualized as axes along which we can plot variables. The coordinates of a variable along each axis represent the strength of the relationship between that variable and each factor. In our example, we have 2 underlying factors. Each axis ranges from -1 to +1, which is the range of possible correlations r. The position of a variable depends on its correlation coefficients with the 2 factors.
[Figure: 2-D factor plot. The axes 'Sociability' and 'Consideration' both run from -1 to +1. Talk 1, Interest, and Soc Skills are plotted as one cluster; Liar, Talk 2, and Selfish as another.]
In this 2-dimensional factor plot, there are only 2 latent variables. Variables load high either on 'Sociability' or on 'Consideration to others'. With 3 factors, we would have a 3-D factor plot. With more than 3 factors, graphical factor plots are no longer available.
The coordinate of a variable along a classification axis is called its 'factor loading'. It is the Pearson correlation r between a factor and a variable.
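The statement that a loading is a correlation can be checked numerically. The sketch below assumes principal component extraction (the method used later in this lecture) and simulated data; all variable names and the simulated values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations, 4 variables driven by one common source.
common = rng.normal(size=(200, 1))
x = common @ np.array([[0.8, 0.7, 0.6, 0.5]]) + rng.normal(size=(200, 4))

# Standardize the variables and compute the R-matrix.
z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
r = np.corrcoef(z, rowvar=False)

# Principal components: eigenvectors scaled by sqrt(eigenvalue) are the loadings.
eigvals, eigvecs = np.linalg.eigh(r)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = eigvecs * np.sqrt(eigvals)

# Correlate every variable with the scores on the first component.
scores = z @ eigvecs[:, 0]
corr_with_pc1 = np.array([np.corrcoef(z[:, j], scores)[0, 1] for j in range(4)])

print(np.round(loadings[:, 0], 3))
print(np.round(corr_with_pc1, 3))
```

Both printed vectors agree: the loading of each variable on the first component equals its Pearson correlation with that component.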
Research example: The ‘SPSS-Anxiety Questionnaire' SAQ
One use of Factor Analysis is constructing questionnaires.
The SAQ measures students' anxiety towards SPSS, using 23 questions.
The questionnaire can be used to predict individuals' anxiety towards learning SPSS. Furthermore, the factor structure behind 'anxiety to use SPSS' is to be explored: which latent variables contribute to anxiety about SPSS?
The SAQ
Each item is answered on a 5-point scale: SD = Strongly disagree, D = Disagree, N = Neither, A = Agree, SA = Strongly agree
1 Statistics makes me cry
2 My friends will think I'm stupid for not being able to cope with SPSS
3 Standard deviations excite me
4 I dream that Pearson is attacking me with correlation coefficients
5 I don't understand statistics
6 I have little experience of computers
7 All computers hate me
8 I have never been good at mathematics
9 My friends are better at statistics than me
10 Computers are useful only for playing games
11 I did badly at mathematics at school
12 People try to tell you that SPSS makes statistics easier to understand but it doesn't
13 I worry that I will cause irreparable damage because of my incompetence with computers
14 Computers have minds of their own and deliberately go wrong whenever I use them
15 Computers are out to get me
16 I weep openly at the mention of central tendency
17 I slip into a coma whenever I see an equation
18 SPSS always crashes when I try to use it
19 Everybody looks at me when I use SPSS
20 I can't sleep for thoughts of eigenvectors
21 I wake up under my duvet thinking that I am trapped under a normal distribution
22 My friends are better at SPSS than I am
23 If I am good at statistics people will think I am a nerd
Initial considerations: sample size
The reliability of factor analysis depends on the sample size. As a 'rule of thumb', there should be 10-15 subjects per variable.
The stability of a factor solution depends on:
1. Absolute sample size
2. Magnitude of factor loadings (>.6)
3. Communalities (>.6; the higher the better)
The KMO*-measure is the ratio of the sum of squared correlations between variables to the sum of squared correlations plus the sum of squared partial correlations between variables. It ranges from 0 to 1. Values between .7 and .8 are good; they suggest that factor analysis is appropriate for the data.
*KMO: Kaiser-Meyer-Olkin measure of sampling adequacy
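A minimal sketch of how a KMO value can be computed from an R-matrix, using the standard definition (squared correlations relative to squared correlations plus squared partial correlations, off-diagonal cells only). The function name `kmo` and the 3-variable matrix are illustrative assumptions, not SPSS output.

```python
import numpy as np

def kmo(r):
    """Overall KMO and per-variable KMO for a correlation matrix r."""
    inv = np.linalg.inv(r)
    # Partial correlations follow from the inverse of the correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(r.shape[0], dtype=bool)      # mask for off-diagonal cells
    r2 = np.where(off, r, 0.0) ** 2
    p2 = np.where(off, partial, 0.0) ** 2
    overall = r2.sum() / (r2.sum() + p2.sum())
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    return overall, per_variable

# Hypothetical R-matrix for 3 variables.
r = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

overall, per_variable = kmo(r)
print(round(overall, 3), np.round(per_variable, 3))
```

The closer the partial correlations are to zero relative to the plain correlations, the closer KMO gets to 1.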
Data screening
The variables in the questionnaire should intercorrelate if they measure the same thing. Questions that tap the same sub-variable, e.g., worry, intrusive thoughts, or physiological arousal, should be highly correlated.
If there are questions that are not intercorrelated with others, they should not be entered into the factor analysis.
If questions correlate too highly, extreme multicollinearity or even singularity (perfectly correlated variables) results.
– Too low and too high intercorrelations should be avoided.
Finally, variables should be roughly normally distributed.
Running the analysis (using SAQ.sav)
Analyze | Data Reduction | Factor ...
To compute a principal component analysis in SPSS, select the Dimension Reduction | Factor… command from the Analyze menu.
Transfer all questions to the variables window.
Descriptives
First, mark the Univariate descriptives checkbox to get means, standard deviations, etc.
Second, keep the Initial solution checkbox to get the statistics needed to determine the number of factors to extract.
Third, mark the Coefficients checkbox to get a correlation matrix, one of the outputs needed to assess the appropriateness of factor analysis for the variables.
Fourth, mark the KMO and Bartlett’s test of sphericity checkbox to assess the appropriateness of factor analysis for the variables.
Fifth, mark the Anti-image checkbox to assess the appropriateness of factor analysis for the variables.
Sixth, click on the Continue button.
The determinant of the R-matrix should be > .00001
Extraction
First, click on the Extraction… button to specify statistics to include in the output.
The extraction method refers to the mathematical method that SPSS uses to compute the factors or components.
– Choose Principal components (other options are available)
– Analyze the correlation matrix OR the covariance matrix
– Two plots can be displayed: unrotated factors and scree plot
– Retain factors with eigenvalues over a cut-off: Kaiser's (>1) or Jolliffe's (>.7) recommendation
Rotation
Choose Varimax
Normally, 25 iterations are enough
Helps interpret the final rotated analysis
The rotation method refers to the mathematical method that SPSS uses to rotate the axes in geometric space. This makes it easier to determine which variables load on which components.
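To make the idea of rotating the axes concrete, here is a minimal sketch of a commonly used SVD-based varimax iteration. It is not SPSS's own implementation: Kaiser normalization is omitted, the function name `varimax` is an assumption, and the unrotated loadings are made up. The default of 25 iterations mirrors the setting mentioned above.

```python
import numpy as np

def varimax(loadings, max_iter=25, tol=1e-6):
    """Rotate a loading matrix orthogonally toward the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix whose SVD yields the next orthogonal rotation.
        target = loadings.T @ (
            rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        )
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                      # no further improvement
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: 6 variables, 2 components.
unrotated = np.array([
    [0.7, 0.5], [0.6, 0.6], [0.7, 0.4],
    [0.6, -0.5], [0.7, -0.6], [0.5, -0.5],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of the loadings across the components changes.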
Scores
Factor scores for each subject will be saved in the data editor
Produces matrix B with the b-values
Best method of obtaining factor scores: Anderson-Rubin
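As an illustration of how a coefficient matrix B turns standardized answers into factor scores, the sketch below uses the regression method (B = R⁻¹ × loadings) rather than Anderson-Rubin, on simulated data. The data and all names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 300 subjects, 5 variables with some built-in correlation.
x = rng.normal(size=(300, 5))
x[:, 1] += x[:, 0]
x[:, 3] += 0.8 * x[:, 2]

z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
r = np.corrcoef(z, rowvar=False)

# Principal component loadings for the 2 largest eigenvalues.
eigvals, eigvecs = np.linalg.eigh(r)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Regression method: B = R^-1 * loadings, scores = Z * B.
b = np.linalg.solve(r, loadings)
scores = z @ b

# One row of scores per subject, one column per retained component.
print(scores.shape)
print(np.round(np.cov(scores, rowvar=False), 3))
```

With principal component extraction, the scores obtained this way are already standardized and uncorrelated, which is what the printed covariance matrix shows.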
Options
Subjects with missing data for any variable are excluded
Variables are sorted by size of their factor loadings
Loadings that are too small should not be displayed
Run the Factor Analysis
Then rerun it, this time changing the rotation to an oblique rotation: choose 'Direct Oblimin'.
The output will be the same except for the rotation.
Interpreting output from SPSS
Preliminary analysis:
– data screening
– assumption testing
– sampling adequacy
'Univariate Descriptives': mean, SD, and sample size
Correlation Matrix
Selected output for Q01-Q05 and Q19-Q23 (columns); labels of questions omitted

Correlation (Pearson correlation coefficients between all pairs of variables)
        Q01    Q02    Q03    Q04    Q05    Q19    Q20    Q21    Q22    Q23
Q01   1,000  -,099  -,337   ,436   ,402  -,189   ,214   ,329  -,104  -,004
Q02   -,099  1,000   ,318  -,112  -,119   ,203  -,202  -,205   ,231   ,100
Q03   -,337   ,318  1,000  -,380  -,310   ,342  -,325  -,417   ,204   ,150
Q04    ,436  -,112  -,380  1,000   ,401  -,186   ,243   ,410  -,098  -,034
Q05    ,402  -,119  -,310   ,401  1,000  -,165   ,200   ,335  -,133  -,042
Q06    ,217  -,074  -,227   ,278   ,257  -,167   ,101   ,272  -,165  -,069
Q07    ,305  -,159  -,382   ,409   ,339  -,269   ,221   ,483  -,168  -,070
Q08    ,331  -,050  -,259   ,349   ,269  -,159   ,175   ,296  -,079  -,050
Q09   -,092   ,315   ,300  -,125  -,096   ,249  -,159  -,136   ,257   ,171
Q10    ,214  -,084  -,193   ,216   ,258  -,127   ,084   ,193  -,131  -,062
Q11    ,357  -,144  -,351   ,369   ,298  -,200   ,255   ,346  -,162  -,086
Q12    ,345  -,195  -,410   ,442   ,347  -,267   ,298   ,441  -,167  -,046
Q13    ,355  -,143  -,318   ,344   ,302  -,227   ,204   ,374  -,195  -,053
Q14    ,338  -,165  -,371   ,351   ,315  -,254   ,226   ,399  -,170  -,048
Q15    ,246  -,165  -,312   ,334   ,261  -,210   ,206   ,300  -,168  -,062
Q16    ,499  -,168  -,419   ,416   ,395  -,267   ,265   ,421  -,156  -,082
Q17    ,371  -,087  -,327   ,383   ,310  -,163   ,205   ,363  -,126  -,092
Q18    ,347  -,164  -,375   ,382   ,322  -,257   ,235   ,430  -,160  -,080
Q19   -,189   ,203   ,342  -,186  -,165  1,000  -,249  -,275   ,234   ,122
Q20    ,214  -,202  -,325   ,243   ,200  -,249  1,000   ,468  -,100  -,035
Q21    ,329  -,205  -,417   ,410   ,335  -,275   ,468  1,000  -,129  -,068
Q22   -,104   ,231   ,204  -,098  -,133   ,234  -,100  -,129  1,000   ,230
Q23   -,004   ,100   ,150  -,034  -,042   ,122  -,035  -,068   ,230  1,000

Sig. (1-tailed) (significance levels for all correlations; "-" marks the diagonal)
        Q01    Q02    Q03    Q04    Q05    Q19    Q20    Q21    Q22    Q23
Q01      -    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,410
Q02    ,000     -    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q03    ,000   ,000     -    ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q04    ,000   ,000   ,000     -    ,000   ,000   ,000   ,000   ,000   ,043
Q05    ,000   ,000   ,000   ,000     -    ,000   ,000   ,000   ,000   ,017
Q06    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q07    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q08    ,000   ,006   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,005
Q09    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q10    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,001
Q11    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q12    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,009
Q13    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,004
Q14    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,007
Q15    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,001
Q16    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q17    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q18    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000
Q19    ,000   ,000   ,000   ,000   ,000     -    ,000   ,000   ,000   ,000
Q20    ,000   ,000   ,000   ,000   ,000   ,000     -    ,000   ,000   ,039
Q21    ,000   ,000   ,000   ,000   ,000   ,000   ,000     -    ,000   ,000
Q22    ,000   ,000   ,000   ,000   ,000   ,000   ,000   ,000     -    ,000
Q23    ,410   ,000   ,000   ,043   ,017   ,000   ,039   ,000   ,000     -

a. Determinant = 5,271E-04

Note: the correlations are almost all significant!
Determinant: .0005271, which is greater than .00001. OK!
Scanning the Correlation Matrix
1. Look for variables with many low correlations (p > .05): none!
2. Then scan the correlation coefficients for values > .9: none! No problem with multicollinearity.
All questions seem to be fine!
Bartlett's test of sphericity and KMO statistics
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: ,930
Bartlett's Test of Sphericity: Approx. Chi-Square = 19334,492; df = 253; Sig. = ,000
KMO-measures > .9 are superb!
KMO is the ratio of the sum of squared correlations between variables to the sum of squared correlations plus the sum of squared partial correlations between variables.
KMO measures for individual variables are produced on the diagonal of the anti-image correlation matrix. These KMO-measures give us a hint at which variables should be excluded from the factor analysis.
Bartlett's test tests if the R-matrix is an identity matrix (a matrix with only 1's on the diagonal and 0's off the diagonal). However, we want to have correlated variables, so the off-diagonal elements should NOT be 0. Thus, the test should be significant, i.e., the R-matrix should NOT be an identity matrix.
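The chi-square value of Bartlett's test can be sketched from the determinant of the R-matrix with the standard formula χ² = -(n - 1 - (2p + 5)/6) · ln|R|, with p(p - 1)/2 degrees of freedom. The determinant (5.271E-04) and p = 23 come from the output shown here; the sample size n = 2571 is an assumption, since the slides do not state it. The function name is illustrative.

```python
import math

def bartlett_sphericity(determinant, n, p):
    """Chi-square statistic and degrees of freedom of Bartlett's test."""
    chi_square = -(n - 1 - (2 * p + 5) / 6) * math.log(determinant)
    df = p * (p - 1) // 2
    return chi_square, df

# determinant and p are taken from the SPSS output; n = 2571 is an assumption.
chi_square, df = bartlett_sphericity(5.271e-4, n=2571, p=23)
print(round(chi_square, 1), df)
```

Under that assumed sample size, the result lands close to the reported chi-square with the same 253 degrees of freedom. The smaller the determinant (the further R is from an identity matrix), the larger the statistic.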
(2nd part of the) Anti-Image Matrices
Anti-Image Correlation
[Table: anti-image correlation matrix for Q1-Q5 ... Q19-Q23]
The values on the diagonal (underlined in red on the slide) are the KMO-measures for the individual variables. They are all high.
The off-diagonal numbers are the partial correlations between variables. They should all be very small, which they are.
Factor extraction
Total Variance Explained
Columns per block: Total, % of Variance, Cumulative %

Component   Initial Eigenvalues         Extraction Sums of          Rotation Sums of
                                        Squared Loadings            Squared Loadings
 1          7,290  31,696   31,696      7,290  31,696  31,696       3,730  16,219  16,219
 2          1,739   7,560   39,256      1,739   7,560  39,256       3,340  14,523  30,742
 3          1,317   5,725   44,981      1,317   5,725  44,981       2,553  11,099  41,842
 4          1,227   5,336   50,317      1,227   5,336  50,317       1,949   8,475  50,317
 5           ,988   4,295   54,612
 6           ,895   3,893   58,504
 7           ,806   3,502   62,007
 8           ,783   3,404   65,410
 9           ,751   3,265   68,676
10           ,717   3,117   71,793
11           ,684   2,972   74,765
12           ,670   2,911   77,676
13           ,612   2,661   80,337
14           ,578   2,512   82,849
15           ,549   2,388   85,236
16           ,523   2,275   87,511
17           ,508   2,210   89,721
18           ,456   1,982   91,704
19           ,424   1,843   93,546
20           ,408   1,773   95,319
21           ,379   1,650   96,969
22           ,364   1,583   98,552
23           ,333   1,448  100,000
Extraction Method: Principal Component Analysis.

Before extraction: there are as many factors as there are variables, n = 23. Initial eigenvalues and explained variances are ordered in decreasing magnitude.
After extraction: only the 4 factors with an eigenvalue > 1 are retained (Kaiser's criterion).
After rotation: rotation (Varimax) optimizes the factor structure. The relative importance of the factors is equalized: the explained variance of the 4 factors is more similar after rotation.
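The '% of Variance' and 'Cumulative %' columns follow directly from the eigenvalues: each eigenvalue is divided by the number of variables (23). A minimal sketch using the initial eigenvalues from the table, with illustrative variable names:

```python
# Initial eigenvalues as reported in the Total Variance Explained table.
eigenvalues = [7.290, 1.739, 1.317, 1.227, 0.988, 0.895, 0.806, 0.783,
               0.751, 0.717, 0.684, 0.670, 0.612, 0.578, 0.549, 0.523,
               0.508, 0.456, 0.424, 0.408, 0.379, 0.364, 0.333]

n_variables = len(eigenvalues)   # 23 questions

# Percentage of total variance explained by each component.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running total of the percentages.
cumulative = []
running = 0.0
for pct in percent:
    running += pct
    cumulative.append(running)

# Components with an eigenvalue above 1 are retained.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1]

print(retained)
print(round(percent[0], 2), round(cumulative[3], 2))
```

With a cut-off of 1, four components are retained, and together they account for roughly half of the total variance. A more liberal cut-off of .7 would retain more components.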
Communalities
       Initial  Extraction
Q01    1,000    ,435
Q02    1,000    ,414
Q03    1,000    ,530
Q04    1,000    ,469
Q05    1,000    ,343
Q06    1,000    ,654
Q07    1,000    ,545
Q08    1,000    ,739
Q09    1,000    ,484
Q10    1,000    ,335
Q11    1,000    ,690
Q12    1,000    ,513
Q13    1,000    ,536
Q14    1,000    ,488
Q15    1,000    ,378
Q16    1,000    ,487
Q17    1,000    ,683
Q18    1,000    ,597
Q19    1,000    ,343
Q20    1,000    ,484
Q21    1,000    ,550
Q22    1,000    ,464
Q23    1,000    ,412
Extraction Method: Principal Component Analysis.

Communality is the proportion of common variance within a variable. Initially, communality is assumed to be 1 ('all variance is common'). After extraction, the true communalities can be judged better.
Before and after extraction
E.g.: 43,5% of the variance in Q1 is common, shared variance.
Before extraction, there are as many factors as there are variables, n = 23, so that all variance is explained by the factors and communality is 1. (No data reduction yet.)
After extraction, some of the factors are retained, others are dismissed. This leads to a welcome data reduction.
Now the amount of variation in each variable explained by the retained factors is the communality.
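The drop from 1 to the extracted communalities can be reproduced on a small scale: a communality is the sum of a variable's squared loadings over the retained components. The 4-variable R-matrix below is hypothetical and the names are illustrative.

```python
import numpy as np

# Hypothetical R-matrix for 4 variables.
r = np.array([[1.0, 0.6, 0.5, 0.2],
              [0.6, 1.0, 0.4, 0.1],
              [0.5, 0.4, 1.0, 0.3],
              [0.2, 0.1, 0.3, 1.0]])

# Principal component loadings, ordered by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(r)
order = np.argsort(eigvals)[::-1]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# All 4 components kept: every communality is 1 (no data reduction yet).
initial = (loadings ** 2).sum(axis=1)

# Only the first 2 components kept: communalities fall below 1.
extracted = (loadings[:, :2] ** 2).sum(axis=1)

print(np.round(initial, 3))
print(np.round(extracted, 3))
```

The second line of output plays the role of the 'Extraction' column: the share of each variable's variance that the retained components reproduce.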
Component matrix
The component matrix shows the factor loadings of each variable before rotation. SPSS has already extracted 4 components (factors).
How can we decide how many factors we should retain?
scree plot
Component Matrix (components 1-4)
Displayed loadings per variable, listed left to right; suppressed cells are blank in the output, so the component column of each value is not shown here.
Q18   ,701
Q07   ,685
Q16   ,679
Q13   ,673
Q12   ,669
Q21   ,658
Q14   ,656
Q11   ,652  -,400
Q17   ,643
Q04   ,634
Q03  -,629
Q15   ,593
Q01   ,586
Q05   ,556
Q08   ,549   ,401  -,417
Q10   ,437
Q20   ,436  -,404
Q19  -,427
Q09   ,627
Q02   ,548
Q22   ,465
Q06   ,562   ,571
Q23   ,507
Extraction Method: Principal Component Analysis.
a. 4 components extracted.

Loadings < .3 are suppressed, hence the blank spaces.
Before rotation, most variables loaded highest on the first factor, which can therefore explain a high amount of variation (31,7%).
Scree plot
After 2 or after 4 factors, the curve inflects.
Since we have a huge sample, the eigenvalue > 1 criterion can still be interpreted well, so retaining 4 factors is justified.
However, it is also possible to retain just 2.
Rotated component matrix: orthogonal rotation
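Rotated Component Matrix
Displayed loadings per variable, with the component each loading belongs to:
Q06   ,800 (component 1)
Q18   ,684 (component 1)
Q13   ,647 (component 1)
Q07   ,638 (component 1)
Q14   ,579 (component 1)
Q10   ,550 (component 1)
Q15   ,459 (component 1)
Q20   ,677 (component 2)
Q21   ,661 (component 2)
Q03  -,567 (component 2)
Q12   ,473 (component 1), ,523 (component 2)
Q04   ,516 (component 2)
Q16   ,514 (component 2)
Q01   ,496 (component 2)
Q05   ,429 (component 2)
Q08   ,833 (component 3)
Q17   ,747 (component 3)
Q11   ,747 (component 3)
Q09   ,648 (component 4)
Q22   ,645 (component 4)
Q23   ,586 (component 4)
Q02   ,543 (component 4)
Q19   ,427 (component 4)
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 9 iterations.

The rotated component matrix contains the same kind of information as the component matrix, except that it is calculated after orthogonal rotation (here with Varimax).
Loadings < .3 are suppressed, hence the blank spaces.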
Component vs. Rotated Component Matrix
Before rotation, most Qs loaded highly on the first extracted factor and much lower on the following ones.
After rotation, all 4 extracted factors have a couple of Qs loading highly on them.
Q12 loads equally high on factors 1 and 2!
Q12: People try to tell you that SPSS makes statistics easier to understand but it doesn't
Looking at the content of the Qs:
In order to interpret the factors, we have to look at the content of the Qs that load highly on them:
Factor 1: 'Fear of computers'
Q06  .800  I have little experience of computers
Q18  .684  SPSS always crashes when I try to use it
Q13  .647  I worry that I will cause irreparable damage because of my incompetence with computers
Q07  .638  All computers hate me
Q14  .579  Computers have minds of their own and deliberately go wrong whenever I use them
Q10  .550  Computers are useful only for playing games
Q15  .459  Computers are out to get me
Factor 2: 'Fear of statistics'
Q20  .677  I can't sleep for thoughts of eigenvectors
Q21  .661  I wake up under my duvet thinking that I am trapped under a normal distribution
Q03  -.567  Standard deviations excite me
Q12  .523 (and .473 on Factor 1)  People try to tell you that SPSS makes statistics easier to understand but it doesn't
Q04  .516  I dream that Pearson is attacking me with correlation coefficients
Q16  .514  I weep openly at the mention of central tendency
Q01  .496  Statistics makes me cry
Q05  .429  I don't understand statistics
Factor 3: 'Fear of mathematics'
Q08  .833  I have never been good at mathematics
Q17  .747  I slip into a coma whenever I see an equation
Q11  .747  I did badly at mathematics at school
Factor 4: 'Peer evaluation'
Q09  .648  My friends are better at statistics than me
Q22  .645  My friends are better at SPSS than me
Q23  .586  If I am good at statistics my friends will think I'm a nerd
Q02  .543  My friends will think I'm stupid for not being able to cope with SPSS
Q19  .427  Everybody looks at me when I use SPSS
4 subscales of the SAQ
Now the question arises whether
1. the SAQ does not measure what it says ('SPSS anxiety') but some related constructs, or
2. these four constructs are sub-components of SPSS anxiety.
The factor analysis does not tell us.
Factor  Subscale of SAQ
1  Fear of computers
2  Fear of statistics
3  Fear of mathematics
4  Fear of negative peer evaluation
Oblique rotation
While in orthogonal rotation we have only one matrix, the factor matrix, in oblique rotation the factor matrix is split up into the pattern matrix and the structure matrix.
Pattern matrix
– contains the factor loadings and is interpreted like the factor matrix
– is easier to interpret
– should be reported
Structure matrix
– takes into account the relationship between factors
– should be used as a check on the pattern matrix
– should also be reported
Oblique rotation – pattern matrix
Pattern Matrix (components 1-4)
Variables, in the sorted order of the table:
Q20 I can't sleep for thoughts of eigenvectors
Q21 I wake up under my duvet thinking that I am trapped under a normal distribution
Q03 Standard deviations excite me
Q04 I dream that Pearson is attacking me with correlation coefficients
Q16 I weep openly at the mention of central tendency
Q01 Statistics makes me cry
Q05 I don't understand statistics
Q22 My friends are better at SPSS than I am
Q09 My friends are better at statistics than me
Q23 If I'm good at statistics my friends will think I'm a nerd
Q02 My friends will think I'm stupid for not being able to cope with SPSS
Q19 Everybody looks at me when I use SPSS
Q06 I have little experience of computers
Q18 SPSS always crashes when I try to use it
Q07 All computers hate me
Q13 I worry that I will cause irreparable damage because of my incompetence with computers
Q14 Computers have minds of their own and deliberately go wrong whenever I use them
Q10 Computers are useful only for playing games
Q12 People try to tell you that SPSS makes statistics easier to understand but it doesn't
Q15 Computers are out to get me
Q08 I have never been good at mathematics
Q17 I slip into a coma whenever I see an equation
Q11 I did badly at mathematics at school
Displayed loadings, top to bottom as they appear in the output (suppressed cells are blank, so fewer loadings than variables are shown):
,706; ,591; -,511; ,405; ,400; ,643; ,621; ,615; ,507; ,885; ,713; ,653; ,650; ,588; ,585; ,412 and ,462 (same row); ,411; -,902; -,774; -,774
Extraction Method: Principal Component Analysis. Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 29 iterations.

The pattern matrix gives us the unique contribution of a variable to a factor.
The same 4 patterns seem to have emerged:
F1: 'Fear of statistics'
F2: 'Fear of peer evaluation'
F3: 'Fear of computers'
F4: 'Fear of mathematics'
Oblique rotation – structure matrix
Structure Matrix (components 1-4)
Displayed loadings per variable, listed left to right; suppressed cells are blank in the output, so the component column of each value is not shown here.
Q21  ,695  ,477  I wake up under my duvet thinking that I am trapped under a normal distribution
Q20  ,685  I can't sleep for thoughts of eigenvectors
Q03  -,632  -,407  Standard deviations excite me
Q16  ,567  ,516  -,491  I weep openly at the mention of central tendency
Q04  ,548  ,487  -,485  I dream that Pearson is attacking me with correlation coefficients
Q01  ,520  ,413  -,501  Statistics makes me cry
Q05  ,462  ,453  I don't understand statistics
Q22  ,660  My friends are better at SPSS than I am
Q09  ,653  My friends are better at statistics than me
Q23  ,588  If I'm good at statistics my friends will think I'm a nerd
Q02  ,546  My friends will think I'm stupid for not being able to cope with SPSS
Q19  -,435  ,446  Everybody looks at me when I use SPSS
Q06  ,777  I have little experience of computers
Q18  ,404  ,761  SPSS always crashes when I try to use it
Q07  ,401  ,723  All computers hate me
Q13  ,723  -,429  I worry that I will cause irreparable damage because of my incompetence with computers
Q14  ,426  ,671  Computers have minds of their own and deliberately go wrong whenever I use them
Q12  ,576  ,606  People try to tell you that SPSS makes statistics easier to understand but it doesn't
Q15  ,561  -,441  Computers are out to get me
Q10  ,556  Computers are useful only for playing games
Q08  -,855  I have never been good at mathematics
Q17  ,453  -,822  I slip into a coma whenever I see an equation
Q11  ,451  -,818  I did badly at mathematics at school
Extraction Method: Principal Component Analysis. Rotation Method: Oblimin with Kaiser Normalization.

In the structure matrix, the shared variance is not ignored. Now several variables load highly onto more than 1 factor.
Factors 1 and 3, 'fear of statistics' and 'fear of computers', go together. Also F4 'fear of math' is related.
Factors 3 and 4, 'fear of computers' and 'fear of math', go together.
Note: Factor 3 'fear of computers' appears twice, each time together with a different factor.
Oblique rotation: Component correlation matrix
Component Correlation Matrix
Component       1          2          3          4
1           1,000      -,154       ,364      -,279
2           -,154      1,000      -,185  8,155E-02
3            ,364      -,185      1,000      -,464
4           -,279  8,155E-02      -,464      1,000
Extraction Method: Principal Component Analysis. Rotation Method: Oblimin with Kaiser Normalization.

The Component Correlation Matrix contains the correlation coefficients between factors.
F2 'fear of peer evaluation' has little relation with the others, but F1, F3, and F4 ('fear of stats, computers, and maths') are somewhat interrelated.
Independence of factors cannot be upheld, given the correlations between the factors and also the content of the factors: 'fear of stats, computers, and maths' all have a similar meaning. Oblique rotation is more sensible.
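The link between the two oblique matrices can be sketched with the standard relation structure = pattern × factor correlations. The correlation matrix `phi` below is taken from the Component Correlation Matrix of the SPSS output; the two rows of pattern loadings are hypothetical values chosen only to illustrate the effect.

```python
import numpy as np

# Component correlation matrix from the SPSS output.
phi = np.array([[1.000, -0.154, 0.364, -0.279],
                [-0.154, 1.000, -0.185, 0.08155],
                [0.364, -0.185, 1.000, -0.464],
                [-0.279, 0.08155, -0.464, 1.000]])

# Hypothetical pattern loadings: two variables (rows) on 4 factors (columns).
pattern = np.array([[0.70, 0.00, 0.00, 0.00],
                    [0.00, 0.00, 0.00, -0.90]])

# Structure loadings also carry the variance shared through correlated factors.
structure = pattern @ phi
print(np.round(structure, 3))

# With uncorrelated factors (identity matrix) both matrices would coincide.
structure_orthogonal = pattern @ np.eye(4)
print(np.allclose(structure_orthogonal, pattern))
```

A variable that loads on a single factor in the pattern matrix picks up additional structure loadings on every factor correlated with it, which is why more cross-loadings appear in the structure matrix.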
Factors – statistically and conceptually
The Factor Analysis has extracted 4 factors, 3 of which are correlated with each other, one of which is rather independent. An oblique rotation is more sensible given the interrelation between 3 factors.
How does that match the interpretation of the factors?
The three correlated factors
– fear of stats
– fear of math
– fear of computers
are also conceptually closely related, whereas the 4th factor, 'fear of negative peer evaluation', being socially based, is also conceptually different.
Hence, the statistics and the meaning of the factors go along with each other rather nicely.
Interim summary
Four factors underlie the SAQ, which we can identify as fear of
– stats
– maths
– computers
– peer evaluation
Oblique rotation is to be preferred since three of the four factors are interrelated, statistically as well as conceptually.
The use of factor analysis here is purely exploratory. It helps you understand what factors underlie large data sets.
Informed decisions may follow from such an exploratory factor analysis, e.g., with respect to working out a better questionnaire.