Data-pipeline using ALSPAC data
Contents
• Introduction to ALSPAC
– Description of the measures
• Preparing my data for the pipeline
– The pipeline (in Stata)
• Summarize / Codebook
• Polychoric correlations
• Polychoric PCA
• Loevinger’s H
• Mokken Scale Procedure
• Options for SPSS users
Day 2: Contents
• Introduction to Psychometrics:
– Item Response Theory in Stata
• Non-parametric procedures:
– Mokken, Description of the measures
• Parametric models
– Single parameter logistic model (Rasch)
– Two parameter logistic (Lord-Birnbaum)
• An R – 2 – Detour (a detour to R)
• Running psychometric analyses from Stata
• From Data to PAPER
– Automated IRT analyses that yield publication quality graphics
– Connections from Stata
What is ALSPAC?
• “Avon Longitudinal Study of Parents and Children” AKA Children of the Nineties
• Cohort study of ~14,000 children and their parents, based in south-west England
• Eligibility criteria: Mothers had to be resident in Avon and have an expected date of delivery between April 1st 1991 and December 31st 1992
• Population based prospective cohort study
Where’s Avon to, my luvver? (trans: Where is Avon?)
The county of Avon
• 1) A nice short name
• 2) Known for its “ladies”
• 3) Replaced in 1996 with
– Bristol
– North Somerset
– Bath and North East Somerset
– South Gloucestershire
– Collectively known as “CUBA” (Counties which Used to Be Avon)
What data does ALSPAC have?
• Self completion questionnaires
– Mothers, Partners, Children, Teachers
• Hands on assessments
– 10% sample tested regularly since birth
– Yearly clinics for all since age 7
• Data from external sources
– SATS from LEA, Child Health database
• Biological samples
– DNA / cell lines
Today’s Measures 1 - MFQ
• Moods and Feelings Questionnaire
• Angold and Costello (1987). Mood and feelings questionnaire (MFQ). Durham: Duke University, Developmental Epidemiology Program.
• Short version, 13 items
• Parental response at 13 years [Questionnaire]
• Child response at 14 years [Clinic, computer]
Today’s Measures 2 - EAS
• EAS Temperament Survey (Parental Ratings)
• Buss and Plomin (1984). A temperament theory of personality development. New York: John Wiley.
• 20 questions
• 4 subscales: Emotionality, Activity, Shyness & Sociability
• Parental response at 4 years [Questionnaire]
Rename variables for clarity and consistency
gen mum01_012 = ta5020
gen mum02_012 = ta5021
gen mum03_012 = ta5022
gen mum04_012 = ta5023
gen mum05_012 = ta5024
gen mum06_012 = ta5025
gen mum07_012 = ta5026
gen mum08_012 = ta5027
gen mum09_012 = ta5028
gen mum10_012 = ta5029
gen mum11_012 = ta5030
gen mum12_012 = ta5031
gen mum13_012 = ta5032
gen kid01_012 = fg6410
gen kid02_012 = fg6412
gen kid03_012 = fg6413
gen kid04_012 = fg6414
gen kid05_012 = fg6415
gen kid06_012 = fg6416
gen kid07_012 = fg6418
gen kid08_012 = fg6419
gen kid09_012 = fg6421
gen kid10_012 = fg6422
gen kid11_012 = fg6423
gen kid12_012 = fg6424
gen kid13_012 = fg6425
ta5020 ~ fg6410 or ta5027 ~ fg6419???
mum01_012 ~ kid01_012 and mum08_012 ~ kid08_012
Derive binary variables
recode *_012 (3=0)(2=1)(1=2)
foreach x in "mum01" "mum02" "mum03" "mum04" "mum05" "mum06" ///
             "mum07" "mum08" "mum09" "mum10" "mum11" "mum12" "mum13" ///
             "kid01" "kid02" "kid03" "kid04" "kid05" "kid06" "kid07" "kid08" ///
             "kid09" "kid10" "kid11" "kid12" "kid13" {
    gen `x'_001 = `x'_012
    recode `x'_001 (0=0)(1=0)(2=1)
    gen `x'_011 = `x'_012
    recode `x'_011 (0=0)(1=1)(2=1)
}
Resulting variables for each item, e.g. mum01_012, mum01_001, mum01_011
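The recode rules above can be checked outside Stata with a small Python sketch. The function names and the three example responses are illustrative assumptions, not part of the ALSPAC pipeline; only the cut-points are taken from the recode statements.

```python
# Hypothetical helper names; the mappings mirror the Stata recode rules above.

def reverse_321(raw):
    """Raw coding 3/2/1 -> 0/1/2, as in: recode *_012 (3=0)(2=1)(1=2)."""
    return {3: 0, 2: 1, 1: 2}[raw]

def collapse_001(score):
    """0/1/2 -> 0/0/1: only 'True' counts as endorsing the item."""
    return 1 if score == 2 else 0

def collapse_011(score):
    """0/1/2 -> 0/1/1: 'Sometimes true' or 'True' counts as endorsing the item."""
    return 1 if score >= 1 else 0

raw_responses = [3, 2, 1]  # made-up responses, one per raw category
scored_012 = [reverse_321(r) for r in raw_responses]
binary_001 = [collapse_001(s) for s in scored_012]
binary_011 = [collapse_011(s) for s in scored_012]
print(scored_012, binary_001, binary_011)
```

The two binary versions differ only in where the single threshold sits, which is why the later analyses are repeated for *_012, *_011 and *_001.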
Variable labels
foreach var of varlist *01_* {
    label variable `var' "Felt miserable/unhappy [`var']"
}
foreach var of varlist *02_* {
    label variable `var' "Didnt enjoy anything at all [`var']"
}
foreach var of varlist *03_* {
    label variable `var' "Felt so tired they just sat around & did nothing [`var']"
}
foreach var of varlist *04_* {
    label variable `var' "Was restless [`var']"
}
Etc.
Value Labels
foreach var of varlist *_012 {
    label define `var'_lab 0 "Not true" 1 "Sometimes true" 2 "True"
    label values `var' `var'_lab
}
foreach var of varlist *_011 {
    label define `var'_lab 0 "Not true" 1 "Sometimes true / True"
    label values `var' `var'_lab
}
foreach var of varlist *_001 {
    label define `var'_lab 0 "Sometimes true / not true" 1 "True"
    label values `var' `var'_lab
}
log using "mfq_dataprep.log", replace
foreach x in "mum" "kid" {
    su `x'*_012
    codebook `x'*_012
    loevH `x'*_012
    polychoric `x'*_012
    polychoricpca `x'*_012
    msp `x'*_012
}
log close
Repeat with *_011 and *_001
Typical data-pipeline syntax
summarize / codebook
su emo_*_01234
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
emo_l_02_0~4 |      9467    1.564276     .806012          0          4
emo_l_06_0~4 |      9445      1.7081    .8448107          0          4
emo_l_11_0~4 |      9448    1.274238    .8241389          0          4
emo_l_15_0~4 |      9431    1.613933    .8029195          0          4
emo_l_19_0~4 |      9342    1.594198    1.008401          0          4
codebook emo_*_01234
-----------------------------------------------------------------------
emo_l_02_01234                      Child cries easily [emo_l_02_01234]
-----------------------------------------------------------------------
          type:  numeric (float)
         label:  emo_l_02_01234_lab
         range:  [0,4]                 units:  1
 unique values:  5                 missing .:  5196/14663
    tabulation:  Freq.   Numeric  Label
                   761         0  E-Like
                  3620         1  Q-like
                  4202         2  S-like
                   751         3  NM-Like
                   133         4  NAA-Like
                  5196         .
-----------------------------------------------------------------------
emo_l_06_01234    Child tends to be somewhat emotional [emo_l_06_01234]
-----------------------------------------------------------------------
          type:  numeric (float)
         label:  emo_l_06_01234_lab
         range:  [0,4]                 units:  1
 unique values:  5                 missing .:  5218/14663
    tabulation:  Freq.   Numeric  Label
                   632         0  E-Like
                  3018         1  Q-like
                  4507         2  S-like
                  1051         3  NM-Like
                   237         4  NAA-Like
                  5218         .
-----------------------------------------------------------------------
emo_l_11_01234            Child often fusses and cries [emo_l_11_01234]
-----------------------------------------------------------------------
          type:  numeric (float)
         label:  emo_l_11_01234_lab
         range:  [0,4]                 units:  1
 unique values:  5                 missing .:  5215/14663
    tabulation:  Freq.   Numeric  Label
                  1538         0  E-Like
                  4420         1  Q-like
                  2942         2  S-like
                   457         3  NM-Like
                    91         4  NAA-Like
                  5215         .
-----------------------------------------------------------------------
emo_l_15_01234                 Child gets upset easily [emo_l_15_01234]
-----------------------------------------------------------------------
          type:  numeric (float)
         label:  emo_l_15_01234_lab
         range:  [0,4]                 units:  1
 unique values:  5                 missing .:  5232/14663
    tabulation:  Freq.   Numeric  Label
                   559         0  E-Like
                  3689         1  Q-like
                  4214         2  S-like
                   772         3  NM-Like
                   197         4  NAA-Like
                  5232         .
-----------------------------------------------------------------------
emo_l_19_01234       Child reacts intensely when upset [emo_l_19_01234]
-----------------------------------------------------------------------
          type:  numeric (float)
         label:  emo_l_19_01234_lab
         range:  [0,4]                 units:  1
 unique values:  5                 missing .:  5321/14663
    tabulation:  Freq.   Numeric  Label
                  1329         0  E-Like
                  3038         1  Q-like
                  3459         2  S-like
                  1127         3  NM-Like
                   389         4  NAA-Like
                  5321         .
Multihist
pause on
foreach x in "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" "13" {
    multihist *`x'_012
    pause
}
pause off
• Compare response to same questions at different times
• Big differences would suggest an error in previous code
- reversal of responses
- change to order of questions asked
- change to response options (aargh!)
[Figure: six histogram panels for "Felt miserable/unhappy"; y-axis: Frequency; x-axis: Not true / Sometimes / True. Panels: ta01_012 (n=6723/14663), kw01_012 (n=7019/14663), ku01_012 (n=7733/14663), fg01_012 (n=5753/14663), ff01_012 (n=6396/14663), fd01_012 (n=7033/14663)]
Multihist for first item of MFQ (6 repeat measures)
Polychoric Correlations
Correlation -v- regression coefficient
Correlation coefficient:
The interdependence between pairs of variables i.e. the extent to which values of the variable change together
The strength and direction of the linear relationship
A fatter ellipse will result in a greater degree of scatter for a regression line of a given gradient, and a lower correlation
Polychoric Correlation - Assumptions
• A binary or categorical variable is the observed (or manifest) part of an underlying (or latent) continuous variable
• Here we’ll also assume that latent variables are normally distributed
• THRESHOLD relates the manifest to the latent variable
• Uebersax link: http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm
Thresholds
Figure from Uebersax webpage
2 binary variables. tab mum01_001 mum04_001
                  Felt |     Was restless
      miserable/unhappy |      [mum04_001]
            [mum01_001] |   ST / NT       True |     Total
------------------------+----------------------+----------
                ST / NT |     6,343         78 |     6,421
                   True |       234         54 |       288
------------------------+----------------------+----------
                  Total |     6,577        132 |     6,709
This is all we see, however ….
… this is what we assume is going on
Figure from Uebersax webpage
What we are really interested in is the correlation (r) between the continuous latent variables
Computer algorithm used to search for a correlation r and thresholds t1 and t2 which best reproduce the cell counts of the 2x2 table
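As a rough illustration of that search, here is a minimal Python sketch for the 2x2 table of mum01_001 by mum04_001. It is a two-step approximation under stated assumptions: the thresholds t1 and t2 are fixed from the margins, and r is then found by bisection so that the bivariate normal model reproduces the "True / True" cell. This is not necessarily the algorithm used by the Stata polychoric command, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

# Cell counts from the table above (rows: mum01_001, columns: mum04_001).
n00, n01, n10, n11 = 6343, 78, 234, 54
n = n00 + n01 + n10 + n11

std_normal = NormalDist()
# Thresholds: latent cut-points below which a respondent scores 0.
t1 = std_normal.inv_cdf((n00 + n01) / n)  # mum01_001
t2 = std_normal.inv_cdf((n00 + n10) / n)  # mum04_001

def bvn_density(x, y, rho):
    """Standard bivariate normal density with correlation rho."""
    one_minus = 1.0 - rho * rho
    q = (x * x - 2.0 * rho * x * y + y * y) / (2.0 * one_minus)
    return math.exp(-q) / (2.0 * math.pi * math.sqrt(one_minus))

def prob_both_true(rho, steps=400):
    """P(both latent variables exceed their thresholds) for correlation rho.

    Uses the identity dP/d(rho) = bivariate density at (t1, t2), integrated
    from 0 to rho with Simpson's rule and added to the independence value.
    """
    base = (1.0 - std_normal.cdf(t1)) * (1.0 - std_normal.cdf(t2))
    h = rho / steps
    total = bvn_density(t1, t2, 0.0) + bvn_density(t1, t2, rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * bvn_density(t1, t2, i * h)
    return base + total * h / 3.0

# Bisection: the probability increases with rho, so the root is bracketed.
target = n11 / n
lo, hi = 0.0, 0.99
for _ in range(60):
    mid = (lo + hi) / 2.0
    if prob_both_true(mid) < target:
        lo = mid
    else:
        hi = mid
r_tetra = (lo + hi) / 2.0

# Pearson (phi) correlation of the observed 0/1 scores, for comparison.
phi = (n00 * n11 - n01 * n10) / math.sqrt(
    (n00 + n01) * (n10 + n11) * (n00 + n10) * (n01 + n11))

print(round(t1, 2), round(t2, 2), round(r_tetra, 2), round(phi, 2))
```

With these counts the latent correlation comes out well above the Pearson correlation of the binary scores, which is the pattern the later slides compare.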
Poly / tetra
• Tetrachoric
– Special case where both variables are binary
• Polychoric
– More general (any categorical variable)
• Bi/Polyserial
– One continuous and one categorical variable
Poly versus standard correlations
foreach x in "emo_l_02" "emo_l_06" "emo_l_11" "emo_l_15" "emo_l_19" {
    gen `x'_00001 = `x'_01234
    recode `x'_00001 (0=0)(1=0)(2=0)(3=0)(4=1)
    gen `x'_00011 = `x'_01234
    recode `x'_00011 (0=0)(1=0)(2=0)(3=1)(4=1)
    gen `x'_00111 = `x'_01234
    recode `x'_00111 (0=0)(1=0)(2=1)(3=1)(4=1)
    gen `x'_01111 = `x'_01234
    recode `x'_01111 (0=0)(1=1)(2=1)(3=1)(4=1)
    gen `x'_01122 = `x'_01234
    recode `x'_01122 (0=0)(1=1)(2=1)(3=2)(4=2)
    gen `x'_00123 = `x'_01234
    recode `x'_00123 (0=0)(1=0)(2=1)(3=2)(4=3)
}
log using "eas_dataprep_poly_corr.log", replace
foreach x in "emo_*_00001" "emo_*_00011" "emo_*_00111" ///
             "emo_*_01111" "emo_*_01122" "emo_*_00123" "emo_*_01234" {
    corr `x'
    polychoric `x'
}
log close
Polychoric Correlation Matrix (01234)
           emo_l_02  emo_l_06  emo_l_11  emo_l_15  emo_l_19
emo_l_02      1.000
emo_l_06      0.592     1.000
emo_l_11      0.688     0.579     1.000
emo_l_15      0.762     0.656     0.702     1.000
emo_l_19      0.433     0.508     0.505     0.521     1.000
Standard Correlation Matrix (01234)
           emo_l_02  emo_l_06  emo_l_11  emo_l_15  emo_l_19
emo_l_02      1.000
emo_l_06      0.523     1.000
emo_l_11      0.606     0.510     1.000
emo_l_15      0.674     0.581     0.618     1.000
emo_l_19      0.386     0.456     0.450     0.465     1.000
[Figure: Correlation (y-axis, 0.2 to 0.8) by coding scheme (x-axis: 00001, 00011, 00111, 01111, 01122, 00123, 01234). Series: [P] emo1, emo2; [P] emo2, emo3; [C] emo1, emo2; [C] emo2, emo3]
Poly versus standard correlations
• Polychoric correlations are higher than the corresponding Pearson correlations (here, for every item pair shown)
• Polychoric correlations are more robust to changes in the number of categories
• For polychoric in Stata, if the number of categories is greater than 10, the variable is treated as if continuous, so the correlation of two variables that have 10 categories each would be simply the usual Pearson moment correlation found through correlate.
Polychoric PCA
Polychoric PCA
• Performs PCA on the polychoric correlation matrix
• Produces eigenvectors, eigenvalues, and the correlation matrix as with standard PCA
PCA v Polychoric PCA, mum MFQ
PCA Polychoric PCA
Component Eigenvalue Cum. explained Eigenvalues Cum. explained
Comp1 5.554 0.427 8.340 0.642
Comp2 1.206 0.520 1.006 0.719
Comp3 0.826 0.584 0.698 0.773
Comp4 0.719 0.639 0.518 0.812
Comp5 0.707 0.693 0.466 0.848
Comp6 0.646 0.743 0.397 0.879
Comp7 0.610 0.790 0.364 0.907
Comp8 0.547 0.832 0.281 0.928
Comp9 0.526 0.872 0.256 0.948
Comp10 0.500 0.911 0.247 0.967
Comp11 0.426 0.944 0.181 0.981
Comp12 0.383 0.973 0.132 0.991
Comp13 0.350 1.000 0.113 1.000
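The "Cum. explained" columns follow directly from the eigenvalues: each is the running sum of eigenvalues divided by their total (13, the number of items, up to rounding). A minimal Python sketch, with the eigenvalues copied from the table above and a hypothetical function name:

```python
# Eigenvalues copied from the table above (mum MFQ, 13 items).
pca_eigenvalues = [5.554, 1.206, 0.826, 0.719, 0.707, 0.646, 0.610,
                   0.547, 0.526, 0.500, 0.426, 0.383, 0.350]
poly_eigenvalues = [8.340, 1.006, 0.698, 0.518, 0.466, 0.397, 0.364,
                    0.281, 0.256, 0.247, 0.181, 0.132, 0.113]

def cumulative_proportion(eigenvalues):
    """Running share of total variance explained by the leading components."""
    total = sum(eigenvalues)
    running = 0.0
    shares = []
    for value in eigenvalues:
        running += value
        shares.append(running / total)
    return shares

pca_cum = cumulative_proportion(pca_eigenvalues)
poly_cum = cumulative_proportion(poly_eigenvalues)
for k in range(3):
    print(f"Comp{k + 1}: PCA {pca_cum[k]:.3f}  polychoric PCA {poly_cum[k]:.3f}")
```

The first polychoric component takes a visibly larger share than the first standard component, in line with the table.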
PCA Polychoric PCA
Variable Comp1 Comp2 Comp3 e1 e2 e3
mum01_012 0.270 0.286 0.297 0.279 0.121 -0.385
mum02_012 0.274 0.225 0.225 0.276 0.215 -0.219
mum03_012 0.178 0.535 0.236 0.186 0.657 -0.253
mum04_012 0.208 0.454 -0.434 0.217 0.461 0.510
mum05_012 0.328 -0.189 -0.067 0.314 -0.122 0.087
mum06_012 0.248 -0.027 0.511 0.266 -0.005 -0.382
mum07_012 0.254 0.278 -0.411 0.255 0.236 0.432
mum08_012 0.317 -0.286 0.019 0.315 -0.166 -0.005
mum09_012 0.287 -0.320 -0.015 0.302 -0.229 0.034
mum10_012 0.282 -0.031 0.161 0.279 -0.110 -0.161
mum11_012 0.306 -0.206 0.073 0.300 -0.206 -0.083
mum12_012 0.300 -0.164 -0.328 0.287 -0.237 0.279
mum13_012 0.312 -0.085 -0.209 0.299 -0.182 0.164
PCA v Polychoric PCA, EAS
PCA Polychoric PCA
Component Eigenvalue Cum. explained Eigenvalues Cum. explained
Comp1 5.207 0.260 6.099 0.305
Comp2 3.080 0.414 3.363 0.473
Comp3 1.656 0.497 1.766 0.561
Comp4 1.283 0.561 1.311 0.627
Comp5 1.054 0.614 1.058 0.680
Comp6 0.943 0.661 0.918 0.726
Comp7 0.725 0.697 0.648 0.758
Comp8 0.669 0.731 0.590 0.788
Comp9 0.640 0.763 0.567 0.816
Comp10 0.575 0.792 0.495 0.841
Comp11 0.560 0.820 0.490 0.865
Comp12 0.518 0.845 0.447 0.888
Comp13 0.494 0.870 0.424 0.909
Etc.
PCA Polychoric PCA
Variable Comp1 Comp2 Comp3 Comp4 Comp5 e1 e2 e3 e4 e5
act_l_04_01234 -0.250 0.160 0.383 -0.144 -0.126 -0.264 0.163 0.386 -0.148 -0.144
act_l_07_01234 -0.187 -0.009 0.300 -0.179 0.173 -0.205 -0.019 0.296 -0.168 0.176
act_l_09_01234 -0.237 0.139 0.366 -0.186 -0.215 -0.236 0.135 0.341 -0.171 -0.222
act_l_13_01234 -0.288 0.126 0.353 -0.204 -0.071 -0.299 0.126 0.354 -0.203 -0.103
act_l_17_01234 -0.229 0.074 0.187 0.043 0.195 -0.230 0.066 0.181 0.041 0.244
emo_l_02_01234 0.181 0.377 -0.036 -0.141 0.287 0.172 0.379 -0.029 -0.147 0.295
emo_l_06_01234 0.156 0.387 -0.047 -0.088 0.070 0.147 0.388 -0.044 -0.094 0.072
emo_l_11_01234 0.182 0.383 -0.043 -0.132 0.125 0.175 0.385 -0.041 -0.137 0.135
emo_l_15_01234 0.205 0.386 -0.010 -0.156 0.216 0.196 0.388 -0.003 -0.163 0.217
emo_l_19_01234 0.132 0.360 -0.008 -0.024 -0.228 0.124 0.359 -0.008 -0.023 -0.232
shy_l_01_01234 0.258 -0.032 0.286 0.232 0.140 0.255 -0.025 0.293 0.225 0.111
shy_l_08_01234 0.289 -0.054 0.177 -0.006 -0.319 0.286 -0.046 0.178 0.006 -0.273
shy_l_12_01234 0.324 -0.084 0.197 0.009 -0.136 0.321 -0.074 0.204 0.014 -0.089
shy_l_14_01234 0.254 -0.044 0.400 0.238 0.109 0.246 -0.038 0.404 0.236 0.085
shy_l_20_01234 0.204 -0.116 0.369 0.270 0.319 0.195 -0.112 0.380 0.263 0.305
soc_l_03_01234 -0.246 0.161 -0.047 0.229 0.100 -0.255 0.162 -0.052 0.225 0.079
soc_l_05_01234 -0.153 0.219 0.030 0.511 0.020 -0.155 0.221 0.022 0.517 0.071
soc_l_10_01234 -0.205 0.211 -0.031 0.294 -0.196 -0.204 0.210 -0.042 0.303 -0.191
soc_l_16_01234 -0.270 0.038 -0.110 0.234 0.412 -0.270 0.025 -0.111 0.227 0.427
soc_l_18_01234 0.045 0.265 -0.018 0.401 -0.438 0.050 0.267 -0.031 0.406 -0.438
Assumptions of PCA/FA
• Items can be regarded as parallel (same frequency distribution)
• PCA/FA not always appropriate when items differ in their frequency distribution such as when items have differing levels of difficulty
• Alternative methods may be more appropriate…. find out tomorrow
Loevinger’s H
Coefficient of Homogeneity
[Figure: Item Response Function; y-axis: increasing probability of endorsing item; x-axis: increasing level of latent trait]
Non-parametric
• No fixed functional form is imposed on the relationship between the trait and the probability of a positive response to each item
• Unlike polychoric, no assumption made about the distribution of the latent trait
Bit about scaling
(Guttman) Error Cells
. tab mum01_001 mum04_001
                  Felt |     Was restless
      miserable/unhappy |      [mum04_001]
            [mum01_001] |   ST / NT       True |     Total
------------------------+----------------------+----------
                ST / NT |     6,343         78 |     6,421
                   True |       234         54 |       288
------------------------+----------------------+----------
                  Total |     6,577        132 |     6,709
• mum04_001 is more difficult than mum01_001
• If mum01_001 and mum04_001 formed a hierarchy, there would be a zero count in the top right cell
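To make the error cell explicit, the following Python sketch picks it out of a 2x2 table: the cell where the harder (less often endorsed) item is endorsed but the easier one is not. The function name and the table layout (rows = first item, columns = second item) are assumptions for illustration.

```python
def guttman_error_count(table):
    """table[a][b] = respondents scoring a on the first item and b on the second (0/1)."""
    n = sum(table[0]) + sum(table[1])
    p_first = sum(table[1]) / n                   # proportion endorsing the first item
    p_second = (table[0][1] + table[1][1]) / n    # proportion endorsing the second item
    if p_first < p_second:
        # First item is harder: error = endorsing it without the easier second item.
        return table[1][0]
    # Otherwise the second item is treated as the harder one (ties included).
    return table[0][1]

mum01_by_mum04 = [[6343, 78], [234, 54]]    # counts from the table above
emo11_by_emo02 = [[8248, 503], [180, 363]]  # counts from the EAS tables that follow

print(guttman_error_count(mum01_by_mum04))  # top right cell
print(guttman_error_count(emo11_by_emo02))  # bottom left cell
```

Which corner holds the errors depends only on which item is harder, so it changes from table to table.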
EAS, Emotionality [00011]
CH often |
fusses and |
cries | Child cries easily
[emo_l_11_ | [emo_l_02_00011]
00011] | 0 1 | Total
-----------+----------------------+----------
0 | 8,248 503 | 8,751
1 | 180 363 | 543
-----------+----------------------+----------
Total | 8,428 866 | 9,294
CH often |
fusses and | Child tends to be
cries | somewhat emotional
[emo_l_11_ | [emo_l_06_00011]
00011] | 0 1 | Total
-----------+----------------------+----------
0 | 7,857 894 | 8,751
1 | 166 377 | 543
-----------+----------------------+----------
Total | 8,023 1,271 | 9,294
CH often |
fusses and | Child gets upset
cries | easily
[emo_l_11_ | [emo_l_15_00011]
00011] | 0 1 | Total
-----------+----------------------+----------
0 | 8,199 552 | 8,751
1 | 141 402 | 543
-----------+----------------------+----------
Total | 8,340 954 | 9,294
CH often |
fusses and | Child reacts
cries | intensely when upset
[emo_l_11_ | [emo_l_19_00011]
00011] | 0 1 | Total
-----------+----------------------+----------
0 | 7,607 1,144 | 8,751
1 | 178 365 | 543
-----------+----------------------+----------
Total | 7,785 1,509 | 9,294
Observed Guttman errors for emo_l_11 are the bottom left cells of the four tables above (emo_l_11 = 1 while the easier item = 0):
Σ = 180 + 166 + 141 + 178 = 665
EAS, Emotionality [00011]
           Difficulty  emo_l_11  emo_l_02  emo_l_15  emo_l_06  emo_l_19  Total Errors
emo_l_11      5.8%         -        180       141       166       178       665
emo_l_02      9.3%        180        -        297       342       428      1247
emo_l_15     10.3%        141       297        -        349       430      1217
emo_l_06     13.7%        166       342       349        -        619      1476
emo_l_19     16.2%        178       428       430       619        -       1655
Σ = 3130 (total observed Guttman errors for the scale, counting each of the 10 item pairs once)
Expected Guttman Errors
  CH often |
fusses and |
     cries |  Child cries easily
[emo_l_11_ |   [emo_l_02_00011]
    00011] |         0          1 |     Total
-----------+----------------------+----------
         0 |     8,248        503 |     8,751
         1 |       180        363 |       543
-----------+----------------------+----------
     Total |     8,428        866 |     9,294
Under perfect Guttman scaling, cell count = 0
Under marginal independence, cell count = [(8428/9294)*(543/9294)]*9294 = 492.4
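The expected count is just the product of the two relevant margins divided by the sample size. A small Python sketch of that arithmetic, using a hypothetical function name and the same table layout as the Stata output (rows = first item, columns = second item):

```python
def expected_guttman_errors(table):
    """Expected count in the Guttman error cell if the two items were independent."""
    n = sum(table[0]) + sum(table[1])
    first_no, first_yes = sum(table[0]), sum(table[1])
    second_no = table[0][0] + table[1][0]
    second_yes = table[0][1] + table[1][1]
    if first_yes < second_yes:
        # First item is harder: error cell is (first = 1, second = 0).
        return first_yes * second_no / n
    # Otherwise the error cell is (first = 0, second = 1).
    return first_no * second_yes / n

emo11_by_emo02 = [[8248, 503], [180, 363]]  # counts from the table above
expected = expected_guttman_errors(emo11_by_emo02)
print(round(expected, 2))
```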
Expected Guttman Errors
• Total observed Guttman errors for emo_l_11
= 180+141+166+178
= 665
• Total expected Guttman errors for emo_l_11
= 492.4 + 487.26 + 468.74 + 454.84
= 1903.25 (as reported by loevH; the rounded terms sum to 1903.24)
Loevinger H coefficient for emo_l_11 (H11)
= 1 – Σ(observed) / Σ(expected)
= 1 – (665/1903.25)
= 0.651
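The same arithmetic as a short Python sketch, with the observed and expected counts for emo_l_11 copied from the slides (variable names are illustrative):

```python
# Guttman errors for emo_l_11 against emo_l_02, emo_l_15, emo_l_06, emo_l_19.
observed = [180, 141, 166, 178]
expected = [492.40, 487.26, 468.74, 454.84]  # under marginal independence

total_observed = sum(observed)
total_expected = sum(expected)
h_item = 1 - total_observed / total_expected

print(total_observed, round(total_expected, 2), round(h_item, 3))
```

Summing the four observed counts gives 665, which is the value loevH reports for this item.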
loevH emo_*_00011
                                    Observed   Expected                                     Number
                        Easyness    Guttman    Guttman   Loevinger              H0: Hj<=0    of NS
Item              Obs    P(Xj=1)     errors     errors     H coeff    z-stat.     p-value      Hjk
--------------------------------------------------------------------------------------------------
emo_l_11_00011   9294     0.0584        665    1903.25      0.6506    83.4427           0        0
emo_l_02_00011   9294     0.0932       1247    2742.48      0.5453    84.2498           0        0
emo_l_15_00011   9294     0.1026       1217    2887.01     0.57846    90.9775           0        0
emo_l_06_00011   9294     0.1368       1476    3104.49     0.52456    81.0815           0        0
emo_l_19_00011   9294     0.1624       1655    3043.97      0.4563    66.0645           0        0
--------------------------------------------------------------------------------------------------
Scale            9294                  3130     6840.6     0.54244   126.6163           0
Hi: the item rows of the "Loevinger H coeff" column
Loevinger H for scale: the "Scale" row of the same column (H = 0.54244)
Acceptable values of Hi, H
• Acceptable scale: Hi all > 0.3; this then implies H > 0.3
• Weak scale: 0.3 ≤ H < 0.4
• Medium scale: 0.4 ≤ H < 0.5
• Strong scale: 0.5 ≤ H
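These cut-offs can be written as a tiny Python helper. The function name and the label used for H below 0.3 are assumptions; the scale H values are the ones reported by loevH on these slides.

```python
def scale_strength(h):
    """Classify a scale H coefficient using the cut-offs listed above."""
    if h < 0.3:
        return "not acceptable"  # label chosen here for values below the 0.3 floor
    if h < 0.4:
        return "weak"
    if h < 0.5:
        return "medium"
    return "strong"

for name, h in [("emo_*_00011", 0.54244), ("mum*_012", 0.48614), ("kid*_012", 0.42516)]:
    print(name, scale_strength(h))
```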
‘Mokken’ Scale
loevH mum*_012
                               Observed   Expected                                     Number
                 Difficulty    Guttman    Guttman   Loevinger              H0: Hj<=0    of NS
Item        Obs     P(Xj=0)     errors     errors     H coeff    z-stat.     p-value      Hjk
---------------------------------------------------------------------------------------------
mum01_012  6623      0.5886       4663   11129.80     0.58103   102.2589     0.00000        0
mum02_012  6623      0.8629       5001    9567.51     0.47729   103.0832     0.00000        0
mum03_012  6623      0.7315       7720   11642.62     0.33692    68.7850     0.00000        0
mum04_012  6623      0.7331       6742   11163.54     0.39607    79.9763     0.00000        0
mum05_012  6623      0.9050       3560    8243.22     0.56813   117.8687     0.00000        0
mum06_012  6623      0.9244       3889    7076.82     0.45046    88.9681     0.00000        0
mum07_012  6623      0.7944       6135   11147.15     0.44964    95.7725     0.00000        0
mum08_012  6623      0.9349       2601    6340.91     0.58981   112.2358     0.00000        0
mum09_012  6623      0.9500       2246    5144.76     0.56344    99.9250     0.00000        0
mum10_012  6623      0.8474       5238    9951.33     0.47364   102.4057     0.00000        0
mum11_012  6623      0.9064       3847    8124.81     0.52651   108.8859     0.00000        0
mum12_012  6623      0.8655       5069    9987.87     0.49248   106.8244     0.00000        0
mum13_012  6623      0.8209       4927   10431.72     0.52769   113.1608     0.00000        0
---------------------------------------------------------------------------------------------
Scale      6623                  30819   59976.03     0.48614   246.5451     0.00000
loevH kid*_012
                               Observed   Expected                                     Number
                 Difficulty    Guttman    Guttman   Loevinger              H0: Hj<=0    of NS
Item        Obs     P(Xj=0)     errors     errors     H coeff    z-stat.     p-value      Hjk
---------------------------------------------------------------------------------------------
kid01_012  5703      0.3730       8579   16440.13     0.47817    88.0802     0.00000        0
kid02_012  5703      0.7998       9763   14118.52     0.30850    64.2818     0.00000        0
kid03_012  5703      0.4780      12234   17415.56     0.29752    57.5953     0.00000        0
kid04_012  5703      0.4638      13379   18379.14     0.27206    53.4953     0.00000        0
kid05_012  5703      0.7891       8096   16659.66     0.51404   109.7532     0.00000        0
kid06_012  5703      0.8048       9308   15947.63     0.41634    88.8534     0.00000        0
kid07_012  5703      0.4277      10736   18173.50     0.40925    79.3781     0.00000        0
kid08_012  5703      0.8197       7914   16091.78     0.50820   107.3113     0.00000        0
kid09_012  5703      0.8040       9132   15004.31     0.39137    82.7063     0.00000        0
kid10_012  5703      0.6846       8939   18013.31     0.50376   103.7182     0.00000        0
kid11_012  5703      0.8313       7740   15178.88     0.49008   101.7086     0.00000        0
kid12_012  5703      0.7315       9445   17874.94     0.47161    98.9656     0.00000        0
kid13_012  5703      0.8101       7959   15065.70     0.47171    99.6037     0.00000        0
---------------------------------------------------------------------------------------------
Scale      5703                  61612    1.1e+05     0.42516   219.7088     0.00000
Need a procedure to derive a Mokken scale by selecting a subset of the above items
MSP
Mokken Scaling Procedure
Mokken Scaling Procedure
• Bottom-up, hierarchical clustering procedure
• Contrast to top-down procedures such as PCA/FA
Employs Hij
  CH often |
fusses and |
     cries |  Child cries easily
[emo_l_11_ |   [emo_l_02_00011]
    00011] |         0          1 |     Total
-----------+----------------------+----------
         0 |     8,248        503 |     8,751
         1 |       180        363 |       543
-----------+----------------------+----------
     Total |     8,428        866 |     9,294
Observed Guttman errors = 180
Expected Guttman errors* = 492.4
Hij = 1 – (# observed / # expected) = 0.634
* Under marginal independence
Procedure
1. Derive Hij for all pairs of items and select the pair with the highest value (> 0.3). Favour more difficult items if two pairs give the same Hij
2. Find the next best item in the scale:
If item k is a new item not already in the scale then calculate:
Hik for all items i in the scale, and also
Hk between item k and the current scale as a whole, and
H for each new scale (items i plus k)
again, favouring more difficult items and those with higher Hk in
the event of a tied H value and ensuring all H/ Hik/ Hk > 0.3
Worked example using emo_*_00011
           Difficulty  emo_l_11  emo_l_02  emo_l_15  emo_l_06  emo_l_19  Total Errors
emo_l_11      5.8%         -        180       141       166       178       665
emo_l_02      9.3%      492.40       -        297       342       428      1247
emo_l_15     10.3%      487.26    777.11       -        349       430      1217
emo_l_06     13.7%      468.74    747.57    823.54       -        619      1476
emo_l_19     16.2%      454.84    725.39    799.11   1064.64       -       1655
Below the diagonal: Expected. Above the diagonal: Observed.
Stage 1. derive Hij
i j # obs # exp Hij
emo_l_11 emo_l_02 180 492.4 0.63
emo_l_11 emo_l_15 141 487.26 0.71
emo_l_11 emo_l_06 166 468.74 0.65
emo_l_11 emo_l_19 178 454.84 0.61
emo_l_02 emo_l_15 297 777.11 0.62
emo_l_02 emo_l_06 342 747.57 0.54
emo_l_02 emo_l_19 428 725.39 0.41
emo_l_15 emo_l_06 349 823.54 0.58
emo_l_15 emo_l_19 430 799.11 0.46
emo_l_06 emo_l_19 619 1064.64 0.42
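Stage 1 can be reproduced from the observed and expected counts alone. The Python sketch below recomputes Hij for all ten pairs and picks the largest; the data are copied from the table above and the variable names are illustrative.

```python
# (item i, item j): (observed Guttman errors, expected Guttman errors)
pairs = {
    ("emo_l_11", "emo_l_02"): (180, 492.40),
    ("emo_l_11", "emo_l_15"): (141, 487.26),
    ("emo_l_11", "emo_l_06"): (166, 468.74),
    ("emo_l_11", "emo_l_19"): (178, 454.84),
    ("emo_l_02", "emo_l_15"): (297, 777.11),
    ("emo_l_02", "emo_l_06"): (342, 747.57),
    ("emo_l_02", "emo_l_19"): (428, 725.39),
    ("emo_l_15", "emo_l_06"): (349, 823.54),
    ("emo_l_15", "emo_l_19"): (430, 799.11),
    ("emo_l_06", "emo_l_19"): (619, 1064.64),
}

# Hij = 1 - observed / expected for each pair.
h_pair = {pair: 1 - obs / exp for pair, (obs, exp) in pairs.items()}

# Stage 1: start the scale with the pair that has the highest Hij (must exceed 0.3).
best_pair = max(h_pair, key=h_pair.get)
assert h_pair[best_pair] > 0.3

for pair, h in h_pair.items():
    print(pair[0], pair[1], round(h, 2))
print("selected:", best_pair, round(h_pair[best_pair], 4))
```

There are no tied Hij values in this example, so the tie-breaking rule (favour the more difficult items) is not needed here.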
Select highest Hij: emo_l_11 and emo_l_15 (Hij = 0.71)
Stage 2. Find next best item
• Items 11 and 15 were selected
• Calculate H for each ‘new’ scale and Hk between each item not in scale, and the current scale
Item       Hk,11   Hk,15   Hk                               new H
emo_l_02   0.63    0.62    1 – (477/1269.51) = 0.624        0.648
emo_l_06   0.65    0.58    1 – (515/1292.28) = 0.601        0.631
emo_l_19   0.61    0.46    1 – (608/1253.95) = 0.515        0.570
• Select the new item with highest H and Hk provided all H > 0.3
• Repeat step offering emo_l_06 and emo_l_19 to this new scale…
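Stage 2 can be sketched in Python in the same way. For each candidate item k, Hk pools the Guttman errors between k and the items already in the scale, and the new H pools the errors over every pair in the enlarged scale. Names are illustrative and the counts are copied from the worked example; this is not the msp implementation itself (it ignores the significance tests msp also applies).

```python
observed = {
    ("emo_l_11", "emo_l_02"): 180, ("emo_l_11", "emo_l_15"): 141,
    ("emo_l_11", "emo_l_06"): 166, ("emo_l_11", "emo_l_19"): 178,
    ("emo_l_02", "emo_l_15"): 297, ("emo_l_02", "emo_l_06"): 342,
    ("emo_l_02", "emo_l_19"): 428, ("emo_l_15", "emo_l_06"): 349,
    ("emo_l_15", "emo_l_19"): 430, ("emo_l_06", "emo_l_19"): 619,
}
expected = {
    ("emo_l_11", "emo_l_02"): 492.40, ("emo_l_11", "emo_l_15"): 487.26,
    ("emo_l_11", "emo_l_06"): 468.74, ("emo_l_11", "emo_l_19"): 454.84,
    ("emo_l_02", "emo_l_15"): 777.11, ("emo_l_02", "emo_l_06"): 747.57,
    ("emo_l_02", "emo_l_19"): 725.39, ("emo_l_15", "emo_l_06"): 823.54,
    ("emo_l_15", "emo_l_19"): 799.11, ("emo_l_06", "emo_l_19"): 1064.64,
}

def lookup(counts, a, b):
    """Pair counts are symmetric; accept the items in either order."""
    return counts[(a, b)] if (a, b) in counts else counts[(b, a)]

scale = ["emo_l_11", "emo_l_15"]                 # selected in stage 1
candidates = ["emo_l_02", "emo_l_06", "emo_l_19"]

results = {}
for k in candidates:
    # Hk: item k against the current scale as a whole.
    obs_k = sum(lookup(observed, k, i) for i in scale)
    exp_k = sum(lookup(expected, k, i) for i in scale)
    h_k = 1 - obs_k / exp_k
    # New H: all pairs in the scale once k has been added.
    members = scale + [k]
    obs_all = exp_all = 0.0
    for idx, a in enumerate(members):
        for b in members[idx + 1:]:
            obs_all += lookup(observed, a, b)
            exp_all += lookup(expected, a, b)
    results[k] = (h_k, 1 - obs_all / exp_all)

# Select the candidate giving the highest new H, provided Hk and H exceed 0.3.
best = max(results, key=lambda k: results[k][1])
for k, (h_k, new_h) in results.items():
    print(k, round(h_k, 3), round(new_h, 3))
print("selected:", best)
```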
msp emo_*_00011
Scale: 1
----------
Significance level: 0.005000
The two first items selected in the scale 1 are emo_l_11_00011 and emo_l_15_00011 (Hjk=0.7106)
Significance level: 0.003846
The item emo_l_02_00011 is selected in the scale 1    Hj=0.6243  H=0.6482
Significance level: 0.003333
The item emo_l_06_00011 is selected in the scale 1    Hj=0.5799  H=0.6115
Significance level: 0.003125
The item emo_l_19_00011 is selected in the scale 1    Hj=0.4563  H=0.5424
Significance level: 0.003125
There is no more items remaining.
                                    Observed   Expected                                     Number
                      Difficulty    Guttman    Guttman   Loevinger              H0: Hj<=0    of NS
Item              Obs    P(Xj=0)     errors     errors     H coeff    z-stat.     p-value      Hjk
--------------------------------------------------------------------------------------------------
emo_l_19_00011   9294     0.1624       1655    3043.97      0.4563    66.0645           0        0
emo_l_06_00011   9294     0.1368       1476    3104.49     0.52456    81.0815           0        0
emo_l_02_00011   9294     0.0932       1247    2742.48      0.5453    84.2498           0        0
emo_l_11_00011   9294     0.0584        665    1903.25      0.6506    83.4427           0        0
emo_l_15_00011   9294     0.1026       1217    2887.01     0.57846    90.9775           0        0
--------------------------------------------------------------------------------------------------
Scale            9294                  3130     6840.6     0.54244   126.6163           0
msp kid*_012
Scale: 1
----------
Significance level: 0.000641
The two first items selected in the scale 1 are kid10_012 and kid11_012 (Hjk=0.7083)
Significance level: 0.000562
The item kid01_012 is selected in the scale 1    Hj=0.6145  H=0.6467
Significance level: 0.000505
The item kid05_012 is selected in the scale 1    Hj=0.6254  H=0.6358
Significance level: 0.000463
The item kid08_012 is selected in the scale 1    Hj=0.6373  H=0.6364
Significance level: 0.000431
The item kid12_012 is selected in the scale 1    Hj=0.5845  H=0.6175
Significance level: 0.000407
The item kid13_012 is selected in the scale 1    Hj=0.5642  H=0.6032
Significance level: 0.000388
The item kid06_012 is selected in the scale 1    Hj=0.4978  H=0.5769
Significance level: 0.000373
The item kid07_012 is selected in the scale 1    Hj=0.4437  H=0.5463
Significance level: 0.000362
The item kid09_012 is selected in the scale 1    Hj=0.4285  H=0.5243
Significance level: 0.000355
The item kid02_012 is selected in the scale 1    Hj=0.3038  H=0.4884
Significance level: 0.000350
The item kid03_012 is selected in the scale 1    Hj=0.3041  H=0.4569
Significance level: 0.000347
None new item can be selected in the scale 1 because all the Hj are lesser than .3 or none new item had all the related Hjk coefficients significantely greater than 0.
Item kid04_012 has been dropped and 12 item scale now is acceptable
                               Observed   Expected                                     Number
                 Difficulty    Guttman    Guttman   Loevinger              H0: Hj<=0    of NS
Item        Obs     P(Xj=0)     errors     errors     H coeff    z-stat.     p-value      Hjk
---------------------------------------------------------------------------------------------
kid01_012  5703       0.373       7050   14408.47        0.51      87.64           0        0
kid07_012  5703       0.428       9134   15880.51        0.42      76.60           0        0
kid03_012  5703       0.478      10582   15206.23        0.30      54.64           0        0
kid10_012  5703       0.685       7757   16308.82        0.52     103.87           0        0
kid12_012  5703       0.732       8242   16294.06        0.49     100.41           0        0
kid05_012  5703       0.789       7219   15316.30        0.53     110.12           0        0
kid02_012  5703       0.800       8928   13008.55        0.31      63.83           0        0
kid09_012  5703       0.804       8284   13829.26        0.40      82.81           0        0
kid06_012  5703       0.805       8345   14690.90        0.43      90.08           0        0
kid13_012  5703       0.810       7139   13885.77        0.49     100.26           0        0
kid08_012  5703       0.820       6959   14807.57        0.53     109.27           0        0
kid11_012  5703       0.831       6827   13968.33        0.51     103.55           0        0
---------------------------------------------------------------------------------------------
Scale      5703                  48233   88802.39     0.45685     219.11           0
msp *_01234 (EAS)
Scale: 1
----------
Significance level: 0.000263
The two first items selected in the scale 1 are emo_l_11_01234 and emo_l_15_01234 (Hjk=0.7457)
The following items are excluded at this step: soc_l_03_01234 act_l_04_01234 act_l_07_01234 act_l_09_01234 soc_l_10_01234 act_l_13_01234 soc_l_16_01234 act_l_17_01234
Significance level: 0.000250
The item emo_l_02_01234 is selected in the scale 1    Hj=0.7093  H=0.7208
Significance level: 0.000239
The item emo_l_06_01234 is selected in the scale 1    Hj=0.6106  H=0.6644
Significance level: 0.000230
The item emo_l_19_01234 is selected in the scale 1    Hj=0.4860  H=0.5826
The following items are excluded at this step: shy_l_20_01234
Significance level: 0.000224
None new item can be selected in the scale 1 because all the Hj are lesser than .3 or none new item had all the related Hjk coefficients significantely greater than 0.
Observed Expected Number Difficulty Guttman Guttman Loevinger H0: Hj<=0 of NSItem Obs P(Xj=0) errors errors H coeff z-stat. p-value Hjk-----------------------------------------------------------------------------------------------------emo_l_19_01234 8928 0.1427 13758 26768.91 0.48605 83.4028 0.00000 0emo_l_06_01234 8928 0.0657 9715 23068.69 0.57887 97.0460 0.00000 0emo_l_02_01234 8928 0.0793 9264 22722.70 0.59230 101.4626 0.00000 0emo_l_11_01234 8928 0.1615 7812 21586.83 0.63811 101.9176 0.00000 0emo_l_15_01234 8928 0.0579 8187 22615.95 0.63800 109.0215 0.00000 0-----------------------------------------------------------------------------------------------------Scale 8928 24368 58381.54 0.58261 154.7274 0.00000
Scale: 2
----------
Significance level: 0.000476
The two first items selected in the scale 2 are act_l_09_01234 and act_l_13_01234 (Hjk=0.6861)
The following items are excluded at this step: shy_l_01_01234 shy_l_08_01234 shy_l_12_01234 shy_l_14_01234 soc_l_18_01234 shy_l_20_01234
Significance level: 0.000446
The item act_l_04_01234 is selected in the scale 2 Hj=0.6439 H=0.6585
Significance level: 0.000424
The item act_l_17_01234 is selected in the scale 2 Hj=0.4013 H=0.5339
Significance level: 0.000407
The item act_l_07_01234 is selected in the scale 2 Hj=0.3528 H=0.4674
Significance level: 0.000394
None new item can be selected in the scale 2 because all the Hj are lesser than .3 or none new item had all the related Hjk coefficients significantely greater than 0.

                                   Observed   Expected                                     Number
                      Difficulty   Guttman    Guttman    Loevinger              H0: Hj<=0  of NS
Item            Obs   P(Xj=0)      errors     errors     H coeff     z-stat.    p-value    Hjk
---------------------------------------------------------------------------------------------------
act_l_07_01234  8928  0.0060       12055      18626.72   0.35281     58.5754    0.00000    0
act_l_17_01234  8928  0.0030       11927      19687.06   0.39417     61.8768    0.00000    0
act_l_04_01234  8928  0.0016       9516       19832.30   0.52018     87.5901    0.00000    0
act_l_09_01234  8928  0.0087       12018      23677.34   0.49243     82.3819    0.00000    0
act_l_13_01234  8928  0.0010       8508       19606.32   0.56606     95.2809    0.00000    0
---------------------------------------------------------------------------------------------------
Scale           8928               27012      50714.87   0.46738     121.7656   0.00000
Scale: 3
----------
Significance level: 0.001111
The two first items selected in the scale 3 are shy_l_08_01234 and shy_l_12_01234 (Hjk=0.6433)
The following items are excluded at this step: soc_l_03_01234 soc_l_05_01234 soc_l_10_01234 soc_l_16_01234
Significance level: 0.001020
The item shy_l_01_01234 is selected in the scale 3 Hj=0.4853 H=0.5448
Significance level: 0.000962
The item shy_l_14_01234 is selected in the scale 3 Hj=0.5151 H=0.5294
Significance level: 0.000926
The item shy_l_20_01234 is selected in the scale 3 Hj=0.4863 H=0.5098
The following items are excluded at this step: soc_l_18_01234
Significance level: 0.000926
There is no more items remaining.

                                   Observed   Expected                                     Number
                      Difficulty   Guttman    Guttman    Loevinger              H0: Hj<=0  of NS
Item            Obs   P(Xj=0)      errors     errors     H coeff     z-stat.    p-value    Hjk
---------------------------------------------------------------------------------------------------
shy_l_20_01234  8928  0.0736       13559      26393.66   0.48628     78.0926    0.00000    0
shy_l_14_01234  8928  0.0719       10629      23587.87   0.54939     91.0884    0.00000    0
shy_l_01_01234  8928  0.1221       10559      21603.72   0.51124     82.3020    0.00000    0
shy_l_08_01234  8928  0.3863       12010      22334.00   0.46225     75.4764    0.00000    0
shy_l_12_01234  8928  0.3985       10079      22018.14   0.54224     88.5938    0.00000    0
---------------------------------------------------------------------------------------------------
Scale           8928               28418      57968.70   0.50977     130.7685   0.00000
Scale: 4
----------
Significance level: 0.005000
The two first items selected in the scale 4 are soc_l_03_01234 and soc_l_10_01234 (Hjk=0.4400)
Significance level: 0.003846
The item soc_l_05_01234 is selected in the scale 4 Hj=0.3941 H=0.4082
Significance level: 0.003333
The item soc_l_16_01234 is selected in the scale 4 Hj=0.3693 H=0.3889
The following items are excluded at this step: soc_l_18_01234
Significance level: 0.003333
There is no more items remaining.

                                   Observed   Expected                                     Number
                      Difficulty   Guttman    Guttman    Loevinger              H0: Hj<=0  of NS
Item            Obs   P(Xj=0)      errors     errors     H coeff     z-stat.    p-value    Hjk
---------------------------------------------------------------------------------------------------
soc_l_16_01234  8928  0.0043       8974       14228.65   0.36930     50.2618    0.00000    0
soc_l_05_01234  8928  0.0040       8936       14840.68   0.39787     55.7348    0.00000    0
soc_l_03_01234  8928  0.0010       7252       12405.87   0.41544     58.1266    0.00000    0
soc_l_10_01234  8928  0.0077       9878       15864.61   0.37736     54.2418    0.00000    0
---------------------------------------------------------------------------------------------------
Scale           8928               17520      28669.91   0.38891     76.7622    0.00000
There is only one item remaining (soc_l_18_01234).
Relate this back to PCA results
MSP – Simpler example
msp mum*_011 on a sample of 20 children
Scale: 1
----------
Significance level: 0.000641
The two first items selected in the scale 1 are fg05_011 and fg10_011 (Hjk=1.0000)
Significance level: 0.000562
The item fg03_011 is selected in the scale 1 Hj=1.0000 H=1.0000
Significance level: 0.000505
The item fg13_011 is selected in the scale 1 Hj=1.0000 H=1.0000
Significance level: 0.000463
The item fg02_011 is selected in the scale 1 Hj=0.8261 H=0.9205
Significance level: 0.000431
The item fg11_011 is selected in the scale 1 Hj=0.7059 H=0.8452
Significance level: 0.000407
The item fg09_011 is selected in the scale 1 Hj=0.5506 H=0.7697
Significance level: 0.000388
The item fg01_011 is selected in the scale 1 Hj=0.6324 H=0.7412
Significance level: 0.000373
The item fg08_011 is selected in the scale 1 Hj=0.6635 H=0.7225
Significance level: 0.000362
The item fg06_011 is selected in the scale 1 Hj=0.6078 H=0.6964
Significance level: 0.000355
The item fg12_011 is selected in the scale 1 Hj=0.4872 H=0.6531
The following items are excluded at this step: fg04_011
Significance level: 0.000352
The item fg07_011 is selected in the scale 1 Hj=0.3929 H=0.6100
Significance level: 0.000352
There is no more items remaining.
                                   Observed   Expected                                     Number
                      Easyness     Guttman    Guttman    Loevinger              H0: Hj<=0  of NS
Item            Obs   P(Xj=1)      errors     errors     H coeff     z-stat.    p-value    Hjk
fg09_011        20    0.15         7          22.95      0.695       6.722      0          2
fg13_011        20    0.15         5          22.95      0.7821      7.5649     0          1
fg11_011        20    0.2          12         28.8       0.5833      6.37       0          3
fg08_011        20    0.2          9          28.8       0.6875      7.5074     0          3
fg05_011        20    0.2          7          28.8       0.7569      8.2658     0          1
fg12_011        20    0.25         16         31.75      0.4961      5.5536     0          4
fg02_011        20    0.25         15         31.75      0.5276      5.9063     0          5
fg06_011        20    0.25         14         31.75      0.5591      6.2589     0          3
fg10_011        20    0.25         10         31.75      0.685       7.6693     0          1
fg07_011        20    0.5          17         28         0.3929      3.4118     0.0003     8
fg03_011        20    0.5          13         28         0.5357      4.6525     0          6
fg01_011        20    0.6          7          23.2       0.6983      5.1154     0          5
Scale           20                 66         169.25     0.61        14.972     0
MSP output (reordered by difficulty and H)
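The H column in this table can be recovered from the two error columns, since Loevinger's H is one minus the ratio of observed to expected Guttman errors. The short Python sketch below (illustrative only; the function name is hypothetical and the figures are copied from the table) checks a few rows:

```python
def loevinger_h(observed_errors, expected_errors):
    """Loevinger's H = 1 - observed / expected Guttman errors."""
    return 1.0 - observed_errors / expected_errors


# (observed, expected) Guttman errors copied from the msp table
rows = {
    "fg09_011": (7, 22.95),
    "fg07_011": (17, 28.0),
    "Scale": (66, 169.25),
}

for name, (obs, exp) in rows.items():
    print(f"{name:10s} H = {loevinger_h(obs, exp):.4f}")
```

An item with few observed errors relative to chance (fg09_011) gets a high H; fg07_011, with 17 observed against 28 expected, sits closest to the 0.3 cut-off used by msp.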
ID fg01 fg02 fg03 fg04 fg05 fg06 fg07 fg08 fg09 fg10 fg11 fg12 fg13
1 0 0 1 0 0 0 1 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0
4 1 0 1 0 0 0 1 0 0 0 0 0 0
5 1 0 0 1 0 1 0 0 0 0 0 0 0
6 1 0 0 1 0 0 1 0 0 0 0 0 0
7 1 0 1 0 0 0 0 0 0 0 0 0 0
8 0 1 1 0 0 0 0 0 0 0 1 0 0
9 1 1 1 0 1 0 1 0 0 1 0 1 1
10 1 1 1 1 1 1 1 1 1 1 1 0 1
11 0 0 0 0 0 0 0 0 0 0 0 0 0
12 1 0 1 0 1 1 0 1 0 1 1 1 0
13 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 0 0 0 0
15 0 0 1 1 0 0 1 0 0 0 0 0 0
16 1 0 0 0 0 0 1 0 0 0 0 0 0
17 1 0 0 0 0 1 1 1 1 0 0 1 0
18 1 1 1 1 1 1 1 1 1 1 1 1 1
19 1 0 0 0 0 0 1 0 0 0 0 1 0
20 1 1 1 0 0 0 0 0 0 1 0 0 0
Original dataset
Mokken set of variables reordered
ID fg09 fg13 fg11 fg08 fg05 fg12 fg02 fg06 fg10 fg07 fg03 fg01 fg04
1 0 0 0 0 0 0 0 0 0 1 1 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 1 1 1 0
5 0 0 0 0 0 0 0 1 0 0 0 1 1
6 0 0 0 0 0 0 0 0 0 1 0 1 1
7 0 0 0 0 0 0 0 0 0 0 1 1 0
8 0 0 1 0 0 0 1 0 0 0 1 0 0
9 0 1 0 0 1 1 1 0 1 1 1 1 0
10 1 1 1 1 1 0 1 1 1 1 1 1 1
11 0 0 0 0 0 0 0 0 0 0 0 0 0
12 0 0 1 1 1 1 0 1 1 0 1 1 0
13 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 0 0 0 0
15 0 0 0 0 0 0 0 0 0 1 1 0 1
16 0 0 0 0 0 0 0 0 0 1 0 1 0
17 1 0 0 1 0 1 0 1 0 1 0 1 0
18 1 1 1 1 1 1 1 1 1 1 1 1 1
19 0 0 0 0 0 1 0 0 0 1 0 1 0
20 0 0 0 0 0 0 1 0 1 0 1 1 0
Guttman scale
• A Guttman scale will produce a perfect scale pattern
• It will be possible to sort the cases and variables in such a way as to produce a triangular pattern with a clear delineation between the zeros and ones
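To make the triangular pattern concrete, here is a minimal Python sketch on a small made-up 0/1 dataset (the data and function names are hypothetical, not taken from ALSPAC). It sorts items from hardest to easiest and cases from lowest to highest total, then checks that no row has a 1 followed by a 0:

```python
def sort_for_triangle(rows):
    """Order columns by ascending item total and rows by ascending case total."""
    n_items = len(rows[0])
    order = sorted(range(n_items), key=lambda j: sum(r[j] for r in rows))
    reordered = [[r[j] for j in order] for r in rows]
    reordered.sort(key=sum)
    return order, reordered


def is_perfect_guttman(sorted_rows):
    """True if, in every row, all zeros come before all ones."""
    return all(list(r) == sorted(r) for r in sorted_rows)


# Hypothetical responses of four cases to three items
data = [[1, 1, 1], [1, 0, 0], [1, 1, 0], [0, 0, 0]]
order, triangle = sort_for_triangle(data)
for row in triangle:
    print(row)
print(is_perfect_guttman(triangle))
```

Real data rarely pass this check exactly; each departure from the pattern is a Guttman error.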
Cases sorted to create triangular 0/1 split
ID fg09 fg13 fg11 fg08 fg05 fg12 fg02 fg06 fg10 fg07 fg03 fg01 fg04
2 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0
11 0 0 0 0 0 0 0 0 0 0 0 0 0
13 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 0 0 0 0
7 0 0 0 0 0 0 0 0 0 0 1 1 0
6 0 0 0 0 0 0 0 0 0 1 0 1 1
16 0 0 0 0 0 0 0 0 0 1 0 1 0
1 0 0 0 0 0 0 0 0 0 1 1 0 0
15 0 0 0 0 0 0 0 0 0 1 1 0 1
4 0 0 0 0 0 0 0 0 0 1 1 1 0
5 0 0 0 0 0 0 0 1 0 0 0 1 1
20 0 0 0 0 0 0 1 0 1 0 1 1 0
19 0 0 0 0 0 1 0 0 0 1 0 1 0
8 0 0 1 0 0 0 1 0 0 0 1 0 0
12 0 0 1 1 1 1 0 1 1 0 1 1 0
9 0 1 0 0 1 1 1 0 1 1 1 1 0
17 1 0 0 1 0 1 0 1 0 1 0 1 0
10 1 1 1 1 1 0 1 1 1 1 1 1 1
18 1 1 1 1 1 1 1 1 1 1 1 1 1
Violations highlighted in yellow
ID fg09 fg13 fg11 fg08 fg05 fg12 fg02 fg06 fg10 fg07 fg03 fg01 Person Guttman errors
2 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0
11 0 0 0 0 0 0 0 0 0 0 0 0 0
13 0 0 0 0 0 0 0 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 0 0 0 0
7 0 0 0 0 0 0 0 0 0 0 1 1 0
6 0 0 0 0 0 0 0 0 0 1 0 1 1
16 0 0 0 0 0 0 0 0 0 1 0 1 1
1 0 0 0 0 0 0 0 0 0 1 1 0 2
15 0 0 0 0 0 0 0 0 0 1 1 0 2
4 0 0 0 0 0 0 0 0 0 1 1 1 0
5 0 0 0 0 0 0 0 1 0 0 0 1 3
20 0 0 0 0 0 0 1 0 1 0 1 1 3
19 0 0 0 0 0 1 0 0 0 1 0 1 5
8 0 0 1 0 0 0 1 0 0 0 1 0 12
12 0 0 1 1 1 1 0 1 1 0 1 1 10
9 0 1 0 0 1 1 1 0 1 1 1 1 6
17 1 0 0 1 0 1 0 1 0 1 0 1 16
10 1 1 1 1 1 0 1 1 1 1 1 1 5
18 1 1 1 1 1 1 1 1 1 1 1 1 0
Item Guttman errors [0] 0 1 2 2 3 6 8 8 8 11 10 7
Item Guttman errors [1] 7 4 10 7 4 10 7 6 2 6 3 0
Total 7 5 12 9 7 16 15 14 10 17 13 7 66
Each yellow square contributed at least one Guttman error to the scale
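The person and item error counts in this table can be reproduced by counting, for each case, every pair of items in which the harder item (further left) is endorsed while the easier item (further right) is not. The Python sketch below is illustrative only: the response patterns are transcribed from the sorted table, and the function name is hypothetical.

```python
ITEMS = ["fg09", "fg13", "fg11", "fg08", "fg05", "fg12",
         "fg02", "fg06", "fg10", "fg07", "fg03", "fg01"]

# Response patterns by case ID, items ordered hardest to easiest
DATA = {
    2: "000000000000", 3: "000000000000", 11: "000000000000",
    13: "000000000000", 14: "000000000000", 7: "000000000011",
    6: "000000000101", 16: "000000000101", 1: "000000000110",
    15: "000000000110", 4: "000000000111", 5: "000000010001",
    20: "000000101011", 19: "000001000101", 8: "001000100010",
    12: "001111011011", 9: "010011101111", 17: "100101010101",
    10: "111110111111", 18: "111111111111",
}


def guttman_error_pairs(pattern):
    """Index pairs (i, j), i < j, with harder item i = 1 and easier item j = 0."""
    k = len(pattern)
    return [(i, j) for i in range(k) for j in range(i + 1, k)
            if pattern[i] == "1" and pattern[j] == "0"]


person_errors = {pid: len(guttman_error_pairs(p)) for pid, p in DATA.items()}

# Each error involves two items, so it is counted once for each of them
item_errors = [0] * len(ITEMS)
for pattern in DATA.values():
    for i, j in guttman_error_pairs(pattern):
        item_errors[i] += 1
        item_errors[j] += 1

print(person_errors[17], sum(person_errors.values()))
print(dict(zip(ITEMS, item_errors)))
```

Because every error is shared by two items, the item totals sum to twice the scale total of 66.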
Monotone Homogeneity and Double Monotonicity
Monotone Homogeneity
Double Monotonicity
• Lead in to the next slides which demonstrate the P(1,1) matrix
fg09 fg13 fg11 fg08 fg05 fg12 fg02 fg06 fg10 fg07 fg03 fg01
fg09 -
fg13 2 -
fg11 2 2 -
fg08 3 2 3 -
fg05 2 3 3 3 -
fg12 3 3 2 3 3 -
fg02 2 2 2 2 3 2 -
fg06 3 3 3 4 3 3 2 -
fg10 2 2 3 3 4 3 4 3 -
fg07 3 3 2 3 3 4 3 3 3 -
fg03 2 3 4 4 4 3 5 3 5 6 -
fg01 3 3 4 4 4 5 4 5 5 8 7 -
P(1,1) Matrix for sample of 20 cases
fg09 fg13 fg11 fg08 fg05 fg12 fg02 fg06 fg10 fg07 fg03 fg01
fg09 -
fg13 0.10 -
fg11 0.10 0.10 -
fg08 0.15 0.10 0.15 -
fg05 0.10 0.15 0.15 0.15 -
fg12 0.15 0.15 0.10 0.15 0.15 -
fg02 0.10 0.10 0.10 0.10 0.15 0.10 -
fg06 0.15 0.15 0.15 0.20 0.15 0.15 0.10 -
fg10 0.10 0.10 0.15 0.15 0.20 0.15 0.20 0.15 -
fg07 0.15 0.15 0.10 0.15 0.15 0.20 0.15 0.15 0.15 -
fg03 0.10 0.15 0.20 0.20 0.20 0.15 0.25 0.15 0.25 0.30 -
fg01 0.15 0.15 0.20 0.20 0.20 0.25 0.20 0.25 0.25 0.40 0.35 -
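Each cell of the P(1,1) matrix is the proportion of cases scoring 1 on both items. As a minimal Python sketch (illustrative only, with hypothetical function names), the three easiest items fg07, fg03 and fg01 from the 20-case dataset are enough to reproduce the bottom-right corner of the matrix:

```python
ITEMS = ["fg07", "fg03", "fg01"]

# fg07, fg03, fg01 responses for cases 1 to 20, from the 20-case dataset
ROWS = ["110", "000", "000", "111", "001", "101", "011", "010", "111", "111",
        "000", "011", "000", "000", "110", "101", "101", "111", "101", "011"]


def joint_counts(rows):
    """counts[i][j] = number of cases scoring 1 on both item i and item j."""
    k = len(rows[0])
    return [[sum(r[i] == "1" and r[j] == "1" for r in rows)
             for j in range(k)] for i in range(k)]


counts = joint_counts(ROWS)
n = len(ROWS)
p11 = [[c / n for c in row] for row in counts]

# Print the lower triangle, as on the slide
for i, name in enumerate(ITEMS):
    cells = " ".join(f"{p11[i][j]:.2f}" for j in range(i))
    print(name, cells, "-")
```

The diagonal of `counts` holds each item's total number of 1s, which is why the slide shows a dash there instead.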
fg11 fg08 fg13 fg06 fg09 fg02 fg05 fg12 fg10 fg03 fg04 fg07 fg01
fg11 -
fg08 596 -
fg13 575 592 -
fg06 516 574 520 -
fg09 473 543 562 454 -
fg02 417 443 469 415 457 -
fg05 641 748 653 615 555 529 -
fg12 700 735 748 627 637 534 813 -
fg10 794 772 771 751 688 624 859 1019 -
fg03 664 707 766 777 766 858 833 1032 1225 -
fg04 693 739 806 771 812 794 873 1036 1275 1947 -
fg07 786 869 925 855 925 905 984 1238 1431 2096 2132 -
fg01 902 988 972 1054 992 933 1135 1339 1628 2151 2216 2483 -
P(1,1) for full sample
fg11 fg08 fg13 fg06 fg09 fg02 fg05 fg12 fg10 fg03 fg04 fg07 fg01
fg11 -
fg08 0.10 -
fg13 0.10 0.10 -
fg06 0.09 0.10 0.09 -
fg09 0.08 0.10 0.10 0.08 -
fg02 0.07 0.08 0.08 0.07 0.08 -
fg05 0.11 0.13 0.11 0.11 0.10 0.09 -
fg12 0.12 0.13 0.13 0.11 0.11 0.09 0.14 -
fg10 0.14 0.14 0.14 0.13 0.12 0.11 0.15 0.18 -
fg03 0.12 0.12 0.13 0.14 0.13 0.15 0.15 0.18 0.21 -
fg04 0.12 0.13 0.14 0.14 0.14 0.14 0.15 0.18 0.22 0.34 -
fg07 0.14 0.15 0.16 0.15 0.16 0.16 0.17 0.22 0.25 0.37 0.37 -
fg01 0.16 0.17 0.17 0.18 0.17 0.16 0.20 0.23 0.29 0.38 0.39 0.44 -
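One way to read this matrix, assuming the usual check for double monotonicity, is that with items ordered from hardest to easiest the entries along each row should not decrease. The Python sketch below (a hypothetical helper, applied to two rows copied from the matrix) flags where a row dips; because the values are rounded to two decimals, a dip of 0.01 may be rounding or sampling noise and not a real violation.

```python
def decreases(values):
    """Positions i where values[i + 1] is smaller than values[i]."""
    return [i for i in range(len(values) - 1) if values[i + 1] < values[i]]


COLUMNS = ["fg11", "fg08", "fg13", "fg06", "fg09", "fg02",
           "fg05", "fg12", "fg10", "fg03", "fg04", "fg07"]

# Rows copied from the full-sample P(1,1) matrix
fg01_row = [0.16, 0.17, 0.17, 0.18, 0.17, 0.16,
            0.20, 0.23, 0.29, 0.38, 0.39, 0.44]
fg07_row = [0.14, 0.15, 0.16, 0.15, 0.16, 0.16,
            0.17, 0.22, 0.25, 0.37, 0.37]

for label, row in [("fg01", fg01_row), ("fg07", fg07_row)]:
    dips = [(COLUMNS[i], COLUMNS[i + 1]) for i in decreases(row)]
    print(label, dips)
```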